CVS corruption - anybody else seen this?


Mike
03-06-2004, 04:42 PM
We're using CVS, mostly 1.11.13 on Linux, but with a few 1.10 Windows users.

All access is via ssh or a local mount on a single machine - no NFS or SMB
is involved in the access to CVS.

However, it seems that whenever we tag the database, some arbitrary number
of database "*,v" files become corrupt. It's unclear right now whether
someone else has to access the database while the tagging is going on or
not...

The corruption is always 16 bytes of garbage overlaid on the bytes of the
file.

We're tearing our hair out trying to figure out why this would be
happening.....

Any info appreciated.
Thanks,
-Mike

Corey
03-07-2004, 05:36 AM
"Mike" <mike@nowhere.net> wrote in message news:Xns94A4A9ED768C9vidguySpamNoBetaFree@140.99.99.130...
> We're using CVS, mostly 1.11.13 on Linux, but with a few 1.10 Windows users.
>
> All access is via ssh or a local mount on a single machine - no NFS or SMB
> is involved in the access to CVS.
>
> However, it seems that whenever we tag the database, some arbitrary number
> of database "*,v" files become corrupt. It's unclear right now whether
> someone else has to access the database while the tagging is going on or
> not...
>
> The corruption is always 16 bytes of garbage overlaid on the bytes of the
> file.
>
> We're tearing our hair out trying to figure out why this would be
> happening.....

I've seen something like this, but it wasn't with CVS specifically. The problem
I experienced was similar in nature. When checking files into a CM system,
some of the files would end up corrupted. A small amount of garbage was seeping
into the checked-in streams. The problem turned out to be my Linksys box, of
all things. Once I updated the firmware on that piece of hardware, all of my
corruption problems went away. There were other clues that my network was
toast too. Every now and then I would get an RMI unmarshalling error or some
such thing. But it wasn't till much later that I finally figured out what was going on.

Hope this helps
--Corey

Kaz Kylheku
03-09-2004, 11:28 AM
"Corey Brown" <corey@spectrumsoftware.net> wrote in message news:<ruF2c.33424$6e7.25014@bignews1.bellsouth.net>...
> I've seen something like this, but it wasn't with CVS specifically.

[ snip story about bad network hardware ]

Interesting hypothesis. Note that they are using SSH, an encrypted
transport. The use of a cipher could explain why the corruption always
has a fixed length; it could be that the cipher, and the way it is being
used, are such that an error in a small number of bits obliterates an
entire block of 16 bytes.

One would think, though, that SSH has some decent checksumming against
allowing errors in or tampering with the ciphertext to make undetected
changes in the plaintext.

Here is an idea for the OP: in addition to using SSH, experiment with
the -z option of CVS to compress the stream.
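
A rough sketch of how that experiment could be scripted in Python, for anyone who wants to repeat it across several working copies. `-z` is a CVS global option that takes a gzip level and goes before the subcommand, and `CVS_RSH` selects the remote shell for `:ext:` access. The function name and the repository path are placeholders of mine, not anything from the OP's setup.

```python
import os
from typing import Dict, List, Tuple


def cvs_update_command(cvsroot: str, level: int = 3) -> Tuple[List[str], Dict[str, str]]:
    """Build (argv, env) for a compressed 'cvs update' over ssh.

    cvsroot is a placeholder such as ':ext:user@cvs.example.org:/var/cvs'.
    """
    if not 0 <= level <= 9:
        raise ValueError("gzip level must be between 0 and 9")
    # CVS_RSH tells the cvs client which program to use for :ext: access.
    env = dict(os.environ, CVS_RSH="ssh")
    # Global options (-z, -d) must come before the subcommand.
    argv = ["cvs", f"-z{level}", "-d", cvsroot, "update"]
    return argv, env
```

The result can be handed to `subprocess.run(argv, env=env, check=True)` from inside a checked-out working copy. If the corruption pattern changes or disappears with compression on, that points at the transport rather than at the repository host.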

Jorgen Grahn
03-09-2004, 01:26 PM
On 9 Mar 2004 11:28:18 -0800, Kaz Kylheku <kaz@ashi.footprints.net> wrote:
> [ snip story about bad network hardware ]
>
> Interesting hypothesis. Note that they are using SSH, an encrypted
> transport. The use of a cipher could explain why the corruption always
> has a fixed length; it could be that the cipher, and the way it is being
> used, are such that an error in a small number of bits obliterates an
> entire block of 16 bytes.
>
> One would think, though, that SSH has some decent checksumming against
> allowing errors in or tampering with the ciphertext to make undetected
> changes in the plaintext.
>
> Here is an idea for the OP: in addition to using SSH, experiment with
> the -z option of CVS to compress the stream.

All this sounds very familiar; wasn't there semi-recently a bug in either
zlib, ssh, or the way a specific minor revision of CVS used them, which
caused things like this?

It might just be my imagination though, or maybe I'm thinking of a problem
discussed on the groff mailing list (cvs up over pserver tended to fail for
certain files, unless compression was turned off). If not, I'm sure someone
else remembers the details.

/Jorgen

--
  // Jorgen Grahn <jgrahn@       ''If All Men Were Brothers,
\X/                algonet.se>   Would You Let One Marry Your Sister?''

Zenin
03-12-2004, 04:17 PM
Mike <mike@nowhere.net> wrote:
> We're using CVS, mostly 1.11.13 on Linux, but with a few 1.10 Windows users.
>
> All access is via ssh or a local mount on a single machine - no NFS or SMB
> is involved in the access to CVS.
>
> However, it seems that whenever we tag the database, some arbitrary number
> of database "*,v" files become corrupt. It's unclear right now whether
> someone else has to access the database while the tagging is going on or
> not...
>
> The corruption is always 16 bytes of garbage overlaid on the bytes of the
> file.
>
> We're tearing our hair out trying to figure out why this would be
> happening.....

The garbage, is it 16 bytes of nulls or random?
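
One way to answer that question, assuming a known-good copy of the affected file is available (from a backup, say): compare the two byte for byte and list each differing run with its length and whether it is all nulls. This is only a sketch in Python; the function names are made up for illustration.

```python
from typing import Iterator, Tuple


def differing_runs(good: bytes, bad: bytes) -> Iterator[Tuple[int, int, bool]]:
    """Yield (offset, length, all_nulls) for each run where bad differs from good.

    Only the common prefix length is compared; a size difference has to be
    reported separately by the caller.
    """
    n = min(len(good), len(bad))
    i = 0
    while i < n:
        if good[i] == bad[i]:
            i += 1
            continue
        start = i
        # Extend the run while the bytes keep differing.
        while i < n and good[i] != bad[i]:
            i += 1
        chunk = bad[start:i]
        yield start, i - start, all(b == 0 for b in chunk)


def report_file_differences(good_path: str, bad_path: str) -> None:
    """Print every differing run between a good copy and a suspect copy."""
    with open(good_path, "rb") as f:
        good = f.read()
    with open(bad_path, "rb") as f:
        bad = f.read()
    for offset, length, nulls in differing_runs(good, bad):
        kind = "nulls" if nulls else "non-null garbage"
        print(f"offset {offset}: {length} bytes of {kind}")
    if len(good) != len(bad):
        print(f"sizes differ: {len(good)} vs {len(bad)} bytes")
```

Note that random garbage will match the original byte by chance about once in 256 bytes, so a 16-byte overwrite can occasionally show up as two shorter adjacent runs.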

About a year ago our Linux-based CVS server had similar issues. We
tracked it down first to large files, and then to effectively any
process that used a lot of RAM (like CVS diffing large files, or doing
much of anything with them); all of them would see this corruption. We
could even take a large file, make a copy, verify that copy with MD5,
and then watch diff(1) freak out as it randomly found these 16-byte
runs of nulls throughout each that weren't really there. It was a
memory issue...
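
A minimal Python sketch of that copy-and-verify check, for anyone who wants to try it on their own server. The function names are hypothetical, and this is not the exact procedure described above: it copies a file, then re-reads the copy several times and compares each MD5 digest against the source.

```python
import hashlib
import shutil
from typing import List, Tuple


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_and_recheck(src: str, dst: str, rounds: int = 5) -> Tuple[str, List[str]]:
    """Copy src to dst, then hash dst repeatedly.

    Returns the digest of src and the list of digests seen for dst.
    """
    shutil.copyfile(src, dst)
    reference = md5_of(src)
    digests = [md5_of(dst) for _ in range(rounds)]
    return reference, digests


def is_stable(reference: str, digests: List[str]) -> bool:
    """True if every re-read of the copy matched the source digest."""
    return all(d == reference for d in digests)
```

If the digests disagree from one read to the next while the file on disk has not been touched, the data is being damaged somewhere between disk and process, not in the file itself. On a healthy machine repeated reads are usually served from the page cache, so a clean result under light load does not rule out a fault that only appears under memory pressure.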

I forget the memory usage required to trigger this, but it was pretty
big.

Our first assumption was that it was a hardware problem, probably RAM,
so we swapped that out, no change. Swapped a bunch of other things
until we gave up and tried the repo on an entirely different box (same
version of Red Hat though). Same problem. It was the OS, not the
hardware, arg...

We decided we didn't care enough to go looking for patches from Red Hat
for this and dumped those two Linux boxes as fast as we could. I
already have a million reasons to not run Linux, this was simply a
million and one. Our repo runs on a Sparc now, Solaris 9, and it's never
been happier. I'd run it on FreeBSD in a heartbeat too, but a lone
FreeBSD box in a Solaris shop makes no sense.

-Zenin

William Tracy
04-03-2004, 01:14 AM
Zenin wrote:
> The garbage, is it 16 bytes of nulls or random?
>
> About a year ago our Linux-based CVS server had similar issues. We
> tracked it down first to large files, and then to effectively any
> process that used a lot of RAM (like CVS diffing large files, or doing
> much of anything with them); all of them would see this corruption. We
> could even take a large file, make a copy, verify that copy with MD5,
> and then watch diff(1) freak out as it randomly found these 16-byte
> runs of nulls throughout each that weren't really there. It was a
> memory issue...
>
> I forget the memory usage required to trigger this, but it was pretty
> big.
>
> Our first assumption was that it was a hardware problem, probably RAM,
> so we swapped that out, no change. Swapped a bunch of other things
> until we gave up and tried the repo on an entirely different box (same
> version of Red Hat though). Same problem. It was the OS, not the
> hardware, arg...
>
> We decided we didn't care enough to go looking for patches from Red Hat
> for this and dumped those two Linux boxes as fast as we could. I
> already have a million reasons to not run Linux, this was simply a
> million and one. Our repo runs on a Sparc now, Solaris 9, and it's never
> been happier. I'd run it on FreeBSD in a heartbeat too, but a lone
> FreeBSD box in a Solaris shop makes no sense.
>
> -Zenin

A very odd release failure happened while I was at Informix. Multiple
releases that had tested fine in R&D were corrupt when they came
back from the media creation department. The reason given was that
there was a hardware problem/failure in the hard disk where the releases
were placed. The corruption had occurred during transfer.

The errors were a certain size (8, 16, or 32 bytes, don't recall which).
And these parts were compared with the original R&D build - those builds
had no such problem.

So, there may be software issues, or hardware/driver issues.

Testing the various steps, and setting up regression testing once the
fault has been determined, will assist in catching this error next time...
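
One possible shape for such a check, sketched in Python under my own assumptions (the function names and the directory layout are hypothetical): record a checksum for every file at the step where the data is known to be good, then verify against that record after each later step, such as a transfer to another disk.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file under root (by relative POSIX path) to its SHA-256 digest."""
    manifest: Dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            manifest[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return manifest


def verify_manifest(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return a list of problems: files that are missing or whose digest changed."""
    current = build_manifest(root)
    problems: List[str] = []
    for name, digest in manifest.items():
        if name not in current:
            problems.append(f"missing: {name}")
        elif current[name] != digest:
            problems.append(f"changed: {name}")
    return problems
```

This suits release trees, where nothing should change after the build. It is less direct for a live CVS repository, since operations like tagging legitimately rewrite the "*,v" files; there the manifest would have to be taken from a frozen copy.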

Good luck, especially if this is not a consistent (not repeatable) problem.

