
View Full Version : possible faulty batch of P4's?


Vaclav Dvorak
12-18-2003, 01:46 PM
Hello,

I've had a very strange occurrence with what seems to be a batch of
faulty CPU's. It seems very unlikely that this would be the case without
being widely known, so I want to ask you if you can confirm it, or if
you have another explanation.

The computer has a Soltek SL-85DRV4-C mainboard, with a VIA P4X266E
chipset. The original CPU was an Intel Pentium 4 1.6GHz, the FSB was
running at 400MHz and there was one 256MB 266MHz DIMM. The computer was
running fine.

The problem started with a CPU upgrade. A new 2.8GHz Pentium 4 was
installed (thus the FSB was running at 533MHz), and WinXP started
crashing. To ensure the hard disk was backed up before doing any
experiments, I booted from a Linux CD and tarred the whole filesystem to
another machine's harddisk over the network. This is how I detected the
fault: the resulting archive was corrupted. I repeated the operation
many times and the corruption occurred nearly always; sometimes there
were even kernel oopses. (Just to be completely clear on this: the
archive itself was corrupted, not (apparently) the files inside it. This
isn't a disk problem, or even a network problem, as I reproduced the
corruption even when compressing and decompressing on the same machine.
I also tried a different (de)compressor, and a different kernel version,
to eliminate a possible software problem.)

I tried three different memory modules (the original 256MB/266MHz, a
256MB/333MHz and a 512MB/266MHz one), the behaviour was the same. Then I
tried the CPU in another mainboard (a new one, different brand, SiS
chipset) - same problem. So I presumed the CPU was faulty and had the
supplier replace it. (I went to great lengths to verify that it was
indeed the CPU that was the problem, because this seemed very unlikely
according to my previous experience. I've seen faulty CPU's that simply
didn't boot or that crashed, but not one that silently corrupts data,
AFAICR.)

Unfortunately, the replacement CPU exhibits the same problem, and that's
what prompted me to write this post. It seems extremely unlikely that
the replacement is also faulty, and yet I don't see any other
explanation. Before I go and replace yet another CPU just to find that
the problem doesn't go away and the supplier starts getting suspicious,
I thought perhaps someone here could have some idea or information that
would be useful.

Some additional info. The CPU is not overclocked. Nothing is apparently
overheating. The exact model of the CPU is SL6QB, cpuid D1, stepping 9.
The numbers on the CPU box are: product code BX80532PE2800DSL6QB, MM#
850394, FPO# L335B036, version# C30172-002. I tried removing all
non-essential hardware (modem, NIC, IDE+USB controller, various drives).
I also tried swapping the power supply (a cheap 300W model) for a
different type (but also a cheap 300W model). Obviously, nothing helped,
otherwise you wouldn't be reading this. ;-)

Thanks for any advice!

Vaclav Dvorak

Scott
12-18-2003, 04:21 PM
"Vaclav Dvorak" <dvorakv@idas.cz> wrote in message
news:brt78n$158m$1@news.vol.cz... Hello,
snip
I thought perhaps someone here could have some idea or information that would be useful.
Snip
Thanks for any advice! Vaclav Dvorak

So you're getting data corruption when copying from/to the onboard IDE
controllers?

BIOS upgrades always worth a shot. Set BIOS fail safe settings and see if
that improves things.

Were both MB's running slower P4's? It could be a speed/MB related issue or
perhaps an overloaded PSU?

I don't think it would be the processor.

Scott A.

Never anonymous Bud
12-18-2003, 04:44 PM
While still snuggled in a 'spider hole', Vaclav Dvorak <dvorakv@idas.cz>
scribbled:
I also tried swapping the power supply (a cheap 300W model) for a different type (but also a cheap 300W model). Obviously, nothing helped, otherwise you wouldn't be reading this. ;-)

I'd try a BETTER PS before I'd worry about anything else.





To reply by email, remove the XYZ.

Lumber Cartel (tinlc) #2063. Spam this account at your own risk.

It's your SIG, say what you want to say....

Stacey
12-18-2003, 07:49 PM
Vaclav Dvorak wrote:
Hello, I've had a very strange occurrence with what seems to be a batch of faulty CPU's. It seems very unlikely that this would be the case without being widely known, so I want to ask you if you can confirm it, or if you have another explanation. The computer has a Soltek SL-85DRV4-C mainboard, with a VIA P4X266E chipset.

First sign of trouble..

I also tried swapping the power supply (a cheap 300W model) for a different type (but also a cheap 300W model).

Next place to look.. I doubt very seriously it's 2 bad chips.

--

Stacey

Vaclav Dvorak
12-19-2003, 12:12 PM
Thanks for the replies!

Scott wrote: So you're getting data corruption when copying from/to the onboard IDE controllers?

No, that's not the problem. The thing I did to test for the problem was
the following command:

tar cz /mnt/hda1/WINDOWS | gunzip > /dev/null

That means, I was compressing a lot of data, piping the output straight
into the decompressor, and discarding its output. The decompressor
reported a CRC error on the archive. So the problem is most definitely
NOT a hard-disk or controller issue: the corruption occurred either
during the compression, or during decompression, or in the "pipe"
between them, which is entirely in memory, no harddisk was involved.
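
A rough sketch of a stricter variant of that test, assuming a POSIX shell with gzip and cksum available. Instead of relying only on gunzip's own CRC report, it checksums the data before compression and again after decompression, so a silent change would show up even if the archive check happened to pass. The function name roundtrip_check is made up for illustration, and unlike the command above it reads its input from a single file, twice, so the disk is involved in this version.

```shell
# roundtrip_check FILE  (hypothetical helper, not part of the original test)
# Pipes FILE through gzip and gunzip, then compares a checksum of the
# original bytes with a checksum of the round-tripped bytes.
roundtrip_check() {
    # cksum on stdin prints "<crc> <byte count>" with no file name
    before=$(cksum < "$1")
    after=$(gzip -c < "$1" | gunzip -c | cksum)
    if [ "$before" = "$after" ]; then
        echo "OK"
    else
        echo "MISMATCH: before=$before after=$after"
        return 1
    fi
}
```

Calling roundtrip_check on some large file prints OK when the two checksums agree, and returns a non-zero status with both checksums shown when they do not.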

In fact, I think I could probably reproduce the problem by compressing
and decompressing /dev/urandom, i.e. a kernel-generated source of
pseudo-random data.
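
That idea might look something like the loop below. It is only a sketch under assumptions: the function name stress_roundtrip, the number of passes and the chunk size are invented for illustration, and it has not been run on the machine in question. Each pass takes fresh data from /dev/urandom, so no disk is involved at all, and a pass counts as failed when gunzip exits with an error (which is what it does on a CRC mismatch).

```shell
# stress_roundtrip PASSES MEGABYTES  (hypothetical helper)
# Repeats an in-memory compress/decompress round trip on random data
# and counts how many passes make gunzip report an error.
stress_roundtrip() {
    passes=$1
    mb=$2
    failures=0
    i=0
    while [ "$i" -lt "$passes" ]; do
        # The pipeline's exit status is that of gunzip, the last command
        if ! dd if=/dev/urandom bs=1048576 count="$mb" 2>/dev/null |
                gzip -c | gunzip -c > /dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures of $passes passes failed"
    [ "$failures" -eq 0 ]
}
```

For example, stress_roundtrip 20 64 would push roughly 64MB of random data through the pipe twenty times. A healthy machine should report zero failed passes.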

Scott wrote: BIOS upgrades always worth a shot. Set BIOS fail safe settings and see if that improves things.

Hmm, Soltek does offer two versions of the BIOS for download at
http://www.soltek.com.tw/English/download/S85DRV4-C.htm. The older
version has no comment, the newer says "Fixed shutdown temperature",
which is not a problem here. Perhaps it would be worth a try.

Regarding fail-safe settings, I did my best to try these. The setup
didn't do anything when I pressed the key for loading fail-safe
defaults, but I manually set everything to values I thought safe
(basically, as slow as possible).

Scott wrote: Were both MB's running slower P4's? It could be a speed/MB related issue or perhaps an overloaded PSU?

The other mainboard was a new one, it's supposed to support speeds of
3.06GHz and more according to the manual. The Soltek is over a year old
and such speeds didn't even exist at the time it was made, but it is
supposed to support 533MHz FSB.

Regarding the PSU, that's what the other two people suggested, too. I
unplugged all the drives and cards that weren't necessary for the
testing: second HDD, second CDROM, ZIP, modem, SBLive sound card,
additional IDE controller; only things left were an AGP VGA card, a HDD
and a CDROM. So I doubt the PSU could be overloaded.

Anyway, my supplier is out of stock for more powerful PSU's currently,
so I couldn't try that. :-( Still, I've seen many P4 computers at
similar speeds with this exact PSU type working flawlessly.

Scott wrote: I don't think it would be the processor.

Well, I find it hard to believe, too, but the two CPU's were from the
same batch, so I thought a CPU fault was perhaps slightly more likely
than a fault in any of the other parts, since those have been swapped
for genuinely different types.

Anyway, I won't be able to try anything new for a few weeks now (the PC
is running with the old CPU in the meantime). Still, any other ideas
would be appreciated.

Vaclav Dvorak

