Linux Format forums
Help, discussion, magazine feedback and more
 

Computer freezing - could this be a RAM problem?
nelz
Site admin


Joined: Mon Apr 04, 2005 12:52 pm
Posts: 8364
Location: Warrington, UK

Posted: Sat Sep 16, 2006 1:21 pm

No one said it because we've HERD it before :D
_________________
"Insanity: doing the same thing over and over again and expecting different results." (Albert Einstein)
spottedcat
LXF regular


Joined: Mon Oct 31, 2005 3:14 pm
Posts: 971
Location: UK

Posted: Sat Sep 16, 2006 2:00 pm

BAAAH! FLOCK off with the silly puns! :D

Update: Swapped the two RAM modules around and reseated the graphics card, then booted into Ubuntu, which promptly froze solid within a couple of minutes and didn't respond to Alt-SysRq-S/U/B either. Reset, booted up and it froze again within five minutes. Reset, booted up and now it's been running fine for two or three hours. I even did a system update (the machine has an nvidia card) successfully - hallelujah!

So - no response to Alt-SysRq-etc with two distros. You say, 'system has to be pretty broken for these to not work', nelz. Do you think my original instinct (dodgy memory) is likely? I won't buy replacements just yet - prices seem to be up at the moment. I think I'll try running the system on just one and then the other 500MB module and see what happens.
nelz
Site admin


Joined: Mon Apr 04, 2005 12:52 pm
Posts: 8364
Location: Warrington, UK

Posted: Sat Sep 16, 2006 2:10 pm

It's possible the kernels were built without support for it. What does
Code:
zgrep SYSRQ /proc/config.gz
report?

Dodgy RAM is unlikely to stop this working, only a locked up kernel or something stopping the kernel reading the keyboard.
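
For anyone following along, here is a minimal sketch of a slightly wider check. It assumes a distro that ships its kernel config as /boot/config-<version>, which is a common convention but not something confirmed for any of the machines in this thread. It also reads /proc/sys/kernel/sysrq, the runtime switch, where 0 means the keys are turned off even if support is compiled in. The helper name find_kernel_config is made up for illustration.

```shell
#!/bin/sh
# Print the first readable file among the candidate kernel config paths.
# find_kernel_config is a hypothetical helper, not a standard tool.
find_kernel_config() {
    for f in "$@"; do
        if [ -r "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}

# /boot/config-$(uname -r) is an assumed fallback location.
if cfg=$(find_kernel_config /proc/config.gz "/boot/config-$(uname -r)"); then
    case $cfg in
        *.gz) gzip -dc < "$cfg" | grep SYSRQ ;;   # compressed config
        *)    grep SYSRQ "$cfg" ;;                # plain-text config
    esac || echo "no SYSRQ lines in $cfg"
else
    echo "no kernel config found in the usual places"
fi

# Runtime switch: 0 = SysRq keys disabled, non-zero = some or all enabled.
if [ -r /proc/sys/kernel/sysrq ]; then
    echo "kernel.sysrq = $(cat /proc/sys/kernel/sysrq)"
fi
```

If the config shows the option compiled in but the runtime value is 0, the keys will still be ignored, which would look the same from the keyboard as a kernel without support.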
spottedcat
LXF regular


Joined: Mon Oct 31, 2005 3:14 pm
Posts: 971
Location: UK

Posted: Sat Sep 16, 2006 6:09 pm

nelz, this is getting curiouser and curiouser. Whatever, I may have garnered some useful information for that FAQ you haven't published yet. :) First, though, what I have established is that both Mepis and Ubuntu will go down as if poleaxed with Alt-SysRq-S/U/B when they are not frozen. They're the ones that wouldn't respond to that key combination when they were. But it's not quite as simple as that. Excuse the length of this post, but I've gone methodically through each of the distros on this machine.

Fedora Core 5. 'zgrep SYSRQ /proc/config.gz' gives an error message because there is no /proc/config.gz. If I do Alt-SysRq-S/U/B in the GNOME desktop it wants to do a screenshot. Or rather, Alt-SysRq-S does - I didn't get as far as U/B. I got 48 dialogue windows before I realised what was happening. :? In KDE there is no response to Alt-SysRq-S/U/B in a normally working system.

Ubuntu. Same error message as Fedora because again there is no /proc/config.gz. Alt-SysRq-S in the GNOME desktop - same as Fedora, but if I do Alt-SysRq-S/U/B in the GDM login screen the system goes down for a reboot. (No KDE on this system.)

Mepis 3.4. No /proc/config.gz again but it does go down for a reboot with Alt-SysRq-S/U/B. (KDE desktop, of course.)

Mandriva 2006. 'zgrep SYSRQ /proc/config.gz' gives:

Code:
# CONFIG_RSBAC_SOFTMODE_SYSRQ is not set
CONFIG_MAGIC_SYSRQ=y


and responds to Alt-SysRq-S/U/B.

PCLinuxOS. The file /proc/config.gz exists but 'zgrep SYSRQ /proc/config.gz' gives no output. No response to Alt-SysRq-S/U/B.

All in all a mixed bag. I'll be interested in your comments. Anyway, I'm getting a locked kernel with several distros. It has to be a hardware problem, surely?
nelz
Site admin


Joined: Mon Apr 04, 2005 12:52 pm
Posts: 8364
Location: Warrington, UK

Posted: Sat Sep 16, 2006 8:13 pm

If /proc/config.gz doesn't exist, it means the kernel was built without the option to make its config available through /proc. If zgrep gives no output, it means the file is there, but MAGIC_SYSRQ is not enabled.
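
A rough sketch of how those cases could be told apart in a script. The function name sysrq_status and its output labels are invented here for illustration. One caveat: a kernel config normally records a switched-off option as a "# CONFIG_... is not set" comment, which 'zgrep SYSRQ' would still print, so the sketch looks specifically for CONFIG_MAGIC_SYSRQ=y rather than for any output at all.

```shell
#!/bin/sh
# Report how a gzipped kernel config answers the MAGIC_SYSRQ question.
# sysrq_status and its labels are hypothetical names for this sketch.
sysrq_status() {
    cfg=$1
    if [ ! -r "$cfg" ]; then
        # Kernel built without the option to export its config here
        echo "no-config"
    elif gzip -dc < "$cfg" | grep -q '^CONFIG_MAGIC_SYSRQ=y'; then
        echo "enabled"
    else
        # File exists, but the option is absent or marked "is not set"
        echo "not-enabled"
    fi
}

sysrq_status /proc/config.gz
```

On the distros listed earlier in the thread, this would be expected to print no-config wherever /proc/config.gz is missing, which says nothing either way about whether SysRq support is compiled in.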

If Alt-SysRq works when the system is running normally but not when it's crashed, it sounds like a kernel crash, which is almost certainly down to faulty hardware. Can you run with only one of your RAM sticks? If so, try one then the other. Memtest is reckoned to be less useful with modern RAM; bad memory can pass, especially after a short test. If you're going to trust memtest, run it at least overnight.
spottedcat
LXF regular


Joined: Mon Oct 31, 2005 3:14 pm
Posts: 971
Location: UK

Posted: Sat Sep 16, 2006 8:53 pm

nelz wrote:
Can you run with only one of your RAM sticks? If so, try one then the other


Yes, thanks. That's what I'll do. I ran the little experiment above on just one module. Thanks for the explanation too.

There's a (mildly amusing) addendum to this story. I bought the memory sticks in PCWorld. (Yes, I know - but by the time you've paid the courier charge and waited in for a delivery which may or may not turn up....) This was the conversation I had there.

Me. "How much are your budget range DDR400, PC3200 500MB RAM modules?"

PCWorld salesman. "39.99 each, sir."

Me. "Out of interest, how much are your better quality branded ones?"

PCWorld salesman. "Let me see. Oh! They're only 31.99 each."

Me. "Can you explain that?"

PCWorld salesman. "Not really."

Me. "OK then. I'll have two of your branded ones."

Perhaps I should have paid extra for the budget range. :(
nelz
Site admin


Joined: Mon Apr 04, 2005 12:52 pm
Posts: 8364
Location: Warrington, UK

Posted: Sat Sep 16, 2006 10:58 pm

You asked a PC World salesman to explain something? Such optimism :)
nordle
LXF regular


Joined: Fri Apr 08, 2005 10:56 pm
Posts: 1500

Posted: Sun Sep 17, 2006 9:40 pm

It could be a temperature issue. The mobo you mentioned has a passively cooled northbridge, doesn't it?

Try running burnMMX for 30 minutes:
http://pages.sbcglobal.net/redelm/

Also, check out this:
http://www.stresslinux.org/

A great distro for burn testing: it tests memory and stresses the CPU. I use burnMMX and stress from this. It has the advantage of booting from a live CD, so it does not mount any drives, making a system lockup that bit safer.
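
A minimal sketch of what a run of the stress tool mentioned above might look like, assuming stress is installed. The worker counts, the 128M figure and the helper name stress_cmd are arbitrary choices for illustration, not recommendations from this thread. The script only prints the command, so nothing heavy starts by accident.

```shell
#!/bin/sh
# Build a stress command line: CPU workers plus one memory worker,
# stopping after the given number of minutes.
# stress_cmd is a hypothetical helper name.
stress_cmd() {
    cpus=$1
    minutes=$2
    printf 'stress --cpu %s --vm 1 --vm-bytes 128M --timeout %ss\n' \
        "$cpus" "$((minutes * 60))"
}

# Two CPU workers for 30 minutes, matching the suggested burn-in time.
stress_cmd 2 30
```

Running the printed command from an installed system loses the safety of the live CD, since the drives will be mounted if the machine locks up.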

Also, check the BIOS to make sure all the settings are "normal", i.e. not "turbo" etc.
_________________
I think, therefore I compile
spottedcat
LXF regular


Joined: Mon Oct 31, 2005 3:14 pm
Posts: 971
Location: UK

Posted: Sun Sep 17, 2006 10:00 pm

Thanks, nordle. I'll look at the links but it's unlikely to be a temperature issue. In Ubuntu and Mepis at least I've got CPU temperature monitoring applets, and the temperature goes nowhere near a problem level, partly because I've fitted a very efficient (and quiet) Zalman cooler which has a PWM controlled fan. Also I've chosen settings in the BIOS such that the CPU speed is halved when a certain temperature is reached and an audible alarm goes off. Can't remember the exact setting I've chosen but it's a conservative one.

Each time it's locked up it wasn't doing anything CPU-intensive at all. When I left it doing a clamav sweep I heard the CPU fan go to maximum, but it didn't lock up then. Only some time later while in Firefox.
nordle
LXF regular


Joined: Fri Apr 08, 2005 10:56 pm
Posts: 1500

Posted: Sun Sep 17, 2006 10:08 pm

Certainly give stresslinux a go.

The temp issue: I was referring to the northbridge, which is passively cooled (I think?), not the CPU.

That heatsink is a BEAST, I've got the same one :) Took a while to fit, but it's 610rpm with CPU @ 29C :)

EDIT:
Is it worth double checking you've not got any loose screws touching the mobo to the case, or just checking clearances? It's possible something expands in the heat and then shorts.
spottedcat
LXF regular


Joined: Mon Oct 31, 2005 3:14 pm
Posts: 971
Location: UK

Posted: Sun Sep 17, 2006 10:17 pm

nordle wrote:
Is it worth double checking you've not got any loose screws touching the mobo to the case, or just checking clearances? It's possible something expands in the heat and then shorts.


Thanks, I'll have a look around next time I open the case. I take your point about it possibly being a northbridge rather than a CPU temperature problem, but I've had the thing lock up on me only a couple of minutes after a cold boot (on more than one occasion), but then go happily for hours with no problem at all. Also - the fancy PSU I mentioned earlier in this thread is said to be good at reducing the ambient temperature inside the case. Well - that's what the blurb says. :)

I'm in PCLinuxOS atm with the affected machine, so if you see this post it hasn't locked up yet. :? Earlier today I was running it on one RAM module only and it froze on me. I've now got it running on the other module and so far.... (touch wood.)
ollie
Moderator


Joined: Mon Jul 25, 2005 12:26 pm
Posts: 2749
Location: Bathurst NSW Australia

Posted: Tue Sep 19, 2006 3:08 am

I have seen problems like this on PCs running Windows and Linux - it nearly always comes down to RAM, power supply or CPU, usually in that order. As a system builder I have given up on cheap RAM - branded or unbranded. The only RAM I use now is Corsair - more expensive but I've only had one problem in three years with a Value Select SD-RAM stick for an old Compaq.

The quality of the PSU and CPU fan does suggest they are not at fault, which leaves your original guess: RAM.
jjmac
LXF regular


Joined: Fri Apr 08, 2005 2:32 am
Posts: 1996
Location: Sydney, Australia

Posted: Tue Sep 19, 2006 9:46 am

A Google search on 'Linux Mepis 6' turns up some interesting feedback. I was looking for the kernel version but the site wouldn't say, not directly anyway, as I have heard that there are some persistent issues with the 2.6.17 kernel that have been obscure.

One post is on a power-saving issue that was freezing a person's install after running for a while and with high CPU usage ...

http://www.linuxquestions.org/questions/showthread.php?t=475411

The poster is using an Intel Celeron 2.93 GHz 478-pin CPU, so I'm not sure if it really relates. And I expect it would have already been looked at. But I thought I'd drop it in anyway.

http://www.linuxquestions.org/questions/showthread.php?t=483590

An almost interesting thread on Mepis freezes. Only one post, but recent. It describes what looks like a problem with AGP not providing a lock ? ...

Anyhow,

Good Luck ...

Have you tried backing the kernel down some incremental versions, if it is a '.17'? No idea what Mepis uses though.


jm
_________________
http://counter.li.org
#313537

The FVWM wm -=- www.fvwm.org -=-

Somebody stole my air guitar, It happened just the other day,
But it's ok, 'cause i've got a spare ...
jdtate101
LXF regular


Joined: Sat May 28, 2005 10:49 am
Posts: 115
Location: Birmingham

Posted: Tue Sep 19, 2006 11:37 am

I had this on my server, which locked up under 5 different Linux distros. In my case it was a bug with the IRQ_BALANCER service not working correctly with my motherboard BIOS. The CPU would lose IRQ mappings to the disks during periods of low CPU activity, and the system would hard lock when any subsequent disk requests were made. Not sure this is going to help you, but have you updated to the latest BIOS for your M/B? My case was only because of the dual core CPU, but as you're running a P4 I guess it's not going to be this bug!

Best of luck with the bug hunting.
_________________
Ubuntu Edgy & Beryl on:

AMD X2 4800+
4GB Corsair TWINX RAM
1.2TB RAID0 SATA2 (3ware RAID)
2 x Seagate 400GB USB2
Dual Layer DVD-RW
Nvidia 7800GT
2 x Viewsonic VP201b TFT
Iomega Rev Internal
spottedcat
LXF regular


Joined: Mon Oct 31, 2005 3:14 pm
Posts: 971
Location: UK

Posted: Tue Sep 19, 2006 2:10 pm

Update: I tried running the system on each of the two RAM sticks singly, and with each I got a freeze-up. :( So, hoping the explanation for this is that both are faulty, I replaced them this morning with a PC2700 DDR-333 500MB module which, although it has a lower bus speed than recommended, had previously run OK with the same CPU in a different motherboard (long story), and has also been running fine in another machine for the last few weeks. So far I have reinstalled Mepis 6, downloaded 100MB+ of updates and done some general system tweaking with, so far, no lock up. I'm posting from there now. But early days yet - the fault is intermittent and can take hours or only minutes before it manifests. I'll run this RAM stick for at least a couple of weeks before drawing any conclusion.

ollie, thanks for the comments and thanks for the link. I'd thought about buying Crucial memory sticks if only because their memory advisor tool takes you through the minefield of ECC versus non-ECC, unbuffered/registered and CAS number, none of which I really understand. And I believe they are a quality brand, so if anyone thinks otherwise I would be glad to hear about it. Interestingly, a quick look on the Corsair UK links shows that from two suppliers the Corsair modules are less expensive than the Crucial ones. But I must double-check that I'm comparing like with like. One lesson I've learnt from all this is not to buy RAM from PCWorld - I was a fool to do so. I was lulled into a false sense of security by one of their budget sticks running fine in another machine.

jjmac, thanks for the links - interesting - but it wasn't just Mepis that was freezing. Ubuntu, PCLinuxOS and Fedora did so as well. What I didn't mention before was that the system logs (in Ubuntu at least) didn't give any clue. Just a load of normal log-type chatter until the time of the freeze-up.

jdtate101, thanks for the input. No, I haven't updated the BIOS, although I had thought about it. As far as the CPU is concerned, I'm not sure whether it has a dual core or not - my technical knowledge is not up to this. :? Much to my surprise Fedora installed the SMP kernel and the following is the relevant part of lshw (courtesy of Ubuntu - Fedora doesn't include lshw :evil:):

Code:
cpu
          description: CPU
          product: Intel(R) Pentium(R) 4 CPU 3.00GHz
          vendor: Intel Corp.
          physical id: 4
          bus info: cpu@0
          version: 15.4.1
          serial: 0000-0F41-0000-0000-0000-0000
          slot: Socket 775
          size: 3GHz
          capacity: 4GHz
          width: 32 bits
          clock: 200MHz
          capabilities: fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx pni monitor ds_cpl cid xtpr
          configuration: id=0
        *-cache:0
             description: L1 cache
             physical id: a
             slot: Internal Cache
             size: 32KB
             capacity: 32KB
             capabilities: synchronous internal write-back
        *-cache:1
             description: L2 cache
             physical id: b
             slot: External Cache
             size: 1MB
             capacity: 1MB
             capabilities: synchronous external write-back
        *-logicalcpu:0
             description: Logical CPU
             physical id: 0.1
             width: 32 bits
             capabilities: logical
        *-logicalcpu:1
             description: Logical CPU
             physical id: 0.2
             width: 32 bits
             capabilities: logical


Or would you get that with hyperthreading technology rather than a dual core?
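
One way to check, sketched under some assumptions: on x86 kernels that report them, /proc/cpuinfo carries a "siblings" field (logical CPUs per physical package) and a "cpu cores" field (physical cores per package). More siblings than cores points to Hyper-Threading. Older kernels may not print "cpu cores" at all, in which case the sketch just says unknown. The "ht" capability flag on its own does not settle it, because it is also set on multi-core chips. The helper name cpu_topology is made up for illustration.

```shell
#!/bin/sh
# Guess the CPU layout from a cpuinfo-style file.
# "siblings"  = logical CPUs per physical package
# "cpu cores" = physical cores per package
# cpu_topology is a hypothetical helper name.
cpu_topology() {
    info=$1
    siblings=$(awk -F'[ \t]*:[ \t]*' '/^siblings/ {print $2; exit}' "$info")
    cores=$(awk -F'[ \t]*:[ \t]*' '/^cpu cores/ {print $2; exit}' "$info")
    if [ -z "$siblings" ] || [ -z "$cores" ]; then
        echo "unknown"
    elif [ "$cores" -gt 1 ] && [ "$siblings" -gt "$cores" ]; then
        echo "multi-core with hyperthreading"
    elif [ "$cores" -gt 1 ]; then
        echo "multi-core"
    elif [ "$siblings" -gt "$cores" ]; then
        echo "hyperthreading"
    else
        echo "single-core"
    fi
}

if [ -r /proc/cpuinfo ]; then
    cpu_topology /proc/cpuinfo
fi
```

The lshw listing above shows a single cpu entry with two logical CPUs under it, which is consistent with one core running Hyper-Threading rather than two separate cores, though the cpuinfo fields would be the more direct evidence.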
All times are GMT

Copyright 2011 Future Publishing, all rights reserved.

