On 8/27/07, Mike <azmr@xxxxxxxxxxxxx> wrote:
> On Mon, 27 Aug 2007, Lonni J Friedman wrote:
>
> > The timer in /proc/interrupts was not changing, and neither was the
> > timestamp in /proc/schedstat. ugh.
>
> Sorry for the confusion. Andy Green's posts clear this up and hopefully
> this will clarify somewhat too.
>
> For aeons, well maybe centuries, ok really just decades, the majority of
> OSes that run on PCs used IRQ0 to update the real time clock. What this
> means is that 100 times a second (100 Hz) the kernel (in the case of
> linux) is interrupted and the real time clock is updated, among other
> things, then the kernel goes about its business. To get better resolution
> the 100 Hz parameter is changeable at compile time to 100, 250 or 1000 (I
> believe). I think it defaulted to 1000Hz recently. In any case, as Andy
> Green pointed out, modern kernels have gone "tickless". This is a very
> good thing for laptops and saving power etc.
>
> From what I have read, HPET or High Precision Event Timer is what is used
> now (in the 2.6.21 or so kernel timeframe and beyond). HPET is a device
> that is part of the chipsets on modern motherboards. From what I've been
> told the HPET code is broken in some kernel versions (I mean hey it's
> brand new code, sure could be buggy).
>
> So to the original poster (Lonni J Friedman?) I'd suggest (as Andy
> mentioned) check to see what your clocksource is:
>
> cat /sys/devices/system/clocksource/clocksource0/current_clocksource
>
> If it says 'hpet' reboot your machine and pass 'hpet=disable' to the
> kernel via grub. I don't have a machine that I can reboot at the moment,
> but IIRC at the grub prompt you press 'e' to edit the command before
> booting. From there I forget but google should help find the syntax. Or
> perhaps it's even self explanatory.
>
> In any case this should force your machine to keep time the old fashioned
> way which unless you're on a laptop should be just fine.
> And if this works you can edit grub.conf (I believe) and add
> 'hpet=disable' to make this permanent.

My system doesn't have HPET. Also, to clarify, the problem only happens
after some unknown period of uptime. The system had been up for roughly
100 days prior to the first occurrence, and was up for just 3 days after
the 2nd occurrence. It's not always present. So clearly something is very
broken somewhere, and it isn't going to be fixed just by disabling hpet at
boot. My best guess is flaky hardware. I'll be running memtest86+ tonight.
If that checks out, then I can only assume that the CPU (Athlon64X2) is
dying.
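
For anyone who wants to catch this state when it recurs, here is a rough sketch of the check I was doing by hand: read the IRQ 0 line of /proc/interrupts twice and see whether the count moved. The function names are made up for illustration, and it assumes the tick is delivered on IRQ 0 as in the older setup described above; on a tickless kernel or one using the local APIC timer, that count may stall on a perfectly healthy box, so treat the result as a hint only.

```python
import time

CLOCKSOURCE = "/sys/devices/system/clocksource/clocksource0/current_clocksource"


def timer_count(interrupts_text):
    """Sum the per-CPU counts on the IRQ 0 line of /proc/interrupts text.

    Returns None if no IRQ 0 line is present.
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "0:":
            total = 0
            for field in fields[1:]:
                # Per-CPU counters come first; stop at the controller name.
                if not field.isdigit():
                    break
                total += int(field)
            return total
    return None


def read_file(path):
    with open(path) as f:
        return f.read()


def current_clocksource(path=CLOCKSOURCE):
    """Return the active clocksource name, or None if the file is missing."""
    try:
        return read_file(path).strip()
    except OSError:
        return None


def timer_is_advancing(interval=1.0, path="/proc/interrupts"):
    """Sample the IRQ 0 count twice, `interval` seconds apart.

    Returns True/False, or None if the IRQ 0 line could not be found.
    """
    before = timer_count(read_file(path))
    time.sleep(interval)
    after = timer_count(read_file(path))
    if before is None or after is None:
        return None
    return after > before
```

Running `timer_is_advancing()` from cron and logging it alongside `current_clocksource()` would at least narrow down how long after boot the counter stops.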