On 8/27/07, Lonni J Friedman <netllama@xxxxxxxxx> wrote:
> On 8/27/07, Mike <azmr@xxxxxxxxxxxxx> wrote:
> > On Mon, 27 Aug 2007, Lonni J Friedman wrote:
> >
> > >
> > > The timer in /proc/interrupts was not changing, and neither was the
> > > timestamp in /proc/schedstat. ugh.
> > >
> >
> > Sorry for the confusion. Andy Green's posts clear this up and hopefully
> > this will clarify somewhat too.
> >
> > For aeons, well maybe centuries, OK really just decades, the majority of
> > OSes that run on PCs have used IRQ0 to update the system clock. What
> > this means is that 100 times a second (100 Hz) the kernel (in the case
> > of Linux) is interrupted, the system clock is updated among other
> > things, and then the kernel goes about its business. To get better
> > resolution the 100 Hz parameter is changeable at compile time to 100,
> > 250 or 1000 (I believe). I think it defaulted to 1000 Hz recently. In
> > any case, as Andy Green pointed out, modern kernels have gone
> > "tickless". This is a very good thing for laptops, saving power, etc.
> >
> > From what I have read, HPET, or High Precision Event Timer, is what is
> > used now (in the 2.6.21 or so kernel timeframe and beyond). HPET is a
> > device that is part of the chipsets on modern motherboards. From what
> > I've been told, the HPET code is broken in some kernel versions (I mean
> > hey, it's brand new code, sure could be buggy).
> >
> > So to the original poster (Lonni J Friedman?) I'd suggest (as Andy
> > mentioned) checking to see what your clocksource is:
> >
> > cat /sys/devices/system/clocksource/clocksource0/current_clocksource
> >
> > If it says 'hpet', reboot your machine and pass 'hpet=disable' to the
> > kernel via grub. I don't have a machine that I can reboot at the moment,
> > but IIRC at the grub prompt you press 'e' to edit the command before
> > booting. From there I forget, but google should help find the syntax. Or
> > perhaps it's even self-explanatory.
> >
> > In any case this should force your machine to keep time the
> > old-fashioned way, which unless you're on a laptop should be just fine.
> > And if this works you can edit grub.conf (I believe) and add
> > 'hpet=disable' to make this permanent.
>
> My system doesn't have HPET. Also, to clarify, the problem only
> happens after some unknown period of uptime. The system had been up
> for like 100 days prior to the first occurrence, and was up for just 3
> days after the 2nd occurrence. It's not always present. So clearly
> something is very broken somewhere, and isn't going to be fixed just
> by disabling hpet at boot. My best guess is flaky hardware. I'll be
> running memtest86+ tonight. If that checks out, then I can only
> assume that the CPU (Athlon64X2) is dying.

Just to close the loop, this turned out to be a kernel bug. Booting
with nohz=off completely eliminated the problem. This is a tickless
kernel bug.
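
For anyone who lands here with the same symptom, the checks mentioned in
this thread (the current clocksource, whether nohz=off is on the kernel
command line, and the timer count in /proc/interrupts) can be scripted.
What follows is only a rough sketch in Python. The function names are
made up for illustration, and it assumes the IRQ 0 line in
/proc/interrupts looks like "0: <count per CPU> ... timer", which may
not hold on every kernel or architecture.

```python
from pathlib import Path

# sysfs path quoted earlier in the thread.
CLOCKSOURCE = Path(
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"
)


def irq0_count(interrupts_text):
    """Sum the per-CPU counters on the IRQ 0 line of /proc/interrupts text.

    Returns None if no line starting with "0:" is found (assumed layout).
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "0:":
            total = 0
            for field in fields[1:]:
                if not field.isdigit():
                    break  # reached the controller/device name columns
                total += int(field)
            return total
    return None


def has_boot_param(cmdline_text, param):
    """True if param appears as a whole word in /proc/cmdline text."""
    return param in cmdline_text.split()


def report():
    """Print the three values discussed in the thread, where readable."""
    if CLOCKSOURCE.exists():
        print("clocksource:", CLOCKSOURCE.read_text().strip())
    cmdline = Path("/proc/cmdline")
    if cmdline.exists():
        booted_nohz_off = has_boot_param(cmdline.read_text(), "nohz=off")
        print("nohz=off on command line:", booted_nohz_off)
    interrupts = Path("/proc/interrupts")
    if interrupts.exists():
        print("IRQ0 count:", irq0_count(interrupts.read_text()))
```

Calling report() twice a few seconds apart shows whether the IRQ 0
count is moving. Keep in mind that on a tickless kernel a count that
stays put for a while is not by itself proof of a bug, so treat the
output as a hint rather than a diagnosis.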