Hi, thanks for the reply.

> I believe the kernel uses the "hlt" instruction to reduce power
> consumption and system bus competition in its idle loop. For
> some reason, an interrupt doesn't seem to wake up some CPUs properly,
> so this option was added. Instead of simply hlt'ing with interrupts
> enabled, the kernel just spins in a tight loop waiting for an interrupt
> to happen.
>
> Try this:
>
> 1) Change the line "kernel.sysrq=0" to "kernel.sysrq=1" in the file
>    "/etc/sysctl.conf".
>
> 2) As root:
>
>    # echo 1 > /proc/sys/kernel/sysrq
>

OK, did that. As I said, it takes some time for the system to hang, so
when it happens I'll report.

[...]

> A) A system deadlock. Try Alt-SysRq-s and then Alt-SysRq-b to safely
>    reboot your system. Try disabling some of your modules.
>

There isn't much running; in fact, all the machines are just a Beowulf
cluster with minimal stuff on them. The modules loaded are:

Module                  Size  Used by    Not tainted
autofs                 13780   0  (autoclean) (unused)
nfs                    89912   0  (autoclean)
lockd                  60656   0  (autoclean) [nfs]
sunrpc                 90876   0  (autoclean) [nfs lockd]
e100                   58468   1
floppy                 58908   0  (autoclean)
sg                     37580   0  (autoclean) (unused)
scsi_mod              111528   1  (autoclean) [sg]
microcode               5024   0  (autoclean)
ext3                   74148   2
jbd                    56624   2  [ext3]

I suppose I could rmmod floppy, sg and scsi_mod. I have no idea why sg
and scsi_mod are loaded, as the machine has only an IDE HD (no CD or
burner or anything SCSI-ish on it).

> B) Broken hardware.
>
> C) A buggy BIOS that is incorrectly handling interrupt assignments or
>    power management (try adding "apm=off" and "acpi=off" to the
>    kernel boot arguments).
>

Hmm, if I try this and then boot without no-hlt, should it start if this
is the problem? I have another machine to play with while waiting for
the first to hang.

> D) Some other problem ;-)
>

Hope not!

> Hope this gives you a direction to proceed.
>

Oh yeah, thanks!

GianPiero
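
For reference, the rmmod candidates in the module list above could be picked out with a small helper. This is only a sketch: the function name unused_modules is made up for illustration, and it assumes lsmod prints the module name, size and use count as the first three columns, as in the listing above. A use count of 0 only means nothing holds the module right now, not that removing it is a good idea.

```shell
#!/bin/sh
# Hypothetical helper: read lsmod-style output on stdin and print the
# names of modules whose use count (third column) is 0.
# The header line is skipped because its second column is not a number.
unused_modules() {
    awk '$2 ~ /^[0-9]+$/ && $3 == 0 { print $1 }'
}
```

Typical use would be "lsmod | unused_modules", then rmmod on the names that really aren't needed. Modules with a nonzero count, like scsi_mod here, only show up once whatever uses them (sg) has been removed.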