Joe Conway writes:
Upon upgrade to Fedora 13 and the 2.6.33 kernel, the problem has become better in that I have not seen a hang during the nightly crons. However, now I can freeze the kernel on demand by copying a large file via ssh (scp or rsync). Perhaps relevant is that those cron nightly jobs include several rsyncs, however they are all local and do not use ssh. I have already made a bugzilla report, but have not heard a word on that. Given that I can now reproduce the problem more-or-less on demand, can someone tell me how to debug this? I have no experience with kernel debugging, but am reasonably good with gdb on user-space apps.
Based on the assumption that you are experiencing a kernel crash, there are two ways of obtaining diagnostics:
1) A serial console. Go retro: get a null-modem adapter, connect the serial port of the ailing machine to a serial port on another machine, add the "console" parameter to the kernel boot prompt so that console messages are redirected through the serial port, and run minicom on the other machine to capture the incoming serial port data. Hopefully, if your kernel is oopsing or crashing, you will be able to capture a useful log.
2) A recovery kernel, set up with kexec-tools. Add the crashkernel parameter to your kernel boot prompt, reserving 128MB of your RAM for a recovery kernel and a small boot image. When your running kernel crashes, the recovery kernel, loaded by kexec-tools, takes over and saves a kernel dump, which can then be grokked with the crash utility.
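For option 1, a rough sketch of what the setup might look like. The port name (ttyS0), the speed (115200 baud, 8N1), and the capture file name oops.log are all assumptions; adjust them to your hardware. The has_serial_console helper is a made-up name, used only to check that the parameter made it onto the running kernel's command line.

```shell
# Assumption: both machines use their first serial port (ttyS0) at 115200 8N1.
# On the ailing machine, append this to the kernel line in the boot loader
# (tty0 is kept so messages still show up on the local screen):
#   console=tty0 console=ttyS0,115200n8

# Hypothetical helper: succeed if a kernel command line names a serial console.
has_serial_console() {
    case " $1 " in
        *" console=ttyS"*) return 0 ;;
        *) return 1 ;;
    esac
}

# After rebooting the ailing machine, confirm the parameter took effect.
if has_serial_console "$(cat /proc/cmdline 2>/dev/null)"; then
    echo "serial console enabled"
else
    echo "serial console not enabled"
fi

# On the capturing machine, log whatever arrives on its serial port:
#   minicom -D /dev/ttyS0 -b 115200 -C oops.log
```

Start minicom on the capturing machine first, then reproduce the hang; whatever the kernel prints before it dies should end up in the capture file.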
In either case, you have a fair bit of RTFMing to do. And then, after wasting all that time, you may well discover that you're not getting a useful crash in the first place.
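For option 2, a similar sketch, assuming a Fedora-style setup where kexec-tools provides the kdump service and the crash utility and the matching kernel debuginfo are installed separately. The 128M figure is the one mentioned above; the crashkernel_value helper is a made-up name, and the dump location shown in the comments is the usual default, not something I have verified on your system.

```shell
# Append this to the kernel line in the boot loader, then reboot:
#   crashkernel=128M

# Hypothetical helper: print the crashkernel= value from a kernel command line.
crashkernel_value() {
    for word in $1; do
        case "$word" in
            crashkernel=*) echo "${word#crashkernel=}"; return 0 ;;
        esac
    done
    return 1
}

# Check whether the running kernel was booted with a reservation.
if reserved=$(crashkernel_value "$(cat /proc/cmdline 2>/dev/null)"); then
    echo "crashkernel reserved: $reserved"
else
    echo "no crashkernel= parameter on this boot"
fi

# 1 means a recovery kernel is loaded and waiting, 0 means it is not.
cat /sys/kernel/kexec_crash_loaded 2>/dev/null || true

# After a crash, the dump usually lands under /var/crash (assumed default).
# Open it with a vmlinux that has debug symbols, then try "log" and "bt":
#   crash <path to debuginfo vmlinux> <path to vmcore>
```

If the machine hangs hard without panicking, the recovery kernel never gets control and no dump is written, which is one way to end up without a useful crash.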