On 08/17/2010 06:44 AM, Steve Blackwell wrote:
> I leave my computer on 24/7 so that my backups can run at night.
> Lately, it has been crashing during the night usually leaving no trace
> of what happened. Last night it crashed but left this
> in /var/log/messages:
>
> Aug 17 01:04:56 steve kernel: INFO: task kjournald:1960 blocked for more than 120 seconds.
> Aug 17 01:04:56 steve kernel: "echo 0> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Aug 17 01:04:56 steve kernel: kjournald D 00002743 0 1960 2 0x00000080
> Aug 17 01:04:56 steve kernel: cf98fd9c 00000046 ff2f442e 00002743 00032558 00000000 f15c756c cf82d400
> Aug 17 01:04:56 steve kernel: c0a5e6ac c0a63140 f15c756c c0a63140 c0a63140 cf98fd74 c05b61ef f1714e18
> Aug 17 01:04:56 steve kernel: 00000001 00000000 00002743 f15c72c0 b39690c0 1b48082c f6630a60 c2208140
> Aug 17 01:04:56 steve kernel: Call Trace:
> Aug 17 01:04:56 steve kernel: [<c05b61ef>] ? cfq_may_queue+0x48/0xa8
> Aug 17 01:04:56 steve kernel: [<c0793ef7>] io_schedule+0x5f/0x98
> Aug 17 01:04:56 steve kernel: [<c05ac02f>] get_request_wait+0xc7/0x13c
> Aug 17 01:04:56 steve kernel: [<c0454641>] ? autoremove_wake_function+0x0/0x34
> Aug 17 01:04:56 steve kernel: [<c05ac4a4>] __make_request+0x27f/0x386
> Aug 17 01:04:56 steve kernel: [<c04cebd4>] ? __slab_alloc+0x269/0x3f6
> Aug 17 01:04:56 steve kernel: [<c05ab011>] generic_make_request+0x286/0x2d0
> Aug 17 01:04:56 steve kernel: [<c04a77e5>] ? mempool_alloc_slab+0x13/0x15
> Aug 17 01:04:56 steve kernel: [<c04a78b1>] ? mempool_alloc+0x5c/0xf2
> Aug 17 01:04:56 steve kernel: [<c05ab122>] submit_bio+0xc7/0xe0
> Aug 17 01:04:56 steve kernel: [<c04fc9d3>] ? bio_alloc_bioset+0x2a/0xb9
> Aug 17 01:04:56 steve kernel: [<c04f9038>] submit_bh+0xf4/0x114
> Aug 17 01:04:56 steve kernel: [<c0562f74>] journal_commit_transaction+0x38b/0xcc7
> Aug 17 01:04:56 steve kernel: [<c044747a>] ? lock_timer_base+0x26/0x45
> Aug 17 01:04:56 steve kernel: [<c0447696>] ? try_to_del_timer_sync+0x5e/0x66
> Aug 17 01:04:56 steve kernel: [<c0565f1d>] kjournald+0xb8/0x1cc
> Aug 17 01:04:56 steve kernel: [<c0454641>] ? autoremove_wake_function+0x0/0x34
> Aug 17 01:04:56 steve kernel: [<c0565e65>] ? kjournald+0x0/0x1cc
> Aug 17 01:04:56 steve kernel: [<c0454409>] kthread+0x64/0x69
> Aug 17 01:04:56 steve kernel: [<c04543a5>] ? kthread+0x0/0x69
> Aug 17 01:04:56 steve kernel: [<c04041e7>] kernel_thread_helper+0x7/0x10
>
> This happened in the middle of the backup which started at 1:00am and finished (successfully) at 1:28am so perhaps the backup blocked the kjournald process but it didn't crash the computer because there are later messages in the backup log and the messages file.
>
> The last entry in the messages file is:
>
> Aug 17 02:03:55 steve smartd[2347]: Device: /dev/sda [SAT], SMART Prefailure Attribute: 3 Spin_Up_Time changed from 167 to 168
> Aug 17 02:03:55 steve smartd[2347]: Device: /dev/sda [SAT], SMART Usage
> Attribute: 194 Temperature_Celsius changed from 122 to 124
>
> Could a hard drive get shut down because it was getting too hot? What would be a normal temp for a hard drive that has just completed a backup? 124C seems really hot. The HD cooling fan had been broken so I replaced it this past weekend but it doesn't seem to have helped. Too late? Permanent HD damage already done?
> Any other comments or suggestions?
>
> Thanks
> Steve
>
>

Hi Steve,

REPLACE THE DRIVE IMMEDIATELY!! Otherwise, you are courting disaster!
See if it is still under warranty and ask the manufacturer for an RMA.

-- 
users mailing list
users@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe or change subscription options:
https://admin.fedoraproject.org/mailman/listinfo/users
Guidelines: http://fedoraproject.org/wiki/Mailing_list_guidelines
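
For anyone who wants to keep an eye on these smartd messages over time, here is a minimal sketch that pulls the attribute ID, name, and old/new values out of a log line like the ones quoted above. The function name `parse_smartd_change` and the regular expression are my own illustration, not part of smartd. It assumes each message sits on a single line, as it would in /var/log/messages. Note that the number smartd logs may be a normalized attribute value rather than degrees Celsius; the sketch only extracts the numbers and does not interpret them.

```python
import re

# Hypothetical pattern for smartd "attribute changed" messages, modeled on
# the two lines quoted in this thread. Not taken from smartd itself.
SMARTD_CHANGE = re.compile(
    r"Device: (?P<device>\S+).*?"
    r"SMART (?P<kind>Prefailure|Usage) Attribute: "
    r"(?P<id>\d+) (?P<name>\S+) changed from (?P<old>\d+) to (?P<new>\d+)"
)


def parse_smartd_change(line):
    """Return a dict describing the attribute change, or None if no match."""
    m = SMARTD_CHANGE.search(line)
    if m is None:
        return None
    return {
        "device": m.group("device"),
        "kind": m.group("kind"),
        "id": int(m.group("id")),
        "name": m.group("name"),
        "old": int(m.group("old")),
        "new": int(m.group("new")),
    }


if __name__ == "__main__":
    sample = (
        "Aug 17 02:03:55 steve smartd[2347]: Device: /dev/sda [SAT], "
        "SMART Usage Attribute: 194 Temperature_Celsius changed from 122 to 124"
    )
    print(parse_smartd_change(sample))
```

Running something like this over the whole messages file would show whether attribute 194 was drifting steadily before the crashes or only moved during the backup window.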