Andrew Scott wrote:
Good advice. The disk isn't currently mounted, and I'm running badblocks on it in read-only mode, writing the output to a file. Interesting side note: the output file has been created but no bad blocks show up in it yet -- does badblocks only write to the output file on exit? Otherwise, perhaps it's just the drive controller or the SCSI card that is throwing errors and the data is safe and sound (oh, I hope this is true).
Ahh, you're using scsi?
Can you post excerpts from your kernel logs?
Yup, what follows is the dmesg output from right after the disk bombed out on the last bzip2recover operation. Current logs show no new errors while badblocks is running.
Sorry about the "paste as quotation", but that was the simplest option to get the error into this message.
Which other logs would be useful?
-Andrew
hci_hcd 0000:00:07.2: UHCI Host Controller
uhci_hcd 0000:00:07.2: irq 11, io base 0000ef80
uhci_hcd 0000:00:07.2: new USB bus registered, assigned bus number 1
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
EXT3 FS on hda2, internal journal
device-mapper: 4.1.0-ioctl (2003-12-10) initialised: dm@xxxxxxxxxxxxxx
cdrom: open failed.
cdrom: open failed.
Adding 522072k swap on /dev/hda5. Priority:-1 extents:1
kjournald starting. Commit interval 5 seconds
EXT3 FS on hda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on hda3, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
parport0: PC-style at 0x378 [PCSPP]
inserting floppy driver for 2.6.5-1.358
Floppy drive(s): fd0 is 1.44M
FDC 0 is a National Semiconductor PC87306
Linux Tulip driver version 1.1.13 (May 11, 2002)
tulip0: MII transceiver #1 config 3000 status 7829 advertising 01e1.
divert: allocating divert_blk for eth0
eth0: Lite-On 82c168 PNIC rev 32 at 0x1284df00, 00:A0:CC:57:7D:1E, IRQ 10.
divert: freeing divert_blk for eth0
ip_tables: (C) 2000-2002 Netfilter core team
Linux Tulip driver version 1.1.13 (May 11, 2002)
tulip0: MII transceiver #1 config 1000 status 782d advertising 01e1.
divert: allocating divert_blk for eth0
eth0: Lite-On 82c168 PNIC rev 32 at 0x1284df00, 00:A0:CC:57:7D:1E, IRQ 10.
ip_tables: (C) 2000-2002 Netfilter core team
eth0: Setting full-duplex based on MII#1 link partner capability of 41e1.
NET: Registered protocol family 10
Disabled Privacy Extensions on device 022db720(lo)
IPv6 over IPv4 tunneling driver
divert: not allocating divert_blk for non-ethernet device sit0
eth0: no IPv6 routers present
input: AT Translated Set 2 keyboard on isa0060/serio0
kjournald starting. Commit interval 5 seconds
EXT3 FS on hdb1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
found reiserfs format "3.6" with standard journal
reiserfs: using ordered data mode
Reiserfs journal params: device sda1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
reiserfs: checking transaction log (sda1) for (sda1)
Using r5 hash to sort names
scsi0:0:0:0: Attempting to queue an ABORT message
CDB: 0x28 0x0 0x1 0x1b 0x1b 0x17 0x0 0x0 0x38 0x0
scsi0: At time of recovery, card was not paused
scsi0: Dumping Card State while idle, at SEQADDR 0x8
Dump Card State Begins <<<<<<<<<<<<<<<<<
Card was paused
ACCUM = 0xed, SINDEX = 0x64, DINDEX = 0x65, ARG_2 = 0x0
HCNT = 0x0 SCBPTR = 0x0
SCSISIGI[0x0] ERROR[0x0] SCSIBUSL[0x0] LASTPHASE[0x1] SCSISEQ[0x12] SBLKCTL[0x2] SCSIRATE[0x0] SEQCTL[0x10] SEQ_FLAGS[0xc0] SSTAT0[0x5] SSTAT1[0xa] SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x0] SIMODE1[0xa4] SXFRCTL0[0x80] DFCNTRL[0x0] DFSTATUS[0x2d]
STACK: 0x0 0x150 0x191 0x3
SCB count = 8
Kernel NEXTQSCB = 3
Card NEXTQSCB = 3
QINFIFO entries:
Waiting Queue entries:
Disconnected Queue entries: 0:1 1:2 2:7 3:0
QOUTFIFO entries:
Sequencer Free SCB List: 4 5 6 7 8 9 10 11 12 13 14 15
Sequencer SCB Info:
0 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0x1]
1 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0x2]
2 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0x7]
3 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0x0]
4 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
5 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
6 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
7 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
8 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
9 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
10 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
11 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
12 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
13 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
14 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
15 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
Pending list:
1 SCB_CONTROL[0x60] SCB_SCSIID[0x7] SCB_LUN[0x0]
2 SCB_CONTROL[0x60] SCB_SCSIID[0x7] SCB_LUN[0x0]
7 SCB_CONTROL[0x60] SCB_SCSIID[0x7] SCB_LUN[0x0]
0 SCB_CONTROL[0x60] SCB_SCSIID[0x7] SCB_LUN[0x0]
Kernel Free SCB list: 6 5 4
DevQ(0:0:0): 0 waiting
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
(scsi0:A:0:0): Device is disconnected, re-queuing SCB
Recovery code sleeping
(scsi0:A:0:0): Abort Tag Message Sent
Recovery code awake
Timer Expired
aic7xxx_abort returns 0x2003
scsi0:0:0:0: Attempting to queue an ABORT message
CDB: 0x28 0x0 0x1 0x1b 0x1b 0x4f 0x0 0x1 0x0 0x0
scsi0: At time of recovery, card was not paused
scsi0: Dumping Card State in Message-out phase, at SEQADDR 0x156
Dump Card State Begins <<<<<<<<<<<<<<<<<
Card was paused
ACCUM = 0xa0, SINDEX = 0x61, DINDEX = 0xc0, ARG_2 = 0x0
HCNT = 0x0 SCBPTR = 0x3
SCSISIGI[0xa4] ERROR[0x0] SCSIBUSL[0xd] LASTPHASE[0xa0] SCSISEQ[0x12] SBLKCTL[0x2] SCSIRATE[0xc8] SEQCTL[0x10] SEQ_FLAGS[0x40] SSTAT0[0x5] SSTAT1[0x2] SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x0] SIMODE1[0xac] SXFRCTL0[0x88] DFCNTRL[0x4] DFSTATUS[0x6d]
STACK: 0xcb 0x0 0x150 0x191
SCB count = 8
Kernel NEXTQSCB = 3
Card NEXTQSCB = 3
QINFIFO entries:
Waiting Queue entries:
Disconnected Queue entries: 0:1 1:2 2:7
QOUTFIFO entries:
Sequencer Free SCB List: 4 5 6 7 8 9 10 11 12 13 14 15
Sequencer SCB Info:
0 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0x1]
1 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0x2]
2 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0x7]
3 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0x0]
4 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
5 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
6 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
7 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
8 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
9 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
10 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
11 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
12 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
13 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
14 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
15 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
Pending list:
1 SCB_CONTROL[0x60] SCB_SCSIID[0x7] SCB_LUN[0x0]
2 SCB_CONTROL[0x60] SCB_SCSIID[0x7] SCB_LUN[0x0]
7 SCB_CONTROL[0x60] SCB_SCSIID[0x7] SCB_LUN[0x0]
0 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0]
Kernel Free SCB list: 6 5 4
DevQ(0:0:0): 0 waiting
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
(scsi0:A:0:0): Device is disconnected, re-queuing SCB
Recovery code sleeping
Recovery code awake
Timer Expired
aic7xxx_abort returns 0x2003
scsi0:0:0:0: Attempting to queue an ABORT message
CDB: 0x2a 0x0 0x2 0x65 0x48 0x47 0x0 0x0 0x28 0x0
scsi0: At time of recovery, card was not paused
scsi0: Dumping Card State in Message-out phase, at SEQADDR 0x156
Dump Card State Begins <<<<<<<<<<<<<<<<<
Card was paused
ACCUM = 0xa0, SINDEX = 0x61, DINDEX = 0xc0, ARG_2 = 0x0
HCNT = 0x0 SCBPTR = 0x3
SCSISIGI[0xa4] ERROR[0x0] SCSIBUSL[0xd] LASTPHASE[0xa0] SCSISEQ[0x12] SBLKCTL[0x2] SCSIRATE[0xc8] SEQCTL[0x10] SEQ_FLAGS[0x40] SSTAT0[0x5] SSTAT1[0x2] SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x0] SIMODE1[0xac] SXFRCTL0[0x88] DFCNTRL[0x4] DFSTATUS[0x6d]
STACK: 0xcb 0x0 0x150 0x191
SCB count = 8
Kernel NEXTQSCB = 3
Card NEXTQSCB = 7
QINFIFO entries: 7
Waiting Queue entries:
Disconnected Queue entries: 0:1 1:2
QOUTFIFO entries:
Sequencer Free SCB List: 2 4 5 6 7 8 9 10 11 12 13 14 15
Sequencer SCB Info:
0 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0x1]
1 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0x2]
2 SCB_CONTROL[0x0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
3 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0x0]
4 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
5 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
6 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
7 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
8 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
9 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
10 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
11 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
12 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
13 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
14 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
15 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
Pending list:
1 SCB_CONTROL[0x60] SCB_SCSIID[0x7] SCB_LUN[0x0]
2 SCB_CONTROL[0x60] SCB_SCSIID[0x7] SCB_LUN[0x0]
7 SCB_CONTROL[0x74] SCB_SCSIID[0x7] SCB_LUN[0x0]
0 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0]
Kernel Free SCB list: 6 5 4
DevQ(0:0:0): 0 waiting
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
Recovery SCB completes
(scsi0:A:0:0): Device is disconnected, re-queuing SCB
Recovery code sleeping
Recovery code awake
Timer Expired
aic7xxx_abort returns 0x2003
scsi0:0:0:0: Attempting to queue an ABORT message
CDB: 0x2a 0x0 0x2 0x65 0x48 0xef 0x0 0x0 0x40 0x0
scsi0: At time of recovery, card was not paused
scsi0: Dumping Card State in Message-out phase, at SEQADDR 0x156
Dump Card State Begins <<<<<<<<<<<<<<<<<
Card was paused
ACCUM = 0xa0, SINDEX = 0x61, DINDEX = 0xc0, ARG_2 = 0x0
HCNT = 0x0 SCBPTR = 0x3
SCSISIGI[0xa4] ERROR[0x0] SCSIBUSL[0xd] LASTPHASE[0xa0] SCSISEQ[0x12] SBLKCTL[0x2] SCSIRATE[0xc8] SEQCTL[0x10] SEQ_FLAGS[0x40] SSTAT0[0x5] SSTAT1[0x2] SSTAT2[0x0] SSTAT3[0x0] SIMODE0[0x0] SIMODE1[0xac] SXFRCTL0[0x88] DFCNTRL[0x4] DFSTATUS[0x6d]
STACK: 0xcb 0x0 0x150 0x191
SCB count = 8
Kernel NEXTQSCB = 3
Card NEXTQSCB = 2
QINFIFO entries: 2
Waiting Queue entries:
Disconnected Queue entries: 0:1
QOUTFIFO entries:
Sequencer Free SCB List: 1 2 4 5 6 7 8 9 10 11 12 13 14 15
Sequencer SCB Info:
0 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0x1]
1 SCB_CONTROL[0x0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
2 SCB_CONTROL[0x0] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0xff]
3 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0] SCB_TAG[0x0]
4 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
5 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
6 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
7 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
8 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
9 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
10 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
11 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
12 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
13 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
14 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
15 SCB_CONTROL[0x0] SCB_SCSIID[0xff] SCB_LUN[0xff] SCB_TAG[0xff]
Pending list:
1 SCB_CONTROL[0x60] SCB_SCSIID[0x7] SCB_LUN[0x0]
2 SCB_CONTROL[0x74] SCB_SCSIID[0x7] SCB_LUN[0x0]
0 SCB_CONTROL[0x64] SCB_SCSIID[0x7] SCB_LUN[0x0]
Kernel Free SCB list: 7 6 5 4
DevQ(0:0:0): 0 waiting
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
Recovery SCB completes
(scsi0:A:0:0): Device is disconnected, re-queuing SCB
Recovery code sleeping
Recovery code awake
Timer Expired
aic7xxx_abort returns 0x2003
scsi0:0:0:0: Attempting to queue a TARGET RESET message
CDB: 0x28 0x0 0x1 0x1b 0x1b 0x17 0x0 0x0 0x38 0x0
aic7xxx_dev_reset returns 0x2003
Recovery SCB completes
Recovery SCB completes
scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0
scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0
scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0
scsi: Device offlined - not ready after error recovery: host 0 channel 0 id 0 lun 0
SCSI error : <0 0 0 0> return code = 0x10000
end_request: I/O error, dev sda, sector 18553623
scsi0 (0:0): rejecting I/O to offline device
scsi0 (0:0): rejecting I/O to offline device
scsi0 (0:0): rejecting I/O to offline device
scsi0 (0:0): rejecting I/O to offline device
scsi0 (0:0): rejecting I/O to offline device
scsi0 (0:0): rejecting I/O to offline device
SCSI error : <0 0 0 0> return code = 0x10000
end_request: I/O error, dev sda, sector 18553679
scsi0 (0:0): rejecting I/O to offline device
SCSI error : <0 0 0 0> return code = 0x10000
end_request: I/O error, dev sda, sector 40192071
scsi0 (0:0): rejecting I/O to offline device
SCSI error : <0 0 0 0> return code = 0x10000
end_request: I/O error, dev sda, sector 40192239
scsi0 (0:0): rejecting I/O to offline device
scsi0 (0:0): rejecting I/O to offline device
Buffer I/O error on device sda1, logical block 4483
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 4484
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 4485
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 4486
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 4487
lost page write due to I/O error on sda1
Buffer I/O error on device sda1, logical block 4488
lost page write due to I/O error on sda1
journal-601, buffer write failed
------------[ cut here ]------------
kernel BUG at fs/reiserfs/prints.c:338!
invalid operand: 0000 [#1]
CPU: 0
EIP: 0060:[<129a6bcd>] Not tainted
EFLAGS: 00010206 (2.6.5-1.358)
EIP is at reiserfs_panic+0x21/0x4b [reiserfs]
eax: 00000024 ebx: 0dba2400 ecx: 00000000 edx: 0fd30f6c
esi: 0b430d10 edi: 0dba2400 ebp: 00000000 esp: 11e36d70
ds: 007b es: 007b ss: 0068
Process pdflush (pid: 9, threadinfo=11e36000 task=11e1a630)
Stack: 129b8637 129c2320 00001000 129b02ce 0dba2400 129b994d 06226000 0b430d10 00000001 0cdfd3dc 0b430110 0b430d10 0b430750 0dba2400 129b416f 00002000 00000000 0fb47000 1127b57c 11e36000 000003fa 0000f137 00000006 0b430d10
Call Trace:
[<129b02ce>] flush_commit_list+0x271/0x328 [reiserfs]
[<129b416f>] do_journal_end+0x983/0x9a2 [reiserfs]
[<0212ff39>] pdflush+0x0/0x1e
[<129b304c>] journal_end_sync+0x52/0x57 [reiserfs]
[<129a4513>] reiserfs_sync_fs+0x36/0x5d [reiserfs]
[<02145cf7>] sync_supers+0x6c/0x8e
[<0212f8cb>] wb_kupdate+0x2f/0xf4
[<0227ecc1>] schedule+0x3ed/0x44d
[<0212feb2>] __pdflush+0xbe/0x145
[<0212ff53>] pdflush+0x1a/0x1e
[<0212f89c>] wb_kupdate+0x0/0xf4
[<0212ff39>] pdflush+0x0/0x1e
[<02125265>] kthread+0x69/0x91
[<021251fc>] kthread+0x0/0x91
[<021041d9>] kernel_thread_helper+0x5/0xb
Code: 0f 0b 52 01 3d 86 9b 12 59 8d 93 2c 01 00 00 85 db 58 b8 52
<3>scsi0 (0:0): rejecting I/O to offline device
Buffer I/O error on device sda1, logical block 2319195
scsi0 (0:0): rejecting I/O to offline device
Buffer I/O error on device sda1, logical block 2319195
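A side note for anyone reading the log above: the CDB lines can be decoded by hand to see which sectors the failed commands were addressing. The sketch below assumes the standard 10-byte READ(10)/WRITE(10) layout (opcode, flags, four big-endian LBA bytes, group, two length bytes, control). The function name decode_cdb10 is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# decode_cdb10 B0 B1 ... B9  (hypothetical helper, not a real utility)
# Takes the ten hex bytes exactly as the driver prints them and reports the
# opcode, the starting logical block address and the transfer length.
decode_cdb10() {
    if [ "$#" -ne 10 ]; then
        echo "usage: decode_cdb10 <10 CDB bytes>" >&2
        return 1
    fi
    case "$1" in
        0x28) op="READ(10)" ;;
        0x2a) op="WRITE(10)" ;;
        *)    op="opcode $1" ;;
    esac
    # Bytes 2-5 (arguments 3-6) hold the big-endian LBA.
    lba=$(( ($3 << 24) | ($4 << 16) | ($5 << 8) | $6 ))
    # Bytes 7-8 (arguments 8-9) hold the transfer length in blocks.
    len=$(( ($8 << 8) | $9 ))
    echo "$op lba=$lba blocks=$len"
}
```

Feeding it the first CDB in the log (0x28 0x0 0x1 0x1b 0x1b 0x17 0x0 0x0 0x38 0x0) gives a READ(10) at LBA 18553623 for 56 blocks, which is the sector named in the first end_request error. The first 0x2a CDB decodes to a WRITE(10) at LBA 40192071, matching the third one.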
Yeah, run
hdparm -d0 /dev/drive
Excellent idea. I'll do this once badblocks finishes (looks like another hour). Though hdparm /dev/sda doesn't really return much along the lines of configurable options, I'll have to try this nonetheless. Thanks.
Sorry, this won't help with scsi... :(
and then:
dd bs=1 if=your-file-to-recover of=file-on-a-different-drive
Also, excellent idea. I was trying to read at the filesystem block size of 4K. Totally stupid, I should go bit by bit!
Actually byte by byte...
I'll try these things next. Thanks for your thoughts. Excellent suggestions.
This will copy your file one byte at a time; the extra processing overhead slows the copy down.
That is a crude way to throttle it, obviously. I don't know of any tools that rate-limit file copying, except maybe rsync, and I'm not sure about that either.
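Not something suggested in this thread, but a variation on the dd command above that may be worth trying: dd's standard conv=noerror,sync options keep going past unreadable blocks and pad them with zeros, so one bad sector costs one block rather than the rest of the file. A minimal sketch, with rescue_copy as a made-up wrapper name and 512 bytes as an assumed sector size:

```shell
#!/bin/sh
# rescue_copy SRC DST [BLOCKSIZE]  (hypothetical wrapper around dd)
# Copies SRC to DST in small blocks. noerror tells dd to continue after a
# read error; sync pads every short or failed block with NUL bytes so the
# offsets of the data that follows stay correct.
rescue_copy() {
    src=$1
    dst=$2
    bs=${3:-512}   # assumption: 512-byte sectors
    dd if="$src" of="$dst" bs="$bs" conv=noerror,sync
}
```

Usage would be something like rescue_copy your-file-to-recover file-on-a-different-drive. One caveat: because of the padding, the copy can end up a few bytes longer than the original when the file size is not a multiple of the block size, so the checksums only match after trimming the copy back to the original length.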
I emailed the guys at Namesys (reiserfs headquarters in Oakland, CA). They have a standing offer of "Ask any questions for $25". I sent them $25 and asked them a question. Hans Reiser got back to me, as did another employee, both with good suggestions. They suspected the hardware immediately. They made one really keen suggestion: if the bit count is identical on the original and the copy (when copied to another filesystem), but the md5sums are different, then try running bindiff on the two files and use a binary editor to toggle the differing bits, with the goal of a correct md5sum match. I imagine this will be the last thing
That's nice, but don't try that on the entire 2 GB file, split it up first...
Good idea, but how can I tell the split-up files are actually good copies?
Run cmp (compare) on the two files; it will tell you where they are different.
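A rough sketch of how the split-then-cmp idea might look in practice, assuming two copies of the same file already exist on a healthy drive. The function name diff_chunks and the 1 MB default chunk size are made up for illustration.

```shell
#!/bin/sh
# diff_chunks FILE_A FILE_B [CHUNK_BYTES]  (hypothetical helper)
# Splits both files into equally sized pieces in a temporary directory and
# runs cmp on each pair, printing the pieces that differ together with the
# byte offsets and octal byte values reported by cmp -l.
diff_chunks() {
    a=$1
    b=$2
    size=${3:-1048576}
    work=$(mktemp -d) || return 1
    split -b "$size" "$a" "$work/a."
    split -b "$size" "$b" "$work/b."
    for piece in "$work"/a.*; do
        suffix=${piece##*.}          # split names pieces aa, ab, ac, ...
        if ! cmp -s "$piece" "$work/b.$suffix" 2>/dev/null; then
            echo "chunk $suffix differs"
            cmp -l "$piece" "$work/b.$suffix" 2>/dev/null
        fi
    done
    rm -rf "$work"
}
```

Note that this needs scratch space for a second copy of both files, and that the offsets cmp -l prints are relative to the start of each chunk and count from 1.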
I try before sending the disk off for disk recovery.
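For the bit-toggling idea from Namesys, a binary editor is not strictly needed; something along these lines could flip a single bit in place on a copy of the file. This is only a sketch: flip_bit is a made-up name, and it assumes the byte offset to patch is already known (for example from cmp -l).

```shell
#!/bin/sh
# flip_bit FILE OFFSET BIT  (hypothetical helper; run it on a copy only)
# Toggles bit BIT (0 = least significant) of the byte at 0-based OFFSET.
flip_bit() {
    file=$1
    offset=$2
    bit=$3
    # Read the current byte as an unsigned decimal number.
    old=$(dd if="$file" bs=1 skip="$offset" count=1 2>/dev/null | od -An -tu1 | tr -d ' \n')
    [ -n "$old" ] || return 1
    new=$(( old ^ (1 << bit) ))
    # Write the new byte back; conv=notrunc leaves the rest of the file alone.
    printf '%b' "\\0$(printf '%03o' "$new")" |
        dd of="$file" bs=1 seek="$offset" count=1 conv=notrunc 2>/dev/null
}
```

After each change, md5sum on the patched copy shows whether the checksum now matches. Keep in mind that cmp -l counts bytes from 1 while dd's skip and seek count from 0, so subtract one from the offset cmp reports.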
Anyway, thanks a lot for your time and thoughts. What a pain in the ass.
Yep, anyone wonder why people like RAID?
Support your local student.
Use IDE, then people might consider that... ;)
Mike