MSI K9N Neo: crash under heavy IDE read

Hello,

I bought a new Athlon64 motherboard with an AM2 socket, and I
recently noticed that copying large amounts of data reliably
crashes 64-bit 2.6.17.11 on it.

memtest ran on this machine overnight without errors.
The machine is not overclocked.

Copying movies from a SATA drive to a PATA drive oopses
after a few gigabytes have been transferred. Creating an iso image
with mkisofs (done entirely on the PATA drive, no SATA attached)
does the same.

After some testing I found out that a rw load crashes the
machine rather fast, while a read load usually runs for several
minutes before the crash. Setting udma4 or udma3 instead of udma5
doesn't help. It's a pity I don't have a SATA drive of my own to
test with, so I ran most of the tests on the PATA drive.

The rw loads crashed twice on the same instruction, movl (%rdx), %eax
in fs/mpage.o, and I tracked it down to the corresponding C line:

general protection fault: 0000 [1] PREEMPT
CPU 0
Modules linked in: nls_koi8_r nls_cp866 snd_pcm_oss snd_mixer_oss snd_hda_intel snd_hda_codec snd_pcm snd_timer snd soundcore
 snd_page_alloc ehci_hcd usb_storage usbcore nfsd exportfs autofs4 ip_conntrack_irc ip_conntrack_ftp ip_conntrack
Pid: 3396, comm: sync Not tainted 2.6.17.11_64 #1
RIP: 0010:[<ffffffff8018ff5b>] <ffffffff8018ff5b>{__mpage_writepage+206}
RSP: 0000:ffff810029119b08  EFLAGS: 00010287
RAX: 0000000000000000 RBX: ffff8100011174d8 RCX: ffff81007f744440
RDX: ffff010004fccb50 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffff810029119c58 R08: ffff810004fccaf0 R09: ffff810029119ea8
R10: 0000000000000000 R11: ffff810001117468 R12: ffff8100011174d8
R13: 0000000000000008 R14: ffffffff801fcef1 R15: ffff81004c85e6c0
FS:  0000000000000000(0000) GS:ffffffff806b4000(0000) knlGS:00000000f7e09ba0
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 000000000806a40f CR3: 0000000041824000 CR4: 00000000000006e0
Process sync (pid: 3396, threadinfo ffff810029118000, task ffff81007d8c4b20)
Stack: ffff810029119ea8 ffff810029119d64 ffff810029119d58 ffffffff801fce1f
       0000000900000008 ffff8100458e4848 ffff8100458e4700 0000000800000035
       ffff810029119bc8 ffff81007f744440
Call Trace: <ffffffff801fce1f>{fat_get_block+0} <ffffffff801fcef1>{fat_writepage+0}
       <ffffffff80190c8c>{mpage_writepages+561} <ffffffff801fcef1>{fat_writepage+0}
       <ffffffff801fce1f>{fat_get_block+0} <ffffffff801fceda>{fat_writepages+16}
       <ffffffff8015192a>{do_writepages+40} <ffffffff8018f1b0>{__writeback_single_inode+485}
       <ffffffff8018f65e>{sync_sb_inodes+473} <ffffffff8018f7f7>{sync_inodes_sb+142}
       <ffffffff8018f8c5>{__sync_inodes+158} <ffffffff8018f956>{sync_inodes+25}
       <ffffffff8016ee0a>{do_sync+26} <ffffffff8016ee57>{sys_sync+14}
       <ffffffff8011b12e>{ia32_sysret+0}

Code: 8b 02 a8 04 74 0a 0f 0b 68 12 ae 4c 80 c2 e9 01 8b 02 a8 20
RIP <ffffffff8018ff5b>{__mpage_writepage+206} RSP <ffff810029119b08>

objdump -d:
     170:       4c 89 c2                mov    %r8,%rdx
==>     173:    8b 02                   mov    (%rdx),%eax
     175:       a8 04                   test   $0x4,%al
     177:       74 0a                   je     183 <__mpage_writepage+0xde>
     179:       0f 0b                   ud2a

mpage.s:
        movq    $0, -232(%rbp)  #, boundary_block
        movq    $0, -224(%rbp)  #, boundary_bdev
        movq    %r8, %rdx       # head, bh
.L19:
==>     movl    (%rdx), %eax    #* bh, D.16458
/*
Corresponding part of mpage.c:
        if (page_has_buffers(page)) {
                struct buffer_head *head = page_buffers(page);
                struct buffer_head *bh = head;
                /* If they're all mapped and dirty, do it */
                page_block = 0;
                do {
asm("#just before movl (%rdx), %eax");
                        BUG_ON(buffer_locked(bh));
                        if (!buffer_mapped(bh)) {
*/
        testb   $4, %al #, D.16458
        je      .L20    #,


However, I'm afraid this may not be useful, because the read
load tests crash in random places. The reads are done with cat >/dev/null.
A few assorted traces (written down by hand):

unable to handle kernel paging request at ffffa5007fdff6c0
RIP: free_block+140

GP at reiserfs_releasepage+103

unable to handle kernel paging request at ffff81001a7289c0
RIP: ffff81001a7289c0 (so it jumped into nirvana...)
trace: sys_read+71 ia32_sysret+0
comm: cat

NULL deref 00000011 at __generic_file_aio_read+275
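
Since the read load crashes in random places, a read test that
also compares the data between passes might show whether data is
already being corrupted silently before the kernel trips over it.
A minimal sketch, not something that was run for this report: it
reads the same file several times, hashes it in fixed-size chunks
and reports chunks whose hash differs from the first pass. The
function names and the chunk size are arbitrary choices for
illustration.

```python
import hashlib

CHUNK = 1 << 20  # 1 MiB per hashed chunk; arbitrary choice

def chunk_digests(path, chunk=CHUNK):
    """Read the file sequentially and return one SHA-256 digest per chunk."""
    digests = []
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk)
            if not data:
                break
            digests.append(hashlib.sha256(data).hexdigest())
    return digests

def find_mismatches(path, passes=3, chunk=CHUNK):
    """Re-read the file `passes` times and compare against a first read.

    Returns a list of (pass_number, chunk_index) for every chunk whose
    digest differs; chunk_index is None if the number of chunks changed.
    """
    reference = chunk_digests(path, chunk)
    bad = []
    for n in range(1, passes + 1):
        current = chunk_digests(path, chunk)
        if len(current) != len(reference):
            bad.append((n, None))
        for i, (a, b) in enumerate(zip(reference, current)):
            if a != b:
                bad.append((n, i))
    return bad
```

It would be called as find_mismatches(path) on a large file on the
PATA drive. The file has to be larger than RAM, otherwise the later
passes are served from the page cache and never touch the disk. An
empty result only means that no difference was seen in these passes.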

Hardware info:
AMD Athlon64, socket AM2 @ 1800MHz
RAM: DDR2, 2 GB
lspci:
00:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a1)
00:01.0 ISA bridge: nVidia Corporation MCP55 LPC Bridge (rev a2)
00:01.1 SMBus: nVidia Corporation MCP55 SMBus (rev a2)
00:01.2 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a2)
00:02.0 USB Controller: nVidia Corporation MCP55 USB Controller (rev a1)
00:02.1 USB Controller: nVidia Corporation MCP55 USB Controller (rev a2)
00:04.0 IDE interface: nVidia Corporation MCP55 IDE (rev a1)
00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a2)
00:05.1 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a2)
00:06.0 PCI bridge: nVidia Corporation MCP55 PCI bridge (rev a2)
00:06.1 Audio device: nVidia Corporation MCP55 High Definition Audio (rev a2)
00:08.0 Bridge: nVidia Corporation MCP55 Ethernet (rev a2)
00:0b.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a2)
00:0c.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a2)
00:0d.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a2)
00:0e.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a2)
00:0f.0 PCI bridge: nVidia Corporation MCP55 PCI Express bridge (rev a2)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
06:00.0 VGA compatible controller: ATI Technologies Inc RV370 [Sapphire X550 Silent]
06:00.1 Display controller: ATI Technologies Inc RV370 secondary [Sapphire X550 Silent]

hdparm (switched to udma3 by hand; usually it is in udma5):
/dev/hda:
 Model=WDC WD2500JB-55GVC0, FwRev=08.02D08, SerialNo=WD-WCAL78337950
 Config={ HardSect NotMFM HdSw>15uSec SpinMotCtl Fixed DTR>5Mbs FmtGapReq }
 RawCHS=16383/16/63, TrkSize=57600, SectSize=600, ECCbytes=74
 BuffType=DualPortCache, BuffSize=8192kB, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=268435455
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes:  pio0 pio1 pio2 pio3 pio4
 DMA modes:  mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 *udma3 udma4 udma5
 AdvancedPM=no WriteCache=enabled
 Drive conforms to: device does not report version:
 * signifies the current active mode

Compressed dmesg and lspci -vvvxxx attached.
--
vda

Attachment: dmesg.bz2
Description: BZip2 compressed data

Attachment: lspci-vvvxxx.bz2
Description: BZip2 compressed data

