Re: 2.6.18-rc3-git3 - XFS - BUG: unable to handle kernel NULL pointer dereference at virtual address 00000078

On 11/08/06, Jesper Juhl <[email protected]> wrote:

On 11/08/06, Nathan Scott <[email protected]> wrote:
> On Thu, Aug 10, 2006 at 01:31:35PM +0200, Jesper Juhl wrote:
> > On 08/08/06, Nathan Scott <[email protected]> wrote:
> > ...
> > Ok, I booted the server with 2.6.18-rc4 + your patch. Things went well
> > for ~3 hours and then blew up - not in the same way though.
> >
> > The machine was under pretty heavy load recieving data via rsync when
> > the following happened :
> >
> > Filesystem "dm-51": XFS internal error xfs_trans_cancel at line 1138
> > of file fs/xfs/xfs_trans.c.  Caller 0xc0210e3f
> >  [<c0103a3c>] show_trace_log_lvl+0x152/0x165
> >  [<c0103a5e>] show_trace+0xf/0x13
> >  [<c0103b59>] dump_stack+0x15/0x19
> >  [<c0213474>] xfs_trans_cancel+0xcf/0xf8
> >  [<c0210e3f>] xfs_rename+0x64d/0x936
> >  [<c0226286>] xfs_vn_rename+0x48/0x9f
> >  [<c016584e>] vfs_rename_other+0x99/0xcb
> >  [<c0165a36>] vfs_rename+0x1b6/0x1eb
> >  [<c0165bda>] do_rename+0x16f/0x193
> >  [<c0165c45>] sys_renameat+0x47/0x73
>
> Thanks Jesper.  Hmm, lessee - this is a cancelled dirty rename
> transaction ... could be ondisk dir2 corruption (any chance this
> filesystem was affected by 2.6.17's endian bug?)

No. The machine in question never ran any 2.6.17.* kernels. Its old
kernel was 2.6.11.11 (UP), then I tried 2.6.18-rc3-git3 (SMP) as
previously reported, then I tried 2.6.18-rc4 + your XFS patch.

>, or something
> else entirely.  No I/O errors in the system log earlier or anything
> like that?
>
No I/O errors in the logs that I could find, no.

> > I was doing an lvmextend +xfs_resize of a different (XFS) filesystem
> > on the same server at roughly the same time. But I'm not sure if
> > that's related.
>
> That wont be related, no.
>
> > I'm currently running xfs_repair on the fs that blew up.
>
> OK, I'd be interested to see if that reported any directory (or
> other) issues.
>
It did not.

What happened was this (didn't save the output sorry, so the below is
from memory) ;
When I ran xfs_repair it first asked me to mount the filesystem to
replay the log, then unmount it again, then run xfs_repair again. I
did that. No errors during that mount or umount.
Then, when I ran xfs_repair again it ran through phases 1..n (spending
aproximately 1hour on this) without any messages saying that something
was wrong, so when it was done I tried mounting the fs again and it
said it did a mount of a clean fs.
It's been running fine since.

Btw, I have left the machine running with the 2.6.18-rc4 kernel and it
can keep running that for ~11hrs more (I'll ofcourse let you know if
errors show up during the night), then I have to reboot it back to the
2.6.11.11 kernel that it is stable with and it will need to run that
until ~wednesday next week before I can do further experiments.
So, if you want me to test any patches I'll need to recieve them
within the next 10 or so hours if I'm to have a chance to run with
them for a few hours tomorrow - otherwise testing will have to wait
for next week.

--
Jesper Juhl <[email protected]>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please      http://www.expita.com/nomime.html
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: 2.6.18-rc3-git3 - XFS - BUG: unable to handle kernel NULL pointer dereference at virtual address 00000078
  - From: "Jesper Juhl" <[email protected]>

References:
- 2.6.18-rc3-git3 - XFS - BUG: unable to handle kernel NULL pointer dereference at virtual address 00000078
  - From: "Jesper Juhl" <[email protected]>
- Re: 2.6.18-rc3-git3 - XFS - BUG: unable to handle kernel NULL pointer dereference at virtual address 00000078
  - From: "Tony.Ho" <[email protected]>
- Re: 2.6.18-rc3-git3 - XFS - BUG: unable to handle kernel NULL pointer dereference at virtual address 00000078
  - From: Nathan Scott <[email protected]>
- Re: 2.6.18-rc3-git3 - XFS - BUG: unable to handle kernel NULL pointer dereference at virtual address 00000078
  - From: Nathan Scott <[email protected]>
- Re: 2.6.18-rc3-git3 - XFS - BUG: unable to handle kernel NULL pointer dereference at virtual address 00000078
  - From: Nathan Scott <[email protected]>
- Re: 2.6.18-rc3-git3 - XFS - BUG: unable to handle kernel NULL pointer dereference at virtual address 00000078
  - From: Nathan Scott <[email protected]>

Prev by Date: Re: kernel panic:BUG in cascade at kernel/timer.c
Next by Date: Re: [PATCH 00/12] ThinkPad embedded controller and hdaps drivers (version 2)
Previous by thread: Re: 2.6.18-rc3-git3 - XFS - BUG: unable to handle kernel NULL pointer dereference at virtual address 00000078
Next by thread: Re: 2.6.18-rc3-git3 - XFS - BUG: unable to handle kernel NULL pointer dereference at virtual address 00000078
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]