Hi Jens!

On 24 Apr 2007, at 14:32, Jens Axboe wrote:
> On Tue, Apr 24 2007, Roland Kuhn wrote:
> > Hi Jens!
> >
> > [I made a typo in the Cc: list so that lkml is only included as of now. Actually I copied the typo from you ;-) ]
>
> Well no, you started the typo, I merely propagated it and forgot to fix it up :-)

Actually, I copied it from your printk() ;-) (thinking helps...)

> > Sure. You might want to include NFS file access into your tests, since we've not triggered this with locally accessing the disks. BTW:
>
> How are you exporting the directory (what exports options) - how is it mounted by the client(s)? What chunksize is your raid6 using?
>
> > > And what are the nature of the files on the raid (huge, small, ?) and what are the client(s) doing? Just approximately, I know these things can be hard/difficult/impossible to specify.
> >
> > The files are 100-400MB in size and the client is merging them into a new file in the same directory using the ROOT library, which does in essence alternating sequences of
> >
> > _llseek(somewhere)
> > read(n bytes)
> > _llseek(somewhere+n)
> > read(m bytes)
> > ...
> >
> > and then
> >
> > _llseek(somewhere)
> > rt_sigaction(ignore INT)
> > write(n bytes)
> > rt_sigaction(INT->DFL)
> > time()
> > _llseek(somewhere+n)
> > ...
> >
> > where n is of the order of 30kB. The input files are treated sequentially, not randomly.
>
> Ok, I'll see if I can reproduce it. No luck so far, I'm afraid.

Too bad.
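
In case it helps with reproducing, the access pattern described above can be approximated without ROOT. What follows is only a rough sketch under my own assumptions: the function name `merge_files`, the fixed 30 kB chunk and the strict read-then-write alternation are all made up for illustration, and the rt_sigaction() and time() calls are left out. It is not what ROOT actually does internally.

```python
import os

CHUNK = 30 * 1024  # assumed request size, "of the order of 30kB"


def merge_files(inputs, output, chunk=CHUNK):
    """Append each input to output using explicit seek+read / seek+write pairs.

    Hypothetical helper: it only imitates the syscall pattern, not ROOT's
    on-disk format. Returns the number of bytes written.
    """
    out_fd = os.open(output, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    out_pos = 0
    try:
        for path in inputs:  # inputs are processed one after another
            in_fd = os.open(path, os.O_RDONLY)
            in_pos = 0
            try:
                while True:
                    # _llseek(somewhere); read(n bytes)
                    os.lseek(in_fd, in_pos, os.SEEK_SET)
                    data = os.read(in_fd, chunk)
                    if not data:
                        break  # end of this input file
                    in_pos += len(data)

                    # _llseek(somewhere); write(n bytes)
                    os.lseek(out_fd, out_pos, os.SEEK_SET)
                    view = memoryview(data)
                    while len(view) > 0:  # os.write may write less than asked
                        written = os.write(out_fd, view)
                        view = view[written:]
                        out_pos += written
            finally:
                os.close(in_fd)
    finally:
        os.close(out_fd)
    return out_pos
```

To get anywhere near the failing setup, the inputs and the output would have to live in the same directory on the NFS mount, with several 100-400MB inputs per run.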

> > BTW: the machine just stopped dead, no sign whatsoever on console or netconsole, so I rebooted with elevator=deadline (need to get some work done besides ;-) )
>
> Unfortunately expected, if we can race and lose an update to ->next_rq, we can race and corrupt some of the internal data structures as well. If you have the time and inclination, it would be interesting to see if you can reproduce with some debugging options enabled:
>
> - Enable all preempt, spinlock and lockdep debugging measures
> - Possibly slab poisoning, although that may slow you down somewhat

Kernel compilation under way, will report back.
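
For the record, a quick way to double-check that the debugging options really ended up in the new .config. This is a sketch only: the symbol list in `WANTED` is my own guess at what "preempt, spinlock and lockdep debugging" and "slab poisoning" map to in a 2.6.21-era tree, and the helper names are made up, so adjust the list to whatever your Kconfig actually offers.

```python
# Assumed mapping of the suggested debugging measures to Kconfig symbols;
# the exact set depends on kernel version and architecture.
WANTED = [
    "CONFIG_DEBUG_PREEMPT",
    "CONFIG_DEBUG_SPINLOCK",
    "CONFIG_DEBUG_SPINLOCK_SLEEP",
    "CONFIG_PROVE_LOCKING",
    "CONFIG_DEBUG_LOCK_ALLOC",
    "CONFIG_DEBUG_SLAB",
]


def enabled_options(config_text):
    """Return the set of symbols that are set to 'y' in .config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and '# CONFIG_FOO is not set' comments
        name, sep, value = line.partition("=")
        if sep and value == "y":
            enabled.add(name)
    return enabled


def missing_debug_options(config_text, wanted=WANTED):
    """Return the wanted symbols that are not enabled, in the order given."""
    enabled = enabled_options(config_text)
    return [name for name in wanted if name not in enabled]
```

Feeding it the contents of the .config (e.g. `missing_debug_options(open(".config").read())`) should give an empty list once everything is switched on.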

> Are you using 4kb stacks?

No idea, 'grep -i stack .config' gives no indication, but ISTR that 4k was made the default some time back?
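
A grep that prints nothing cannot tell "option switched off" from "option not offered at all", which may be why it gives no indication here. Below is a small sketch that separates the three cases. It assumes the symbol in question is `CONFIG_4KSTACKS` (the i386 name, as far as I know); `config_state` is a made-up helper name.

```python
import re


def config_state(config_text, symbol):
    """Classify one Kconfig symbol in .config text.

    Returns 'set' for 'SYMBOL=...', 'not set' for '# SYMBOL is not set',
    and 'absent' when the symbol does not appear in either form.
    """
    name = re.escape(symbol)
    if re.search(r"^%s=" % name, config_text, re.MULTILINE):
        return "set"
    if re.search(r"^# %s is not set$" % name, config_text, re.MULTILINE):
        return "not set"
    return "absent"
```

An "absent" result would suggest that the architecture or kernel version does not offer the option, in which case the stack size is whatever that architecture hardwires.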

Ciao,

Roland

--
TU Muenchen, Physik-Department E18, James-Franck-Str., 85748 Garching
Telefon 089/289-12575; Telefax 089/289-12570
--
CERN office: 892-1-D23 phone: +41 22 7676540 mobile: +41 76 487 4482
--
Any society that would give up a little liberty to gain a little
security will deserve neither and lose both. - Benjamin Franklin
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GS/CS/M/MU d-(++) s:+ a-> C+++ UL++++ P+++ L+++ E(+) W+ !N K- w--- M + !V Y+
PGP++ t+(++) 5 R+ tv-- b+ DI++ e+++>++++ h---- y+++
------END GEEK CODE BLOCK------