Hi,

> +x86_64-fix-config_preempt.patch
>
> x86_64-fix-config_preempt.patch
>   x86_64: Fix CONFIG_PREEMPT

Has this one been stress-tested? I've got the impression that things have
become a lot worse. I've been seeing things like these:

Mar 25 01:00:48 websrv2 REISERFS: panic (device dm-1): clm-6000: do_balance, fs generation has changed
Mar 25 01:00:48 websrv2
Mar 25 01:00:48 websrv2 ----------- [cut here ] --------- [please bite here ] ---------
Mar 25 01:00:48 websrv2 Kernel BUG at prints:362
Mar 25 01:00:48 websrv2 invalid operand: 0000 [1] PREEMPT
Mar 25 01:00:48 websrv2 CPU 0
Mar 25 01:00:48 websrv2 Modules linked in: iptable_nat ipt_MARK iptable_mangle ipt_LOG ipt_multiport ipt_owner ipt_mark ipt_state ipt_REJECT iptable_filter ip_tables twofish serpent blowfish ext3 jbd reiser4 sha256 aes dm_crypt ip_conntrack_irc ip_conntrack_ftp ip_conntrack via_rhine 8139too crc32
Mar 25 01:00:48 websrv2 Pid: 25172, comm: rm Not tainted 2.6.12-rc1-cs1
Mar 25 01:00:48 websrv2 RIP: 0010:[<ffffffff801cfe13>] <ffffffff801cfe13>{reiserfs_panic+211}
Mar 25 01:00:48 websrv2 RSP: 0018:ffff81001efe37b8 EFLAGS: 00010292
Mar 25 01:00:48 websrv2 RAX: 0000000000000059 RBX: ffffffff803fbcac RCX: 00000000c0000100
Mar 25 01:00:48 websrv2 RDX: 0000000000000000 RSI: ffff81007d0b31f0 RDI: 00000000ffffffff
Mar 25 01:00:48 websrv2 RBP: ffff81004f960060 R08: ffff81001efe2000 R09: 0000000000000002
Mar 25 01:00:48 websrv2 R10: 00000000ffffffff R11: ffffffff80340ef0 R12: ffff81007f850230
Mar 25 01:00:48 websrv2 R13: ffff81007f850000 R14: 0000000000000000 R15: ffff81004f9565d0
Mar 25 01:00:48 websrv2 FS: 00002aaaaaabaae0(0000) GS:ffffffff805be800(0000) knlGS:0000000055563dc0
Mar 25 01:00:48 websrv2 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Mar 25 01:00:48 websrv2 CR2: 00002aaaaaaff008 CR3: 000000001ebbd000 CR4: 00000000000006e0
Mar 25 01:00:48 websrv2 Process rm (pid: 25172, threadinfo ffff81001efe2000, task ffff81007d0b31f0)
Mar 25 01:00:48 websrv2 Stack: 0000003000000010 ffff81001efe38a8 ffff81001efe37d8 ffff81001c041530
Mar 25 01:00:48 websrv2        ffff81001efe39d8 ffffffff801d4e42 ffff81007e659a00 0000000000000063
Mar 25 01:00:48 websrv2        0000000000000063 0000000000000000
Mar 25 01:00:48 websrv2 Call Trace:<ffffffff801d4e42>{pathrelse_and_restore+66} <ffffffff8010efe6>{retint_kernel+46}
Mar 25 01:00:48 websrv2        <ffffffff801bb847>{do_balance+39} <ffffffff801bd315>{do_balance+6901}
Mar 25 01:00:48 websrv2        <ffffffff801cbd90>{unfix_nodes+128} <ffffffff801be15b>{do_balance+10555}
Mar 25 01:00:48 websrv2        <ffffffff801d7bf9>{reiserfs_cut_from_item+1673} <ffffffff801bfcfa>{reiserfs_unlink+362}
Mar 25 01:00:48 websrv2        <ffffffff801873ae>{vfs_unlink+462} <ffffffff801874f9>{sys_unlink+233}
Mar 25 01:00:48 websrv2        <ffffffff8018a268>{sys_getdents+232} <ffffffff8010f221>{error_exit+0}
Mar 25 01:00:48 websrv2        <ffffffff8010e906>{system_call+126}
Mar 25 01:00:48 websrv2
Mar 25 01:00:48 websrv2 Code: 0f 0b b8 c1 3f 80 ff ff ff ff 6a 01 4d 85 ed 48 c7 c2 40 ba
Mar 25 01:00:48 websrv2 RIP <ffffffff801cfe13>{reiserfs_panic+211} RSP <ffff81001efe37b8>

or

Mar 25 16:39:21 websrv2 VFS: brelse: Trying to free free buffer
Mar 25 16:39:21 websrv2 Badness in __brelse at fs/buffer.c:1295
Mar 25 16:39:21 websrv2
Mar 25 16:39:21 websrv2 Call Trace:<ffffffff8017787f>{__find_get_block+479} <ffffffff8017a175>{__getblk+37}
Mar 25 16:39:21 websrv2        <ffffffff801de3d5>{do_journal_end+2181} <ffffffff80147d70>{keventd_create_kthread+0}
Mar 25 16:39:21 websrv2        <ffffffff801cbf50>{reiserfs_sync_fs+64} <ffffffff8017c0b3>{sync_supers+211}
Mar 25 16:39:21 websrv2        <ffffffff8015a22a>{wb_kupdate+42} <ffffffff8015ae8f>{pdflush+399}
Mar 25 16:39:21 websrv2        <ffffffff8015a200>{wb_kupdate+0} <ffffffff80147d70>{keventd_create_kthread+0}
Mar 25 16:39:21 websrv2        <ffffffff8015ad00>{pdflush+0} <ffffffff80147d2d>{kthread+205}
Mar 25 16:39:21 websrv2        <ffffffff8010f3d7>{child_rip+8} <ffffffff80147d70>{keventd_create_kthread+0}
Mar 25 16:39:21 websrv2        <ffffffff80147c60>{kthread+0} <ffffffff8010f3cf>{child_rip+0}

Fortunately the kernel locked up and there was no data corruption.

I've got PREEMPT and PREEMPT_BKL enabled under UP.

I just took a look at the change and found this:

x86-64 does this (in entry.S):

	bt   $9,EFLAGS-ARGOFFSET(%rsp)	/* interrupts off? */
	jnc  retint_restore_args
	movl $PREEMPT_ACTIVE,threadinfo_preempt_count(%rcx)
	sti
	call schedule
	cli
	GET_THREAD_INFO(%rcx)
	movl $0,threadinfo_preempt_count(%rcx)
	jmp  exit_intr

while i386 does this:

	testl $IF_MASK,EFLAGS(%esp)	# interrupts off (exception path) ?
	jz   restore_all
	call preempt_schedule_irq
	jmp  need_resched

preempt_schedule_irq is not an i386-specific function and seems to take
special care of BKL preemption. Since reiserfs does use the BKL to do
certain things, I think this might actually be the problem...?

I'm not saying that this fix is wrong (it is obviously the right fix), but
it causes another problem to show up. Unfortunately I don't have an amd64
machine to play with, so can somebody please check this?
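
To make the suspicion concrete, here is a minimal user-space sketch of the difference between the two entry paths quoted above. It is a toy model, not kernel code: `struct toy_task`, `toy_schedule` and both `toy_preempt_*` functions are hypothetical names, the `PREEMPT_ACTIVE` value is only illustrative, and it rests on two assumptions: that with PREEMPT_BKL `schedule()` drops the BKL for a task whose `lock_depth` is >= 0, and that `preempt_schedule_irq` hides `lock_depth` around the call so that an involuntary preemption keeps the lock.

```c
#include <assert.h>
#include <stdbool.h>

#define PREEMPT_ACTIVE 0x10000000 /* illustrative value only */

/* Toy stand-in for the per-task state involved (hypothetical). */
struct toy_task {
    int lock_depth;    /* -1: BKL not held, >= 0: BKL held */
    int preempt_count;
    bool bkl_dropped;  /* set when the toy scheduler releases the BKL */
};

/* Assumption: the scheduler releases the BKL of a task that appears to hold it. */
static void toy_schedule(struct toy_task *t)
{
    if (t->lock_depth >= 0)
        t->bkl_dropped = true;
}

/* Mirrors the x86-64 sequence: only preempt_count is touched. */
static void toy_preempt_x86_64(struct toy_task *t)
{
    t->preempt_count = PREEMPT_ACTIVE;
    toy_schedule(t);
    t->preempt_count = 0;
}

/* Mirrors what preempt_schedule_irq is assumed to do: hide lock_depth. */
static void toy_preempt_schedule_irq(struct toy_task *t)
{
    int saved_lock_depth = t->lock_depth;

    assert(t->preempt_count == 0); /* caller must not already be non-preemptible */
    t->preempt_count += PREEMPT_ACTIVE;
    t->lock_depth = -1;            /* scheduler no longer sees the BKL as held */
    toy_schedule(t);
    t->lock_depth = saved_lock_depth;
    t->preempt_count -= PREEMPT_ACTIVE;
}
```

In this model, a task with `lock_depth == 0` that goes through `toy_preempt_x86_64` ends up with `bkl_dropped` set, while the same task going through `toy_preempt_schedule_irq` keeps the lock. If the real code behaves like that, reiserfs could lose the BKL in the middle of do_balance, which would fit the "fs generation has changed" panic.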