Re: .17rc5 cfq slab corruption.

On Tue, May 30, 2006 at 03:17:28PM +0200, Jens Axboe wrote:
 > On Sat, May 27 2006, Dave Jones wrote:
 > > On Sat, May 27, 2006 at 09:07:24AM +0200, Jens Axboe wrote:
 > >  > On Fri, May 26 2006, Andrew Morton wrote:
 > >  > > Dave Jones <[email protected]> wrote:
 > >  > > >
 > >  > > > Was playing with Google's new Picasa toy, which hammered the disks
 > >  > > > hunting out every image file it could find, when this popped out:
 > >  > > > 
 > >  > > > Slab corruption: (Not tainted) start=ffff810012b998c8, len=168
 > >  > > > Redzone: 0x5a2cf071/0x5a2cf071.
 > >  > > > Last user: [<ffffffff8032c319>](cfq_free_io_context+0x2f/0x74)
 > >  > > > 090: 10 bd 28 1b 00 81 ff ff 6b 6b 6b 6b 6b 6b 6b 6b
 > >  > > > Prev obj: start=ffff810012b99808, len=168
 > >  > > > Redzone: 0x5a2cf071/0x5a2cf071.
 > >  > > > Last user: [<ffffffff8032c319>](cfq_free_io_context+0x2f/0x74)
 > >  > > > 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
 > >  > > > 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
 > >  > > > Next obj: start=ffff810012b99988, len=168
 > >  > > > Redzone: 0x5a2cf071/0x5a2cf071.
 > >  > > > Last user: [<ffffffff8032c319>](cfq_free_io_context+0x2f/0x74)
 > >  > > > 000: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
 > >  > > > 010: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
 > >  > 
 > >  > Pretty baffling... cfq has been hammered pretty thoroughly over the
 > >  > last months and _nothing_ has shown up except some performance anomalies
 > >  > that are now fixed. Since Dave's case (at least) seems to be
 > >  > use-after-free, I'll see if I can reproduce with some contrived case.
 > >  > I'm assuming that picasa forks and exits a lot with submitted io in
 > >  > between that may not have finished at exit.
 > > 
 > > The second time I hit it was actually during boot-up.
 > 
 > Dave, do you have any io scheduler switching going on?

Here's something interesting (possibly unrelated).
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=193534

I added this patch to our devel kernel (based on 17rc5-git5 right now).

It's similar to the list_head debugging patch from -mm:

--- linux-2.6.12/include/linux/list.h~	2005-08-08 15:34:50.000000000 -0400
+++ linux-2.6.12/include/linux/list.h	2005-08-08 15:35:22.000000000 -0400
@@ -5,7 +5,9 @@
 
 #include <linux/stddef.h>
 #include <linux/prefetch.h>
+#include <linux/kernel.h>
 #include <asm/system.h>
+#include <asm/bug.h>
 
 /*
  * These are non-NULL pointers that will result in page faults
@@ -52,6 +52,16 @@ static inline void __list_add(struct lis
 			      struct list_head *prev,
 			      struct list_head *next)
 {
+	if (next->prev != prev) {
+		printk("List corruption. next->prev should be %p, but was %p\n",
+				prev, next->prev);
+		BUG();
+	}
+	if (prev->next != next) {
+		printk("List corruption. prev->next should be %p, but was %p\n",
+				next, prev->next);
+		BUG();
+	}
 	next->prev = new;
 	new->next = next;
 	new->prev = prev;
@@ -162,6 +162,16 @@ static inline void __list_del(struct lis
  */
 static inline void list_del(struct list_head *entry)
 {
+	if (entry->prev->next != entry) {
+		printk("List corruption. prev->next should be %p, but was %p\n",
+				entry, entry->prev->next);
+		BUG();
+	}
+	if (entry->next->prev != entry) {
+		printk("List corruption. next->prev should be %p, but was %p\n",
+				entry, entry->next->prev);
+		BUG();
+	}
 	__list_del(entry->prev, entry->next);
 	entry->next = LIST_POISON1;
 	entry->prev = LIST_POISON2;
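
For anyone who wants to see the checks above fire outside the kernel, here is
a minimal userspace sketch of the same idea. It is not the patch itself: the
names checked_list_add() and demo_detects_corruption() are hypothetical,
fprintf() stands in for printk(), and an error return stands in for BUG().
The stray write in the demo only imitates what a use-after-free might do to a
neighbouring list entry.

```c
#include <stdio.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/*
 * Hypothetical helper: insert 'entry' right after 'head', but first verify
 * that the two neighbours still point at each other, as the patch does in
 * __list_add(). Returns 0 on success, -1 if the list looks corrupted
 * (the kernel patch calls BUG() at that point instead).
 */
static int checked_list_add(struct list_head *entry, struct list_head *head)
{
	struct list_head *prev = head;
	struct list_head *next = head->next;

	if (next->prev != prev) {
		fprintf(stderr,
			"List corruption. next->prev should be %p, but was %p\n",
			(void *)prev, (void *)next->prev);
		return -1;
	}
	if (prev->next != next) {
		fprintf(stderr,
			"List corruption. prev->next should be %p, but was %p\n",
			(void *)next, (void *)prev->next);
		return -1;
	}

	next->prev = entry;
	entry->next = next;
	entry->prev = prev;
	prev->next = entry;
	return 0;
}

/*
 * Hypothetical demo: build a one-entry list, scribble over the entry's
 * prev pointer, then try another insert. Returns 1 if the second insert
 * was refused, 0 otherwise.
 */
static int demo_detects_corruption(void)
{
	struct list_head head = { &head, &head };
	struct list_head a, b, stray;

	if (checked_list_add(&a, &head) != 0)
		return 0;

	a.prev = &stray;	/* simulated stray write */

	return checked_list_add(&b, &head) == -1;
}
```

Calling demo_detects_corruption() from a main() should hit the first check,
since head.next->prev no longer points back at head. Note that the check only
says the list is already inconsistent at insert time; it does not say who
wrote the bad pointer.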


And then it turned up this:

List corruption. next->prev should be f74a5e2c, but was ea7ed31c
Pointing at cfq_set_request.

Now, *anything* could have corrupted that list, not necessarily cfq,
but it's something of a coincidence.

		Dave

-- 
http://www.codemonkey.org.uk