On Thu, 2006-08-10 at 12:23 -0700, Andrew Morton wrote:
> On Thu, 10 Aug 2006 11:51:34 -0700
> Badari Pulavarty <[email protected]> wrote:
>
> > Andrew Morton wrote:
> > > Also, JBD is presently feeding into submit_bh() buffer_heads which span two
> > > machine pages, and some device drivers spit the dummy. It'd be better to
> > > fix that once, rather than twice..
> > >
> > Andrew,
> >
> > I looked at this few days ago. I am not sure how we end up having
> > multiple pages (especially,
> > why we end up having buffers with bh_size > pagesize) ? Do you know why ?
> >
>
> It's one or both of the jbd_kmalloc(bh->b_size) calls in
> fs/jbd/transaction.c. Here we're allocating data to attach to a bh which
> later gets fed into submit_bh().
>
> Problem is, with CONFIG_DEBUG_SLAB=y, the data which kmalloc() returns can
> be offset by 4 bytes due to redzoning.
>
> Example: if the fs is using a 1k blocksize and we have a 4k pagesize, the
> data coming back from kmalloc may have an address of 0xnnnnxc04, so the
> data which we later feed into submit_bh() will span two pages.
>
> A simple fix would be to replace kmalloc() with a call to alloc_page().
> We'd need to work out how much memory that will worst-case-waste. If "not
> much" then OK.
>
> If "quite a lot in the worst case" then we'd need something more elaborate.
Would some like this be okay:
#ifdef CONFIG_DEBUG_SLAB
return alloc_page(...
#else
return kmalloc(...
#endif
This keeps it simple, and should be still be efficient in the
non-DEBUG_SLAB case.
> I'd suggest that ext3 implement ext3-private slab caches of size 1024,
> 2048, 4096 and perhaps 8192. Those caches should be kmem_cache_create()d
> on-demand at mount-time. They should be created with appropriate slab
> options to defeat the redzoning. The transaction.c code should use the
> appropriate slab (based on b_size) rather than using kmalloc(). The
> up-to-four private slab caches should be destroyed on ext3 rmmod.
--
David Kleikamp
IBM Linux Technology Center
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]