Re: [PATCH] Check for compound pages in set_page_dirty()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 19 2007, Jens Axboe wrote:
> On Wed, Jul 18 2007, Hugh Dickins wrote:
> > On Wed, 18 Jul 2007, Jens Axboe wrote:
> > > 
> > > Since I had my hands dirty already...
> > 
> > Great, thanks.  (There's also such a test in fs/nfs/direct.c,
> > but let's not trouble Trond until we've settled what to do here.)
> > 
> > > 
> > > ---
> > > 
> > > [PATCH] Remove PageCompound() checks before calling set_page_dirty()
> > > 
> > > Pre commit 41d78ba55037468e6c86c53e3076d1a74841de39 it was illegal
> > > to call set_page_dirty() on a compound page, since it stored the
> > > destructor in the mapping field. But now it's ok, so remove the
> > > ugly PageCompound() checks from bio and direct-io.
> > > 
> > > Signed-off-by: Jens Axboe <[email protected]>
> > 
> > I was about to Ack that, now that I've found something or other in the
> > libhugetlb testsuite comes this way, even on page[1], without showing
> > any problem.
> > 
> > However, I have noticed a particular inefficiency arising: that
> > bio_check_pages_dirty test specifically avoids pages already
> > PageDirty; but hugetlbfs_set_page_dirty carefully redirects to
> > set the head page dirty: so tail pages of a hugetlb compound page
> > will tend never to be PageDirty, and keep on coming back this way.
> > 
> > Which led me to look up the origin of those PageCompound tests:
> > Author: Andrew Morton <[email protected]>
> > Date:   Sun Sep 21 01:42:22 2003 -0700
> > 
> >     [PATCH] Speed up direct-io hugetlbpage handling
> >     
> >     This patch short-circuits all the direct-io page dirtying logic for
> >     higher-order pages.  Without this, we pointlessly bounce BIOs up to keventd
> >     all the time.
> > 
> > diff --git a/fs/bio.c b/fs/bio.c
> > index d016523..2463163 100644
> > --- a/fs/bio.c
> > +++ b/fs/bio.c
> > @@ -532,6 +532,12 @@ void bio_unmap_user(struct bio *bio, int write_to_vm)
> >   * check that the pages are still dirty.   If so, fine.  If not, redirty them
> >   * in process context.
> >   *
> > + * We special-case compound pages here: normally this means reads into hugetlb
> > + * pages.  The logic in here doesn't really work right for compound pages
> > + * because the VM does not uniformly chase down the head page in all cases.
> > + * But dirtiness of compound pages is pretty meaningless anyway: the VM doesn't
> > + * handle them at all.  So we skip compound pages here at an early stage.
> > ...
> > 
> > It looks like I was wrong in thinking it was just trying to avoid 
> > the crash on page[1].mapping.  At the least, your patch needs also
> > to remove that paragraph of comment from Andrew.  But really, it
> > looks like those PageCompound tests should stay, unless you can
> > persuade Andrew to Ack their removal.
> > 
> > Except (now, how many times can I change my mind in the course of
> > one email?), hugetlbfs_set_page_dirty was specifically added by
> > Ken Chen to avoid losing data via /proc/sys/vm/drop_caches.  Yet
> > fs/bio.c is carefully avoiding going there when dirtying a hugepage.
> > How does this work?  Looks like those PageCompound tests need to go!
> 
> Hehe, that didn't really get us much further, did it? :-)
> 
> My opinion is that since the win is marginal at best, we want to remove
> such tests as it just clutters up the code. And it's definitely not
> obvious why the tests are there, since they are not commented at all.
> Since it's even confusing you, then we can't expect the more vm ignorant
> of us (which definitely includes me) to grasp it!
> 
> > I'm lost: I hope Andrew and Ken can sort it out for us.
> 
> Posting a revised version, still leaving nfs out of it (I'll ping Trond
> to do the same, if this goes in).

FWIW, I ran a hugepage+O_DIRECT read test, that will cause the direct-io
code to hit set_page_dirty() for a compound page - and it works fine for
me. The fio job file used was:

[global]
directory=/data1
bs=4k
direct=1
hugepage-size=4m
iomem=shmhuge

[iothread]
filename=testfile
size=128m
rw=read

Test box is a lowly x86.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux