Re: [00/17] Large Blocksize Support V3

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Christoph Hellwig <[email protected]> writes:

> On Thu, Apr 26, 2007 at 04:50:06PM +1000, Nick Piggin wrote:
>> Improving the buffer layer would be a good way. Of course, that is
>> a long and difficult task, so nobody wants to do it.
>
> It's also a stupid idea.  We got rid of the buffer layer because it's
> a complete pain in the ass, and now you want to reintroduce one that's
> even more complex, and most likely even slower than the elegant solution?

No. I'm really suggesting improving the translation from BIO's 
to the page cache.  A set of helper functions.

This patch is suggesting we move to a BSD like buffer cache, except
that everything is physically mapped.

My most practical suggestion is to have support code so that you can
do all of the locking (that I/O cares about) on the first page of a
page group in the page cache.  You don't need larger physical pages to
do that.

>> Well, for those architectures (and this would solve your large block
>> size and 16TB pagecache size without any core kernel changes), you
>> can manage 1<<order hardware ptes as a single Linux pte. There is
>> nothing that says you must implement PAGE_SIZE as a single TLB sized
>> page.
>
> Well, ppc64 can do that.  And guess what, it really painfull for a lot
> of workloads.  Think of a poor ps3 with 256 from which the broken hypervisor
> already takes a lot away and now every file in the pagecache takes
> 64k, every thread stack takes 64k, etc?  It's good to have variable
> sized objects in places where it makes sense, and the pagecache is
> definitively one of them.

Agreed the page cache is all about variable sized objects known as files!
You don't need to do anything extra.  The problem is only with building
I/O requests from what is there.

Iff we really the larger physical page size to support the hardware
then it makes sense to go down a path of larger pages.  But it doesn't.

There is also a more fundamental reasons this patch is silly.  It assumes
that there is a trivial mapping between filesystems (the primary target
of the page cache and blocks on disk).  Now I admit this is common but
there is no reason to supposed it is true, and this patch appears to
exacerbate things.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux