Re: [rfc][patch] remove racy sync_page?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Neil Brown wrote:
On Tuesday May 30, [email protected] wrote:

Nick Piggin wrote:


For workloads where plugging helps (ie. lots of smaller, contiguous
requests going into the IO layer), the request pattern should be
pretty good without plugging these days, due to multiple page
readahead and writeback.


Can I please put in a vote for not thinking that every device is disk
drive?

I find plugging fairly important for raid5, particularly for write.

The more whole-stripe writes I can get, the better throughput I get.
So I tend to keep a raid5 array plugged while any requests are
arriving, and interpret 'plugged' to mean that incomplete stripes
don't get processed while full stripes (needing no pre-reading) do get
processed.

The only way "large requests" are going to replace plugging is they
are perfectly aligned, which I don't expect to ever see.

Fair enough, thanks for the input. I was more imagining that IO tends
to come down in decent chunks, but obviously that's still not sufficient
for some. OK.


As for your original problem.... I wonder if PG_locked is protecting
too much?  It protects against IO and it also protects against ->mapping
changes.  So if you want to ensure that ->mapping won't change, you
need to wait for any pending read request to finish, which seems a bit
dumb.

I don't think that is the problem. set_page_dirty_lock is really
unlikely to get held up on read IO: that'd mean there were two things
writing into that page at the same time.

>
Maybe we need a new bit: PG_maplocked.  You are only allowed to change
->mapping or ->index of you hold PG_locked and PG_maplocked, you are
not allowed to wait for PG_locked while holding PG_maplocked, and
you can read ->mapping or ->index while PG_locked or PG_maplocked are
held.
Think of PG_locked like a mutex and PG_maplocked like a spinlock (and
probably use bit_spinlock to get it).

Well the original problem is fixed by not doing the sync_page thing in
set_page_dirty_lock. Is there any advantage to having another bit?
Considering a) it will be very unlikely that a page is locked at the
same time one would like to dirty it; and b) that would seem to imply
adding extra atomic ops and barriers to reclaim and truncate (maybe
others).


Then set_page_dirty_lock would use PG_maplocked to get access to
->mapping, and then hold a reference on the address_space while
calling into balance_dirty_pages ... I wonder how you hold a reference
on an address space...

inode. Presumably PG_maplocked would pin it? I don't understand
why you've brought balance_dirty_pages into it, though.


There are presumably few pieces of code that change ->mapping.  Once
they all take PG_maplocked as well as PG_locked, you can start freeing
up other code to take PG_maplocked instead of PG_locked....

Does that make sense at all?  Do we have any spare page bits?

I'm sure it could be made to work, but I don't really see the point.
If someone really wanted to do it, I guess the right way to go is have
a PG_readin counterpart to PG_writeback (or even extend PG_writeback
to PG_io)...

--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com -
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux