Hugh Dickins wrote:
On Thu, 3 Nov 2005, Benjamin Herrenschmidt wrote:
Take a look at Andrew's educational comment on set_page_dirty_lock
in mm/page-writeback.c. You do have the list of pages you need to
page_cache_release, don't you? So it should be easy to dirty them.
Ok, so just passing 'write' to get_user_pages() is good enough; right ?
Not quite, I think: you need to pass 'write' to get_user_pages()
initially; but at the end, if it was indeed writing into user space,
you need to do the set_page_dirty_lock thing on each of the pages
before page_cache_release, just in case a race cleaned them before
the DMA completed. I think (I've never used it myself).
Unfortunately at least for our use set_page_dirty{_lock} has an
unfortunate feature that it schedules writeback immediately.
get_user_pages() through __follow_page() calls set_page_dirty() only
if pte_dirty() bit is not set (I have no idea why it just does not
set pte dirty bit instead of doing set_page_dirty(), but there must
be some reason, yes?), so under normal condition page is marked
dirty in the page's structure only if it was not marked dirty
on pte level before. This way page itself is marked dirty only after
somebody copies dirty bit from page tables to the page structure, which
can take a lot of time.
On other side when you do set_page_dirty(), page is in few seconds
written back to the disk - causing quite visible I/O load if you
perform get_user_pages()/set_page_dirty() when compared with
situation where you just mark PTE dirty after you are done with
page.
(for those interested, in the situation described above we are
doing get_user_pages() on the file mapped by MAP_SHARED to the
user's address space to get physical address of these pages,
then virtual machine monitor uses this physical address to fill
guest's pagetables, and later (once guest is done with page) we
mark page dirty and release page; performance difference between
set_page_dirty() and home grown ptep_set_dirty() is more than
visible... but AGP memory in question is probably not
backed up by some writeable file, so it does not make difference
here)
Petr
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]