Re: Remapping pages mapped to userspace (was: [PATCH 10 of 20] ipath - support for userspace apps using core driver)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 16 Mar 2006, Roland Dreier wrote:

> Anyway, I'd like to hijack this thread slightly, since we are close to
> a subject that I've been thinking about lately, and I'd like to take
> advantage of the expert's attention...

I look over my shoulder, ah yes, you're speaking to Alan, who's given
you a good reply.  But he's looking at it from a bus/driver perspective,
which I can't comment on, so I'll return to your original mail.

> My mthca driver (drivers/infiniband/hw/mthca) supports mapping some
> MMIO registers into userspace via io_remap_pfn_range() in a .mmap
> method.  I think I have that pretty well under control.
> 
> However, on a hot unplug event, when the underlying PCI device is
> going away, I would like to replace that mapping with a mapping (with
> a mapping to the zero page?), so that userspace accesses after the
> device is gone don't explode.  What's the "right" way to do that?

Nobody has asked for that before (so far as I know; or maybe I'm just
not looking at your question from the right angle), so I don't have
any canned right answer.

Sounds a reasonable thing to ask for, especially from the hotplug world.
But it's not how we've proceeded until now: things stay open until the
last user goes away, only then can the driver go away.

Thus a file remains open, and its filesystem busily mounted, until
the last user unmaps or closes it.

You seem to be asking to "revoke" an mmap: not unreasonable, but
we've never provided a revoke method before, and I fear there
might be gotchas.  Perhaps there's a different, more familiar
model I can feel more comfortable with...

Shall we look at it, instead, as like truncating a file?  You could
use vmtruncate, but you'll have nothing cached, so maybe just use
unmap_mapping_range (see mm/memory.c for args), which is also exported.
Only one caller on the inode(mapping) at once, please (on a regular
file, inode->i_sem is held; but that won't quite work with a device).

So far as I can see, that should work nicely (even, fortuitously,
despite Linus' special adjustment of vm_pgoff on VM_PFNMAP areas).
Though it looks like subsequent userspace accesses will then go to
do_anonymous_page, giving ZERO_PAGE to read faults or a fresh anon
page to write faults: I think I'd prefer it if pte_none accesses in
a VM_PFNMAP area gave SIGBUS, but unsure if we can change that now.

> Presumably it would be something like zeromap_page_range(), but that's
> not exported to modules.  Exporting that would be one option, but in a
> way that's overkill for me: I only have a single page to deal with, so
> I could also do something like
> 
> 	vm_insert_page(vma, addr, ZERO_PAGE(addr));
> 
> But do I have to do anything to kill the old mapping coming from
> remap_pfn_range()?  What's the exported API to do that?

vm_insert_page will fail with -EBUSY if there's something there
already; and though it looks as if it'll do the right thing with
ZERO_PAGE(addr) if the mapping was read-only, it'll very nastily
give write access to it if the mapping was read-write.

But I see no point in inserting pages there anyway: the important
part is removing the old ptes, which unmap_mapping_range should do.

> I can keep a list of vmas that have registers mapped to userspace and
> iterate through it on hot unplug.  The only question is what to do
> with those vmas.

unmap_mapping_range will work its own way through the prio_tree of
all vmas mapping your device: so I think it'll spare you some effort.

But it hasn't been used in this way before: I may be missing things.

Hugh
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux