On Mon, 1 Aug 2005, Hugh Dickins wrote:
>
> > Aside, that brings up an interesting question - why should readonly
> > mappings of writeable files (with VM_MAYWRITE set) disallow ptrace
> > write access while readonly mappings of readonly files not? Or am I
> > horribly confused?
>
> Either you or I. You'll have to spell that out to me in more detail,
> I don't see it that way.
We have always just done a COW if it's read-only - even if it's shared.
The point being that if a process mapped did a read-only mapping, and a
tracer wants to modify memory, the tracer is always allowed to do so, but
it's _not_ going to write anything back to the filesystem. Writing
something back to an executable just because the user happened to mmap it
with MAP_SHARED (but read-only) _and_ the user had the right to write to
that fd is _not_ ok.
So VM_MAYWRITE is totally immaterial. We _will_not_write_ (and must not do
so) to the backing store through ptrace unless it was literally a writable
mapping (in which case VM_WRITE will be set, and the page table should be
marked writable in the first case).
So we have two choices:
- not allow the write at all in ptrace (which I think we did at some
point)
This ends up being really inconvenient, and people seem to really
expect to be able to write to readonly areas in debuggers. And doign
"MAP_SHARED, PROT_READ" seems to be a common thing (Linux has supported
that pretty much since day #1 when mmap was supported - long before
writable shared mappings were supported, Linux accepted MAP_SHARED +
PROT_READ not just because we could, but because Unix apps do use it).
or
- turn a shared read-only page into a private page on ptrace write
This is what we've been doing. It's strange, and it _does_ change
semantics (it's not shared any more, so the debugger writing to it
means that now you don't see changes to that file by others), so it's
clearly not "correct" either, but it's certainly a million times better
than writing out breakpoints to shared files..
At some point (for the longest time), when a debugger was used to modify a
read-only page, we also made it writable to the user, which was much
easier from a VM standpoint. Now we have this "maybe_mkwrite()" thing,
which is part of the reason for this particular problem.
Using the dirty flag for a "page is _really_ writable" is admittedly kind
of hacky, but it does have the advantage of working even when the -real-
write bit isn't set due to "maybe_mkwrite()". If it forces the s390 people
to add some more hacks for their strange VM, so be it..
[ Btw, on a totally unrelated note: anybody who is a git user and looks
for when this maybe_mkwrite() thing happened, just doing
git-whatchanged -p -Smaybe_mkwrite mm/memory.c
in the bkcvs conversion pinpoints it immediately. Very useful git trick
in case you ever have that kind of question. ]
I added Martin Schwidefsky to the Cc: explicitly, so that he can ping
whoever in the s390 team needs to figure out what the right thing is for
s390 and the dirty bit semantic change. Thanks for pointing it out.
Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
|
|