On Mon, Nov 06, 2006 at 07:23:38AM +0100, Manfred Spraul wrote:
> Paul E. McKenney wrote:
>
> >I also don't understand why the code in sys_msgrcv() doesn't have
> >to remap the msqid, similar to the way it is done in sys_semtimedop().
> >
> What do you mean with remap?
What ipc_lock() does, more or less.
> >So, what am I missing here? How does a msgrcv() racing with an rmid()
> >avoid taking a lock on a message queue that just got freed? (The
> >ipc_lock_by_ptr() in "Lockless receive, part 3".) My concern is the
> >following sequence of steps:
> >
> >o expunge_all() invokes wake_up_process() and sets r_msg.
> >
> >o sys_msgrcv() is awakened, but for whatever reason does
> > not actually start executing (e.g., lots of other busy
> > processes at higher priority).
> >
> >o expunge_all() returns to freeque(), which runs through the
> > rest of its processing, finally calling ipc_rcu_putref().
> >
> >o ipc_rcu_putref() invokes call_rcu() to free the message
> > queue after a grace period.
> >
> >o ipc_immediate_free() is invoked at the end of a grace
> > period, freeing the message queue.
> >
> >o sys_msgrcv() finally gets a chance to run, and does an
> > rcu_read_lock() -- but too late!!!
> >
> Not too late:
> sys_msgrcv() checks msr_d.r_msr, notices that the value is -EIDRM and
> returns to user space with -EIDRM immediately. This codepath
> doesn't touch the message queue pointer, thus it doesn't matter that the
> message queue is already freed.
> The code only touches the message queue pointer if msr_d.r_msr
> is -EAGAIN - and the rcu_read_lock() guarantees there is no rcu grace
> period between the test for -EAGAIN and the ipc_lock_by_ptr.
> Thus this should be safe.
OK, seems like it does handle that scenario. Sorry for the noise!
And the other possible scenario, where the wakeup happens before the
assignment of NULL to msr_d.r_msr, be prevented by the two rounds of
rq lock (one in try_to_wake_up(), the other when the task actually
starts running).
Thanx, Paul
> But back to the oops:
> The oops happens in expunge_all, called from sys_msgctl.
> Thus it must be an msgctl(IPC_SET).
> IPC_SET is special: it calls expunge_all(-EAGAIN): that's necessary
> because IPC_SET can change the permissions.
> Unfortunately, faked doesn't use IPC_SET at all :-(
>
> Falk - could you strace your "fakeroot ls" test? Are there any IPC_SET
> calls?
> Which gcc version do you use? Is it possible that gcc auto-inlined
> something?
>
> --
> Manfred
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]