Re: Relayfs Question: Use of relay_reset(). Potential race?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 19, 2005 at 10:40:23AM +1000, Kingsley Cheung wrote:
> On Mon, Apr 18, 2005 at 10:56:57AM -0500, Tom Zanussi wrote:
> > Kingsley Cheung writes:
> >  > On Wed, Mar 23, 2005 at 08:02:54PM +1100, [email protected] wrote:
> >  > > Hi 
> >  > > 
> >  > > I'm using relayfs to relay data from a kernel module to user space on
> >  > > a SuSE 2.6.5 kernel.  I'm not absolutely sure what version of relayfs
> >  > > has been back ported to it.
> >  > 
> >  > Hi Tom,
> >  > 
> >  > Could you please have a look at the following use of relay_reset() in
> >  > a kernel module as follows (compiled against pre-redux relayfs):
> (snip)
> >  > Is that legitimate?  The reason I ask is because I've been seeing
> > 
> > Yes, you should be able to reset the channel here, since at that point
> > it's been closed.
> > 
> >  > garbled oopses with keventd and I've narrowed it to two things:
> >  > 
> >  > 1) Inadequate locking on my part in the kernel module, which I have
> >  > addressed separately.
> >  > 
> >  > 2) A race with relay_reset() and keventd, which is probably of
> >  > interest to you if you're still maintaining the pre-redux patches.
> >  > 
> >  > The race is due to the use of INIT_WORK in _reset_relay():
> >  > 
> >  > INIT_WORK(&rchan->wake_readers, NULL, NULL);
> >  > INIT_WORK(&rchan->wake_writers, NULL, NULL);
> >  > 
> >  > However, at the time relay_reset() is called, it is possible that
> >  > these work structures are still being used by keventd when under heavy
> >  > loads.  The workaround I've used to fix this is to call
> >  > flush_scheduled_work() before calling reset_relay() in the kernel
> >  > module.  Perhaps that needs to be called in relay_reset() or
> >  > _relay_reset()?
> 
> Tom,
> 
> Thanks for the prompt response. 
> 
> > Yes, flush_scheduled_work() should probably be called from
> > __relay_reset() - thanks for catching this and suggesting the fix.

Tom,

Sorry about this, but the fix only works if schedule_work() is used
instead of schedule_delayed_work() for the pre-redux relayfs patch.  I
only realised this recently since I have been using the "flush" fix on
a pre-redux relayfs port to 2.4 which doesn't doesn't have an
equivalent of schedule_delayed_work() and have only tried it on 2.6
today.

Using flush_scheduled_work() with schedule_delayed_work() doesn't work
since schedule_delay_work() uses a timer to queue work at the
appropriate time.  Although all uses of schedule_delay_work() in
relayfs adds work to the queue a tick later, this time delay is enough
for flush_schedule_work() to flushes what is presently on the queue
before the timer actually expires.  eventd would then attempt to use a
work_struct that has then been initialised by relay_reset().

With schedule_work() is instead of schedule_delayed_work() I haven't
seen the race happen.

Thanks,
-- 
		Kingsley
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux