Re: [PATCH] cpuset semaphore depth check optimize

Roman wrote:
> Sorry, I thought it was more obvious... :)

Likely it was obvious.  On the other hand, I wouldn't be in this mess
if I understood everything I should here.

Most of what you wrote this time actually made sense to me as I read
it.  Thanks for taking the time.

One part left to discuss and then I will put this aside for a few
days while I focus on my day job.


Roman, replying to pj:
> > > I don't think that works at all.  Consider the following sequence:
> > > 	1) ...
> > > 	4) Oops - we just missed doing a release.
> > 
> > If you want to do it like this, just add an else before the last 
> > atomic_dec().
> > The main point I was trying to make is that clearing tsk->cpuset doesn't 
> > require the spinlock, as long as you check for dead tasks in 
> > attach_task(). Ignore the rest if it's too confusing.

I don't think confusion is the problem here.  I think your proposal was
wrong.

There are *three* ways to get to a cpuset:
 1) via some task->cpuset pointer, 
 2) via that cpuset's path below /dev/cpuset, and
 3) walking up the parent chain from a child.

In the first half of your line:

> > 	if (atomic_read(&cs->count) == 1 && notify_on_release(cs)) {

if perchance cs->count was one, then for an instant no other task
was using this cpuset, and it had no children.  But you can still get
to the cpuset via its /dev/cpuset path.

So by the time we get to the second half of this line, where we check
for "notify_on_release(cs)", all hell could have broken loose, and
there might be 17 tasks using this selfsame cpuset, and 19 child
cpusets of this cpuset.  These interlopers could have arrived by
accessing the cpuset via its path below /dev/cpuset.
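
To make that window concrete, here is a rough sketch in kernel-style C,
using only the names already in this thread (cs->count,
notify_on_release()) -- an illustration of the gap, not the proposed
patch:

        /* Illustration only: the gap between the two halves of the test. */
        static void sketch_of_the_window(struct cpuset *cs)
        {
                if (atomic_read(&cs->count) == 1  /* idle for an instant */
                    /*
                     * Window: before the second half of the test runs,
                     * other tasks can reach this same cpuset via its
                     * path below /dev/cpuset, attach themselves to it,
                     * or mkdir children under it.
                     */
                    && notify_on_release(cs)) {
                        /* "about to go idle" may already be stale here */
                }
        }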

The flip side is just as plausible.  We cannot, in any case, execute
an unguarded atomic_dec on cpuset->count if that cpuset has been marked
notify_on_release and is reachable by any of the three ways above,
because the decrement could take the count to zero and we'd miss
issuing a release notification.
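
The shape that avoids this, roughly, is to make the zero-crossing and
the release check one step under cpuset_sem.  A sketch only, not the
actual patch -- check_for_release() here is just a stand-in name for
whatever kicks off the usermode notification:

        static void sketch_guarded_put(struct cpuset *cs)
        {
                if (notify_on_release(cs)) {
                        down(&cpuset_sem);
                        if (atomic_dec_and_test(&cs->count))
                                check_for_release(cs);  /* hit zero under the sem */
                        up(&cpuset_sem);
                } else {
                        /* not notify_on_release, so a missed
                         * zero-crossing doesn't matter */
                        atomic_dec(&cs->count);
                }
        }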

Actually, even the single check that is in the code now:

        if (notify_on_release(cs)) {

would be bogus if we required instruction-level synchronization, but it
is racing on an attribute set by some poor schlob user, so the kernel
need only preserve ordering at the system call level, not at the machine
instruction level.

Putting that part aside, why would you make a point of stating that
"clearing tsk->cpuset doesn't require the spinlock"?  I don't take
cpuset_sem when I clear tsk->cpuset, so why would you think I'd take
this new spinlock instead?
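
For the record, the exit-path shape being described looks roughly like
the sketch below -- assuming the usual task_lock()/task_unlock() pair is
what keeps the pointer swap sane against attach_task(), with the guarded
decrement from the earlier sketch doing the rest.  A sketch of the
shape, not a quote of the code:

        static void sketch_exit_path(struct task_struct *tsk)
        {
                struct cpuset *cs;

                task_lock(tsk);
                cs = tsk->cpuset;
                tsk->cpuset = NULL;     /* no cpuset_sem taken here */
                task_unlock(tsk);

                sketch_guarded_put(cs); /* as sketched above */
        }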

-- 
                  I won't rest till it's the best ...
                  Programmer, Linux Scalability
                  Paul Jackson <[email protected]> 1.925.600.0401