Re: [RFC][PATCH 0/4] Redesign cpu_hotplug locking.

On Sun, Aug 27, 2006 at 12:42:13AM -0700, Andrew Morton wrote:
> On Sun, 27 Aug 2006 12:41:16 +0530
> Dipankar Sarma <[email protected]> wrote:
> 
> > On Sat, Aug 26, 2006 at 11:46:18PM -0700, Andrew Morton wrote:
> > > Yes you do.  Please, read _cpu_up(), _cpu_down() and the example in
> > > workqueue_cpu_callback().  It's really very simple.
> > 
> > What are you talking about here ?
> 
> Did you look?  workqueue_mutex is used to protect per-cpu workqueue
> resources.  The lock is taken prior to modification of per-cpu resources
> and is released after their modification.  Very very simple.

I did and there is no lock named workqueue_mutex. workqueue_cpu_callback()
is farily simple and doesn't have the issues in cpufreq that
we are talking about (lock_cpu_hotplug() in cpu callback path).

> 
> > That is the write side. You are
> > *not* supposed to do lock_cpu_hotplug() in cpu callbacks paths AFAICT. 
> 
> The workqueue code doesn't use lock_cpu_hotplug().

And that is the right thing to do.


> > I am talking about readsides here - you read cpu_online_map and
> > block then reuse the map and make some calls to another subsystem
> > that may again do a similar read-side cpu_hotplug lock.
> 
> Two unrelated subsystems which have both independent and interdependent CPU
> hotplug locking requirements and neither of which can protect per-cpu
> resources via preempt_disable()?  Sounds unlikely and undesirable.

I would worry about situations where we have to use set_cpus_allowed()
with cpumasks. IIRC, those weren't trivial to handle and can happen
due to interaction between unrelated subsystems one using services
of the other - rtasd -> set_cpus_allowed() for example.

> > 1. If you are in cpu hotplug callback path, don't take any lock.
> 
> That rule is wrong.  The CPU_UP_PREPARE and CPU_DOWN_PREPARE notification
> entrypoints are the logical place for a subsystem to lock any per-cpu resources
> which another thread/cpu might presently be using.

I meant lock_cpu_hotplug(), not any lock. Of course, susbsystems
may need to use their own lock there to handle per-cpu data there.


> > 2. If you are in a non-hotplug path reading cpu_online_map and you don't
> >    block, you just disable preemption and you are safe from hotplug.
> 
> Sure.
> 
> > 3. If you are in a non-hotplug path and you use cpu_online_map and
> >    you *really* need to block, you use lock_cpu_hotplug() or 
> >    cpu_hotplug_disable whatever it is called.
> > 
> > Is this too difficult for people to follow ?
> 
> Apparently.  What's happening is that lock_cpu_hotplug() is seen as some
> amazing thing which will prevent an *event* from occurring.
> 
> There's an old saying "lock data, not code".  What data is being locked
> here?  It's the subsystem's per-cpu resources which we want to lock.  We
> shouldn't consider the lock as being some way of preventing an event from
> happening.

That is what I argued for earlier, but I was given some examples
where they really needed to disable the asynchronous event of
cpu hotplug - otherwise they would have need to use very complex
multi-layer locking.

> > > > seem to have just got lazy with lock_cpu_hotplug().
> > > 
> > > That's because lock_cpu_hotplug() purports to be some magical thing which
> > > makes all your troubles go away.
> > 
> > No it doesn't. Perhaps we should just document the rules better
> > and put some static checks for people to get it right.
> 
> Yes, we could probably fix cpufreq using the existing lock_cpu_hotplug(). 
> But we have a quite large amount of racy-wrt-cpu-hotplug code in the kernel
> and although a lot of it can be fixed with preempt_disable(), it's possible
> that we'll get into scalability problems.
> 
> If we do have scalability problems, they can be fixed on a per-subsystem
> basis: the affected subsystem can use per-cpu locking of its per-cpu data
> within its CPU_UP_PREPARE and CPU_DOWN_PREPARE handlers.  That's a local,
> contained issue, and addressing it this way is better than inventing (and
> debugging) some fancy new lock type.

I would suggest an audit of lock_cpu_hotlpug() users to start with.

Thanks
Dipankar
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: [RFC][PATCH 0/4] Redesign cpu_hotplug locking.
  - From: Andrew Morton <[email protected]>

References:
- [RFC][PATCH 0/4] Redesign cpu_hotplug locking.
  - From: Gautham R Shenoy <[email protected]>
- Re: [RFC][PATCH 0/4] Redesign cpu_hotplug locking.
  - From: Andrew Morton <[email protected]>
- Re: [RFC][PATCH 0/4] Redesign cpu_hotplug locking.
  - From: Dave Jones <[email protected]>
- Re: [RFC][PATCH 0/4] Redesign cpu_hotplug locking.
  - From: Linus Torvalds <[email protected]>
- Re: [RFC][PATCH 0/4] Redesign cpu_hotplug locking.
  - From: Andrew Morton <[email protected]>
- Re: [RFC][PATCH 0/4] Redesign cpu_hotplug locking.
  - From: Dipankar Sarma <[email protected]>
- Re: [RFC][PATCH 0/4] Redesign cpu_hotplug locking.
  - From: Andrew Morton <[email protected]>
- Re: [RFC][PATCH 0/4] Redesign cpu_hotplug locking.
  - From: Dipankar Sarma <[email protected]>
- Re: [RFC][PATCH 0/4] Redesign cpu_hotplug locking.
  - From: Andrew Morton <[email protected]>

Prev by Date: Re: wrt: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Next by Date: [PATCH 2.6.18-rc4 0/5] drivers/ieee1394/sbp2: fixes for 2.6.18-rc
Previous by thread: Re: [RFC][PATCH 0/4] Redesign cpu_hotplug locking.
Next by thread: Re: [RFC][PATCH 0/4] Redesign cpu_hotplug locking.
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]