On Fri, Sep 21, 2007 at 07:23:09PM -0400, Steven Rostedt wrote:
> --
> On Fri, 21 Sep 2007, Paul E. McKenney wrote:
>
> > If you do a synchronize_rcu() it might well have to wait through the
> > following sequence of states:
> >
> > Stage 0: (might have to wait through part of this to get out of "next" queue)
> > rcu_try_flip_idle_state, /* "I" */
> > rcu_try_flip_waitack_state, /* "A" */
> > rcu_try_flip_waitzero_state, /* "Z" */
> > rcu_try_flip_waitmb_state /* "M" */
> > Stage 1:
> > rcu_try_flip_idle_state, /* "I" */
> > rcu_try_flip_waitack_state, /* "A" */
> > rcu_try_flip_waitzero_state, /* "Z" */
> > rcu_try_flip_waitmb_state /* "M" */
> > Stage 2:
> > rcu_try_flip_idle_state, /* "I" */
> > rcu_try_flip_waitack_state, /* "A" */
> > rcu_try_flip_waitzero_state, /* "Z" */
> > rcu_try_flip_waitmb_state /* "M" */
> > Stage 3:
> > rcu_try_flip_idle_state, /* "I" */
> > rcu_try_flip_waitack_state, /* "A" */
> > rcu_try_flip_waitzero_state, /* "Z" */
> > rcu_try_flip_waitmb_state /* "M" */
> > Stage 4:
> > rcu_try_flip_idle_state, /* "I" */
> > rcu_try_flip_waitack_state, /* "A" */
> > rcu_try_flip_waitzero_state, /* "Z" */
> > rcu_try_flip_waitmb_state /* "M" */
> >
> > So yes, grace periods do indeed have some latency.
>
> Yes they do. I'm now at the point that I'm just "trusting" you that you
> understand that each of these stages are needed. My IQ level only lets me
> understand next -> wait -> done, but not the extra 3 shifts in wait.
>
> ;-)
In the spirit of full disclosure, I am not -absolutely- certain that
they are needed, only that they are sufficient. Just color me paranoid.
> > > True, but the "me" confused me. Since that task struct is not me ;-)
> >
> > Well, who is it, then? ;-)
>
> It's the app I watch sitting there waiting it's turn for it's callback to
> run.
:-)
> > > > > Isn't the GP detection done via a tasklet/softirq. So wouldn't a
> > > > > local_bh_disable be sufficient here? You already cover NMIs, which would
> > > > > also handle normal interrupts.
> > > >
> > > > This is also my understanding, but I think this disable is an
> > > > 'optimization' in that it avoids the regular IRQs from jumping through
> > > > these hoops outlined below.
> > >
> > > But isn't disabling irqs slower than doing a local_bh_disable? So the
> > > majority of times (where irqs will not happen) we have this overhead.
> >
> > The current code absolutely must exclude the scheduling-clock hardirq
> > handler.
>
> ACKed,
> The reasoning you gave in Peter's reply most certainly makes sense.
>
> > > > > > + *
> > > > > > + * If anyone is nuts enough to run this CONFIG_PREEMPT_RCU implementation
> > > > >
> > > > > Oh, come now! It's not "nuts" to use this ;-)
> > > > >
> > > > > > + * on a large SMP, they might want to use a hierarchical organization of
> > > > > > + * the per-CPU-counter pairs.
> > > > > > + */
> > > >
> > > > Its the large SMP case that's nuts, and on that I have to agree with
> > > > Paul, its not really large SMP friendly.
> > >
> > > Hmm, that could be true. But on large SMP systems, you usually have a
> > > large amounts of memory, so hopefully a really long synchronize_rcu
> > > would not be a problem.
> >
> > Somewhere in the range from 64 to a few hundred CPUs, the global lock
> > protecting the try_flip state machine would start sucking air pretty
> > badly. But the real problem is synchronize_sched(), which loops through
> > all the CPUs -- this would likely cause problems at a few tens of
> > CPUs, perhaps as early as 10-20.
>
> hehe, From someone who's largest box is 4 CPUs, to me 16 CPUS is large.
> But I can see hundreds, let alone thousands of CPUs would make a huge
> grinding halt on things like synchronize_sched. God, imaging if all CPUs
> did that approximately at the same time. The system would should a huge
> jitter.
Well, the first time the SGI guys tried to boot a 1024-CPU Altix, I got
an email complaining about RCU overheads. ;-) Manfred Spraul fixed
things up for them, though.
Thanx, Paul
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]