Re: [patch] cpufreq: mark cpufreq_tsc() as core_initcall_sync

On Tue, Nov 21, 2006 at 12:56:21PM -0500, Alan Stern wrote:
> Here's another potential problem with the fast path approach.  It's not
> very serious, but you might want to keep it in mind.
> 
> The idea is that a reader can start up on one CPU and finish on another,
> and a writer might see the finish event but not the start event.  For
> example:
> 
> 	Reader A enters the critical section on CPU 0 and starts
> 	accessing the old data area.
> 
> 	Writer B updates the data pointer and starts executing
> 	srcu_readers_active_idx() to check if the fast path can be
> 	used.  It sees per_cpu_ptr(0)->c[idx] == 1 because of
> 	Reader A.
> 
> 	Reader C runs srcu_read_lock() on CPU 0, setting
> 	per_cpu_ptr(0)->c[idx] to 2.
> 
> 	Reader C migrates to CPU 1 and leaves the critical section;
> 	srcu_read_unlock() sets per_cpu_ptr(1)->c[idx] to -1.
> 
> 	Writer B finishes the cpu loop in srcu_readers_active_idx(),
> 	seeing per_cpu_ptr(1)->c[idx] == -1.  It computes sum =
> 	1 + -1 == 0, takes the fast path, and exits immediately
> 	from synchronize_srcu().
> 
> 	Writer B deallocates the old data area while Reader A is still
> 	using it.
> 
> This requires two context switches to take place while the cpu loop in
> srcu_readers_active_idx() runs, so perhaps it isn't realistic.  Is it
> worth worrying about?

Thank you -very- -much- for finding the basis behind my paranoia!
I guess my intuition is still in good working order.  ;-)

It might be unlikely, but that makes it even worse -- a strange memory
corruption problem that happens only under heavy load, and even then only
sometimes.  No thank you!!!

I suspect that this affects Jens as well, though I don't claim to
completely understand his usage.

One approach to get around this would be for the "idx" returned from
srcu_read_lock() to keep track of the CPU as well as the index within
the CPU, so that srcu_read_unlock() decrements the same counter that
srcu_read_lock() incremented.  Because that counter might belong to a
different CPU by then, this would require atomic_inc()/atomic_dec() on
the fast path, but would not add much to the overhead on x86 because the
smp_mb() imposes an atomic operation anyway.  There would be little cache
thrashing in the case where there is no preemption -- but in the case
where the readers almost always sleep, and where it is common for the
srcu_read_unlock() to run on a different CPU than the srcu_read_lock(),
the additional cache thrashing could add significant overhead.
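
A rough user-space sketch of that idea, using C11 atomics in place of
atomic_inc()/atomic_dec().  The cookie layout (CPU in the upper bits,
index in the low bit) and the names sim_read_lock(), sim_read_unlock(),
and replay_with_cookie() are assumptions made for illustration, not an
actual patch, and the sketch only replays the one interleaving described
above rather than showing the scheme to be correct in general:

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS 2

/* Hypothetical per-CPU counter pairs, standing in for c[2] on each CPU. */
static atomic_int c[NR_CPUS][2];

/* Increment this CPU's counter and return a cookie naming that counter. */
static int sim_read_lock(int cpu, int idx)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	assert(idx == 0 || idx == 1);
	atomic_fetch_add(&c[cpu][idx], 1);
	return (cpu << 1) | idx;	/* assumed encoding: CPU plus index */
}

/* Decrement the counter named by the cookie, whatever CPU we run on now. */
static void sim_read_unlock(int cookie)
{
	int cpu = cookie >> 1;
	int idx = cookie & 1;

	atomic_fetch_sub(&c[cpu][idx], 1);
}

/* Same interleaving as before; returns the sum the writer computes. */
static int replay_with_cookie(void)
{
	int sum = 0;
	int a, rc;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		for (int idx = 0; idx < 2; idx++)
			atomic_store(&c[cpu][idx], 0);

	a = sim_read_lock(0, 0);	/* Reader A on CPU 0: c[0][0] == 1 */
	sum += atomic_load(&c[0][0]);	/* writer samples CPU 0, sees 1 */
	rc = sim_read_lock(0, 0);	/* Reader C on CPU 0: c[0][0] == 2 */
	sim_read_unlock(rc);		/* C exits after migrating: c[0][0] == 1 */
	sum += atomic_load(&c[1][0]);	/* writer samples CPU 1, sees 0 */
	sim_read_unlock(a);		/* Reader A finally leaves */
	return sum;
}
```

Here replay_with_cookie() returns 1 rather than 0, so the writer in this
particular scenario would not take the fast path while Reader A is still
running.  The price is that Reader C's unlock touches CPU 0's cache line
from another CPU, which is the thrashing concern mentioned above.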

Thoughts?

							Thanx, Paul
