Hello, Nick.
Nick Piggin wrote:
Tejun Heo wrote:
As I've said in the other reply, I'll first rip off three stage init
thing for cpucontrol. That added pretty much complexity to it. And
with the weird naming scheme, please tell me how to fix it. I have no
problem renaming things.
OK, my criticism of your naming was not constructive so I apologise
for that. I willll help you with some of those minor issues if we
establish that your overall design is a a goer.
Thanks. :-)
What would be wrong with an array of NR_CPUS rwsems? The only
tiny trick you would have to do AFAIKS is have up_read remember
what rwsem down_read took, but that could be returned from
down_read as a token.
Using array of rwsems means that every reader-side ops performs
(unnecessary) real down and up operations on the semaphore which
involve atomic memory op and probably memory barrier. These are
pretty expensive operations.
What brsem tries to do is implementing rwsem semantics while
performing only normal (as opposed to atomic/barrier) intstructions
during reader-side operations. That's why all the call_on_all_cpus
stuff is needed while write-locking. Do you think avoiding
atomic/barrier stuff would be an overkill?
Yes I think so. I think the main problem on modern CPUs is
not atomic operations as such, but cacheline bouncing.
Without any numbers, I'd guess that your down_read is more
expensive than mine simply due to touching more cachelines
and having more branches.
Other than local_bh_disable/enable(), all brsem read ops do are
1. accessing sem->idx
2. calculate per-cpu rcnt address from sem->idx
3. do one branch on the value of per-cpu rcnt
4. inc/dec per-cpu rcnt
So, it does access one more cachline and, yeap, there is one branch
for bh enabling and several more inside local_bh_enable. I'll try to
get some benchmark numbers for comparison.
I'm thinking about adding down_read(&xxx->s_umount) to write(2) and
compare normal rwsem and brsem performance by repeitively writing short
data into a file on a UP machine. Do you have better ideas?
The other thing is simply that you really want your
synchronization primitives to be as simple and verifiable
as possible. For example rwsems even recently have had subtle
memory ordering and other implemntation corner cases, and
they're much less complex than this brsem.
Nick
Thanks.
--
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
|
|