[ Restricting discussion to the i386 bitops implementation. ]
Hi Nick,
On 7/23/07, Satyam Sharma <[email protected]> wrote:
Hi,
On 7/23/07, Nick Piggin <[email protected]> wrote:
> Linus Torvalds wrote:
> >
> > On Fri, 20 Jul 2007, Nick Piggin wrote:
> >
> >>So you did. Then to answer that, yes it could be faster because there are
> >>stupid volatiles sprinkled all over the bitops code so you could easily
> >>end up having to do more loads. Does it make a real difference? Unlikely,
> >>but David loves counting cycles :)
> >
> >
> > I thought we long long since removed the volatiles. They are buggy and
> > horrible, and we really want to let the compiler combine multiple
> > test-bits, and if they matter that implies locking is buggy or something
> > worse..
> >
> > Ie we'd *want*
> >
> > if (test_bit(x, y) || test_bit(z,y))
> >
> > to be rewritten by the compiler as testing bits x/z at the same time.
>
> Yep. We'd also want __set_bit(x, y); __set_bit(z, y); and such to be
> combined.
BTW I'm also running some tests writing test code, compiling and verifying
the code gcc generates ... curiously, volatile-access-casting of the passed
bit-string address is not the only thing that'll prevent gcc's optimizer from
combining the operations such as ones you listed above. Then there are
-O2 vs -Os (and constant_test_bit() vs variable_test_bit()) differences I am
observing ... and sometimes just the inadequacy of gcc's optimizer -- note
that constant_test_bit() seems to go through extra hoops unnecessarily to
avoid honouring @nr >= 32, whereas none of the other primitives in that file
does that. So the i386 kernel's stock constant_test_bit implementation
ends up differing from David's open-coded versions quite drastically in
subtle ways, and again makes it difficult to combine the kind of operations
you guys are discussing here ...
It's a given, of course, that the code that gcc generates when combining
such operations would clearly be more optimal than the simple btl-sbbl pair
with test-and-conditional-jumps that would otherwise get generated ...
> > But now I'm too scared to look.
>
> Not a chance :) Even the asm-generic "reference" implementation ratifies
> the volatile crapiness. Would you take a patch?
Coincidentally, I'm working on a cleanup of the bitops code just now --
I stumbled upon a lot of varied bogosity in there :-)
Such as bogus/invalid asm constraints being passed in the inline assembly.
Probably gcc knows everybody gets its complicated extended asm wrong,
so doesn't barf when parsing such stuff ... :-)
Intend to send it
out in a couple of hours, probably.
Satyam
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]