Re: [Security] Signed divides vs shifts (Re: /dev/urandom uses uninit bytes, leaks user data)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Mon, 17 Dec 2007, Eric Dumazet wrote:
> 
> while 
> 
> long *mid(long *a, long *b)
> {
> 	return ((a - b) / 2u + a);
> }

This is exactly what I'm talking about. That "2u" is TOTALLY POINTLESS. 
It's an "unsigned int", but since (a-b) will be of type ptrdiff_t, and is 
*wider* on a 64-bit architecture (it's the same as "long" on x86-64), then 
the 2u will just be converted to "long", and be signed again! 

So you thought that you did an unsigned divide, but you did no such thing. 

If you change the "2u" to a "2ul", it works again, and you get

	mid:
	        movq    %rdi, %rax
	        subq    %rsi, %rax
	        sarq    %rax
	        andq    $-8, %rax
	        addq    %rdi, %rax
	        ret

which is the code you wanted. But quite frankly, you could just have 
written it with a shift to start with, and avoided the subtle type issue, 
although gcc then generates

        movq    %rdi, %rax
        subq    %rsi, %rax
        sarq    $4, %rax
        leaq    (%rdi,%rax,8), %rax
        ret

instead. Of course, this all *does* still have subtle sign issues, because 
the "a-b" part implies a signed divide in itself, which is why you see 
that "sarq" in he first place (rather than a "shrq"). 

Signed divides are hard. The "a-b" pointer subtraction is actually cheaper 
than a general signed divide by sizeof, since the compiler can then assume 
that the two pointers are mutually aligned, which is why gcc can generate 
just a single "sarq" instead of having to do an extra "add negative bit" 
thing to get the rounding right.

[ So Al, when you said that

	(a-b)

  is equivalent to

	((char *)a-(char *)b)/4

  for a "int *" a and b, you're right in the sense that the *result* is 
  the same, but the code generation likely isn't. The "a-b" thing can (and 
  does) allow the compiler to avoid the whole "align up for signed 
  numbers" thing, and the difference in code generation is clear:

	subq    %rsi, %rdi
	sarq    $2, %rdi

  vs

	subq    %rsi, %rdi
	leaq    3(%rdi), %rax
	testq   %rdi, %rdi
	cmovs   %rax, %rdi
	sarq    $2, %rdi

  exactly because the first case *knows* that the low two bits have to be 
  zero, and thus there is no rounding issue. ]

			Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux