Re: Hyper-Threading Vulnerability

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 13 May 2005, Andy Isaacson wrote:

> On Fri, May 13, 2005 at 09:05:49PM +0200, Andi Kleen wrote:
> > On Fri, May 13, 2005 at 02:38:03PM -0400, Richard F. Rebel wrote:
> > > Why?  It's certainly reasonable to disable it for the time being and
> > > even prudent to do so.
> > 
> > No, i strongly disagree on that. The reasonable thing to do is
> > to fix the crypto code which has this vulnerability, not break
> > a useful performance enhancement for everybody else.
> 
> Pardon me for saying so, but that's bullshit.  You're asking the crypto
> guys to give up a 5x performance gain (that's my wild guess) by giving
> up all their data-dependent algorithms and contorting their code wildly,
> to avoid a microarchitectural problem with Intel's HT implementation.

i think your wild guess is way off.  i can think of several approaches to 
fix these problems which won't be anywhere near 5x.

the problem is that an attacker can observe which cache indices (rows) are 
in use.  one workaround is to overload the possible secrets which each 
index represents.

you can overload the secrets in each cache line:  for example when doing 
exponentiation there is an array of bignums x**(2*n).  bignums themselves 
are arrays (which span multiple cache lines).  do a "row/column transpose" 
on this array of arrays -- suddenly each cache line contains a number of 
possible secrets.  if you're operating with 32-bit words in a 64 byte line 
then you've achieved a 16-fold reduction in exposed information by this 
transpose.  there'll be almost no performance penalty.

you can overload the secrets in each cache index:  abuse the associativity 
of the cache.  the affected processors are all 8-way associative.  
ideally you'd want to arrange your data so that it all collides within the 
same cache index -- and get an 8-fold reduction in exposure.  the trick 
here is the L2 is physically indexed, and userland code can perform only 
virtual allocations.  but it's not too hard to discover physical conflicts 
if you really want to (using rdtsc) -- it would be done early in the 
initialization of the program because it involves asking for enough memory 
until the kernel gives you enough colliding pages.  (a system call could 
help with this if we really wanted it.)

my not-so-wild guess is a 128-fold reduction for less than 10% perf hit...

i think there's possibly another approach involving a permuted array of 
indirection pointers... which is going to affect perf a bit due to the 
extra indirection required, but we're talking <10% here.  (i'm just not 
convinced yet you can select a permutation in a manner which doesn't leak 
information when the attacker can view multiple invocations of the crypto 
for example.)


> If SHA has plaintext-dependent memory references, Colin's technique
> would enable an adversary to extract the contents of the /dev/random
> pools.  I don't *think* SHA does, based on a quick reading of
> lib/sha1.c, but someone with an actual clue should probably take a look.

the SHA family do not have any data-dependencies in their memory access 
patterns.

-dean
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux