Re: [patch 4/7] Immediate Values - i386 Optimization

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



* Mathieu Desnoyers ([email protected]) wrote:
> * Jeremy Fitzhardinge ([email protected]) wrote:
> > H. Peter Anvin wrote:
> > > Allowing different registers should be doable, but if so, one would have
> > > to put 0: at the *end* of the instruction and use (0f)-4 instead, since
> > > the non-%eax forms are one byte longer.
> > >   
> > 
> > OK, that's already a problem since its using "=r" as the constraint.
> > 
> > > This also seems "safer", since an imm32 is always the last thing in the
> > > instruction.
> > 
> > Good idea.  If gas/gcc generates entirely the wrong addressing mode,
> > then we've got bigger problems.
> > 
> 
> Ok, let's have a good look at what we want:
> 
> 1 - get a pointer to the beginning of the immediate value within the
>     instruction.
> 2 - make sure that the immediate value, within the instruction, is
>     written to atomically wrt all CPUs, even on older architectures
>     where non aligned writes are not atomic.
> 
> Effectively, placing a label at the end of the instruction, and then
> offsetting backward from there, will give us (1).
> 
> Then, for the alignment, we have to give a good look at the instruction
> set reference, mov instruction, to see what the variants of mov
> immediate value to register are on i386.
> 
> First, let's look at the possible prefixes:
> - Lock and repeat prefixes : no
> - Segment override prefixes: no memory reference there, so doesn't
>   apply.
> - Branch hints : no, it's a mov instruction
> - Operand-size override prefix: *yes*, can be used for 2 bytes mov
> - Address-size override prefix: no address in there, only immediate
>   value and register.
> 
> (looking at the Compat/Leg Mode)
> 
> 3 cases:
> 
> * 1 byte
> B0 + rb         MOV r8, imm8     (1 byte opcode)
> REX + B0 + rb   MOV r8, imm8     (only on 64 bits archs, never generated)
> C6 /0           MOV r/m8, imm8   (2 bytes opcode)
> (this one doesn't require alignment at all, since we do a 1 byte write)
> 
> * 2 bytes
> B8 + rw         MOV r16, imm16   (1 byte opcode)
> 66 B8 + rd      MOV r16, imm16   (2 bytes opcode) (with 66H prefix)
> C7 /0           MOV r/m16, imm16 (2 bytes opcode)
> (Alignment on 4 bytes boundaries would be required because of the
> possible 1 byte opcode ? Or is the 66H prefix mandatory there ? If it
> is, then we can safely align on 2 bytes boundaries.)
> 
> * 4 bytes
> B8 + rd         MOV r32, imm32   (1 byte opcode)
> C7 /0           MOV r/m32, imm32 (2 bytes opcode)
> (the 2 bytes opcode can be a problem)
> 
> I have missed anything ?
> 

This is not a problem in practice because we place a breakpoint and use
a bypass when we are updating immediate values that can be used
concurrently. So let's not bother about that too much. It just means
that the breakpoint approach might have to be used for more
architectures in the future to address that kind of issue.

Mathieu


-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux