Andi Kleen wrote:
On Wednesday 22 March 2006 07:30, Chris Wright wrote:
-#define load_TR_desc() __asm__ __volatile__("ltr %w0"::"q" (GDT_ENTRY_TSS*8))
-#define load_LDT_desc() __asm__ __volatile__("lldt %w0"::"q" (GDT_ENTRY_LDT*8))
-
-#define load_gdt(dtr) __asm__ __volatile("lgdt %0"::"m" (*dtr))
-#define load_idt(dtr) __asm__ __volatile("lidt %0"::"m" (*dtr))
-#define load_tr(tr) __asm__ __volatile("ltr %0"::"mr" (tr))
-#define load_ldt(ldt) __asm__ __volatile("lldt %0"::"mr" (ldt))
-
-#define store_gdt(dtr) __asm__ ("sgdt %0":"=m" (*dtr))
-#define store_idt(dtr) __asm__ ("sidt %0":"=m" (*dtr))
-#define store_tr(tr) __asm__ ("str %0":"=mr" (tr))
-#define store_ldt(ldt) __asm__ ("sldt %0":"=mr" (ldt))
These are all very infrequent except perhaps LLDT. I suspect trapping would
work too. But ok.
Yes, trapping works fine. Even LLDT is infrequent. But, you do impose
a very large amount of complexity on the hypervisor by trapping on any
instruction. Suppose you wanted to write a minimal hypervisor, which
consisted of pretty much a wrapper based on the Xen or VMI interface
that just stole a couple pages of physical memory, and hooked trap
handlers by hooking the call out for lidt.
If you instead remove the hypervisor wrapper for lidt, and require the
hypervisor to trap and emulate it, you have just imposed an insidious
amount of overhead on it. It doesn't seem too much at first - trap and
emulate, right?
No. First, you have to create a special #GP handler for the general
protection fault. But the fault doesn't tell you anything about why it
happened - just that it was a general protection fault, and maybe a
segment related to it. To figure out what happened, you have to decode
the instruction stream. To decode the instruction stream, you have to
take the EIP pointer and read from it, right? Wrong. You have to
extract segment information from the code segment, apply segmentation
rules to the access, rule out invalid processor modes, deal with wrap
around conditions, etc. But lets say you do all that. Now, you have to
read the guest memory to decode. Which requires reading guest page
tables. The memory in question has to be mapped present and
executable. You have to deal conditionally with PAE / non-PAE paging
modes. And race conditions where self-modifying code can trick you into
decoding something that really didn't happen. Then, finally, you can
interpret the instruction, go through the whole process of reading guest
memory again (fortunately, this time, without segmentation), read the
guest IDT, and hook in your trap handlers where appropriate.
Yeah, it really is that bad, and that is really only a scratch on the
surface. Trap and emulate, while possible, is basically about as evil a
requirement as you can impose on a hypervisor. Everyone who is serious
about the market needs to support it in one form or another eventually,
but it raises the bar to such a point as to stop proliferation of
minimal hypervisors, which could make extremely useful research tools
for the community under an open source license.
Zach
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]