Re: [PATCH] Add I/O hypercalls for i386 paravirt

Andi Kleen wrote:

> How is that measured? In a loop? In the same pipeline state?
>
> It seems a little dubious to me.

I did the experiments in a controlled environment, with interrupts disabled and care taken to get the pipeline into the same state. It was a perfectly repeatable experiment. I don't have the exact cycle times anymore, but they were the tightest measurements on cycle counts I've ever seen, because of the unique nature of serializing the processor for the fault / privilege transition. I tested a variety of conditions, including different types of #GP (yes, the cost does vary), #NP, #PF, sysenter, and int $0xxx. Sysenter was the fastest, by far. Int was about 5x the cost. #GP and friends all had similar costs. #PF was the most expensive.
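For concreteness, a minimal sketch of this kind of serialized measurement (the helper names and the int $0x80 / getpid choice are illustrative, not the original harness):

#include <stdint.h>

/* CPUID is a serializing instruction: it drains the pipeline so every
 * measurement starts from the same machine state. */
static inline void serialize(void)
{
        uint32_t eax = 0, ebx, ecx, edx;

        asm volatile("cpuid"
                     : "+a"(eax), "=b"(ebx), "=c"(ecx), "=d"(edx)
                     : : "memory");
}

static inline uint64_t serialized_rdtsc(void)
{
        uint32_t lo, hi;

        serialize();
        asm volatile("rdtsc" : "=a"(lo), "=d"(hi));
        return ((uint64_t)hi << 32) | lo;
}

/* Time one int $0x80 round trip using a near-trivial syscall.
 * Interrupts are assumed to be disabled already. */
static uint64_t time_int80(void)
{
        long nr = 20;                   /* __NR_getpid on i386 */
        uint64_t t0, t1;

        t0 = serialized_rdtsc();
        asm volatile("int $0x80" : "+a"(nr) : : "memory");
        t1 = serialized_rdtsc();
        return t1 - t0;
}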


>>>> to verify protection in the page tables mapping the page allows
>>>> execution (P, !NX, and U/S check). This is a lot more expensive than a
>>> When the page is not executable or not present you get #PF not #GP. So
>>> the hardware already checks that.
>>>
>>> The only case where you would need to check yourself is if you emulate
>>> NX on non NX capable hardware, but I can't see you doing that.
>> No, it doesn't. Between the #GP and decode, you have an SMP race where
>> another processor can rewrite the instruction.
> That can be ignored imho. If the page goes away you'll notice
> when you handle the page fault on read. If it becomes NX then the execution
> just happened to be logically a little earlier.


No, you can't ignore it. The page protections won't change between the #GP and the decoder execution, but the instruction can, causing you to decode into the next page where the processor would not have. !P becomes obvious, but failure to respect NX or U/S is an exploitable bug. Put a one-byte instruction at the end of a page that borders an NX (or supervisor) page. From another processor, keep switching that byte between the instruction and a segment-override prefix.

Result: the user executes an instruction on a supervisor code page, learning data as a result; code on an NX page gets executed.
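For concreteness, a hypothetical sketch of the attacker's half of the race (the byte values and names are illustrative):

/* Last byte of an executable (and writable) page whose successor
 * page is NX or supervisor-only. */
volatile unsigned char *last_byte;

void attacker_cpu(void)
{
        for (;;) {
                /* A complete one-byte instruction: a post-#GP software
                 * decoder that sees this byte stops at the page end,
                 * just like the hardware did. */
                *last_byte = 0x90;      /* NOP */

                /* A segment-override prefix is not a complete
                 * instruction, so the software decoder keeps going into
                 * the next page, i.e. bytes the hardware decoder was
                 * forbidden to fetch. */
                *last_byte = 0x3e;      /* DS override prefix */
        }
}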

> Or easier to just write a backend for the lguest virtio drivers,
> that will be likely faster in the end anyways than this gross
> hack.

We already have drivers for all of our hardware in Linux. Most of the hardware we emulate models real physical hardware, and there are no virtual drivers for it. Should we take the BusLogic driver and "paravirtualize" it by adding VMI hypercalls? We might benefit from it, but would the BusLogic driver? It sets a nasty precedent for maintenance as different hypervisors and emulators hack up different drivers for their own performance.

Our SCSI and IDE emulation, and thus the drivers used by Linux, are pretty much fixed in stone; we are not going to change a tricky hardware interface into a virtual one, as that is simply too risky for something as critical as storage. We might be able to move our network driver over to virtio, but that is not a short-term prospect either.

There is great advantage in talking to our existing device layer faster, and this is something that is valuable today.

> Really LinuxHAL^wparavirt ops is already so complicated that
> any new hooks need an extremely good justification and that is
> just not here for this.
>
> We can add it if you find an equivalent number of hooks
> to eliminate.

Interesting trade. What if I sanitized the whole mess of I/O macros into something fun and friendly:

native_port_in(int port, iosize_t opsize, int delay)
native_port_out(int port, iosize_t opsize, u32 output, int delay)
native_port_string_in(int port, void *ptr, iosize_t opsize, unsigned count, int delay)
native_port_string_out(int port, void *ptr, iosize_t opsize, unsigned count, int delay)

Then we can be rid of all the macro goo in io.h, which frightens my mother. We might even be able to get rid of the umpteen different places where drivers wrap iospace access with their own byte / word / long functions so they can switch between port I/O and memory-mapped I/O, by moving it all into common infrastructure.
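As a strawman, the native backend might look something like this (iosize_t, the IO_SIZE_* values, and the delay convention are assumptions, not existing interfaces; the in/out asm follows the current io.h pattern):

#include <linux/types.h>

typedef enum { IO_SIZE_8 = 1, IO_SIZE_16 = 2, IO_SIZE_32 = 4 } iosize_t;

static inline u32 native_port_in(int port, iosize_t opsize, int delay)
{
        u32 val = 0;

        switch (opsize) {
        case IO_SIZE_8: {
                u8 v;
                asm volatile("inb %w1, %0" : "=a"(v) : "Nd"(port));
                val = v;
                break;
        }
        case IO_SIZE_16: {
                u16 v;
                asm volatile("inw %w1, %0" : "=a"(v) : "Nd"(port));
                val = v;
                break;
        }
        case IO_SIZE_32:
                asm volatile("inl %w1, %0" : "=a"(val) : "Nd"(port));
                break;
        }
        if (delay)
                asm volatile("outb %al, $0x80");  /* traditional bus delay */
        return val;
}

/* The io.h macro goo then collapses to trivial wrappers, e.g.: */
#define inb(port)       ((u8)native_port_in((port), IO_SIZE_8, 0))
#define inb_p(port)     ((u8)native_port_in((port), IO_SIZE_8, 1))

The opsize switch folds away whenever the size is a compile-time constant, so this should cost nothing over the current macros, and a hypervisor can hook the four functions in one place.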

We could make similar (unwelcome?) advances on the pte functions if it were not for the regrettable disconnect between pte_high / pte_low and the rest. Perhaps if it were hidden in macros?
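For example, a minimal sketch of what hiding the split could look like (pte_val_whole / set_pte_whole are hypothetical names; kernel types and smp_wmb() are assumed available; the high-before-low store order mirrors what PAE set_pte needs):

#ifdef CONFIG_X86_PAE
typedef struct { u32 pte_low, pte_high; } pte_t;

static inline u64 pte_val_whole(pte_t pte)
{
        return ((u64)pte.pte_high << 32) | pte.pte_low;
}

static inline void set_pte_whole(pte_t *ptep, u64 val)
{
        /* Write the high word before the low word, which carries the
         * present bit, so a half-written entry is never visible. */
        ptep->pte_high = val >> 32;
        smp_wmb();
        ptep->pte_low = val;
}
#endif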

Zach