Re: [PATCH 6/13] KVM: memory slot management

Arnd Bergmann wrote:
On Friday 27 October 2006 07:47, Avi Kivity wrote:
Arnd Bergmann wrote:
- no need to preallocate memory that the guest doesn't actually use.
Well, a fully virtualized guest will likely use all the memory it gets. Linux certainly will.

Only if it does lots of disk accesses that load stuff into
page/inode/dentry cache. Single-application guests don't necessarily
do that.


Okay.  FWIW, you can demand allocate with other schemes as well.

- guest memory can be paged to disk.
- you can mmap files into multiple guests for fast communication
- you can mmap host files as backing store for guest block devices,
  including ext2 with the -o xip mount option to avoid double paging
What do you mean exactly? To respond to a block device read by mmap()ing the backing file into the pages the host requested?

(e.g. turn a host bio read into a guest mmap)

The idea would be to mmap the file into the guest real address space.
With -o xip, the page cache for the virtual device would basically
reside in that high address range.

Ah, I see what you mean now.  Like the "memory technology device" thing.



Guest users reading/writing files on it cause a memcopy between guest
user space and the host file mapping, done by the guest file system
implementation.

The interesting point here is how to handle a host page fault on the
file mapping. The solution on z/VM for this is to generate a special
exception for this that will be caught by the guest kernel, telling
it to wait until the page is there. The guest kernel can then put the
current thread to sleep and do something else, until a second exception
tells it that the page has been loaded by the host. The guest then
wakes up the sleeping thread.

This can work the same way for host file backed (guest block device)
and host anonymous (guest RAM) memory.
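
For illustration only, the two-exception handshake described above could be modelled roughly as follows. This is a toy Python simulation of the guest-side bookkeeping, not z/VM or KVM code; the names `GuestKernel`, `on_page_missing`, `on_page_ready` and `pick_next` are hypothetical.

```python
from collections import defaultdict, deque


class GuestKernel:
    """Toy model of a guest scheduler reacting to host page-fault notifications."""

    def __init__(self, threads):
        self.runnable = deque(threads)     # threads that may be scheduled
        self.waiting = defaultdict(list)   # page -> threads sleeping on that page

    def on_page_missing(self, thread, page):
        # First exception: the host is still loading `page`.
        # Put the faulting thread to sleep so another thread can run.
        self.runnable.remove(thread)
        self.waiting[page].append(thread)

    def on_page_ready(self, page):
        # Second exception: the host has loaded `page`; wake every sleeper.
        for thread in self.waiting.pop(page, []):
            self.runnable.append(thread)

    def pick_next(self):
        # Returns None when every thread is blocked on a host page-in.
        return self.runnable[0] if self.runnable else None
```

The same two callbacks would serve both cases mentioned above, since the guest does not need to know whether the missing page is file backed or anonymous on the host.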


Certainly something like that can be done, for paravirtualized guests.

If we allow the pages to be writable, the guest could write into the virtual block device just by modifying a read page (which might have been discarded and no longer related to the block device).

In your virtual mmu (or nested page table), you need to make sure that
the page is mapped with the intersection of the guest vm_prot and host
vm_prot into guest users.
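
A minimal sketch of that intersection rule, assuming protections are plain read/write/exec bit flags. The `Prot` type and `effective_prot` helper are made up for this example and do not correspond to kernel definitions.

```python
from enum import IntFlag


class Prot(IntFlag):
    NONE = 0
    READ = 1
    WRITE = 2
    EXEC = 4


def effective_prot(guest_prot: Prot, host_prot: Prot) -> Prot:
    # Grant only the rights that both the guest mapping and the host mapping allow.
    return guest_prot & host_prot
```

With this rule, a page the host maps read-only stays read-only for guest users even if the guest page tables ask for write access.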


Yes.  My comment was based on an incorrect understanding of your suggestion.

2. The next mmu implementation, which caches guest translations.

The potential problem above now becomes acute. The guest will have kernel mappings for every page, and after a short while they'll all be faulted in and locked. This defeats the swap integration which is IMO a very strong point.

We can work around that by periodically forcing out translations (some kind of clock algorithm) at some rate so the host vm can have a go at them. That can turn out to be expensive as we'll need to interrupt all running vcpus to flush (real) tlb entries.
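
As a rough sketch of what "some kind of clock algorithm" could look like, here is a second-chance sweep over cached translations in Python. It is an assumption about one possible policy, not the planned implementation; `ClockEvictor` and its methods are hypothetical names, and the referenced bit is simulated rather than read from hardware.

```python
class ClockEvictor:
    """Second-chance (clock) sweep over cached guest translations."""

    def __init__(self):
        self.entries = []   # each entry is [guest_virtual_page, referenced_bit]
        self.hand = 0       # current position of the clock hand

    def insert(self, gv_page):
        # A freshly cached translation counts as recently used.
        self.entries.append([gv_page, True])

    def touch(self, gv_page):
        # Simulate the guest using the translation again.
        for entry in self.entries:
            if entry[0] == gv_page:
                entry[1] = True

    def sweep(self, budget):
        """Examine up to `budget` entries; return pages whose translations were dropped."""
        evicted = []
        for _ in range(budget):
            if not self.entries:
                break
            self.hand %= len(self.entries)
            page, referenced = self.entries[self.hand]
            if referenced:
                # Recently used: clear the bit and give it a second chance.
                self.entries[self.hand][1] = False
                self.hand += 1
            else:
                # Not used since the last pass: drop it so the host vm can reclaim the page.
                evicted.append(page)
                del self.entries[self.hand]
        return evicted
```

Every non-empty result from `sweep` is where the cost mentioned above shows up, since the dropped translations still have to be flushed from the real tlbs of running vcpus.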

Don't understand. Can't one CPU cause a TLB entry to be flushed on all
CPUs?


It's not about tlb entries. The shadow page tables collapse a GV -> HV -> HP double translation into a GV -> HP page table. When the Linux vm goes around evicting pages, it invalidates those mappings.

There are two solutions possible: lock pages which participate in these translations (and their number can be large) or modify the Linux vm to consult a reverse mapping and remove the translations (in which case TLB entries need to be removed).
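
To make the second option concrete, here is a small Python model of a shadow table with a reverse mapping. It is a sketch under simplifying assumptions (page tables are flat dictionaries, one translation per page), and `ShadowMMU`, `translate` and `host_evict` are hypothetical names rather than existing kernel interfaces.

```python
class ShadowMMU:
    """Caches collapsed GV -> HP translations and tracks them per host page."""

    def __init__(self, guest_pt, host_pt):
        self.guest_pt = guest_pt   # GV -> HV, controlled by the guest
        self.host_pt = host_pt     # HV -> HP, controlled by the host vm
        self.shadow = {}           # collapsed GV -> HP translations
        self.rmap = {}             # HV -> set of GV pages that depend on it

    def translate(self, gv):
        if gv not in self.shadow:
            hv = self.guest_pt[gv]
            self.shadow[gv] = self.host_pt[hv]
            # Record the dependency so the host can find this entry later.
            self.rmap.setdefault(hv, set()).add(gv)
        return self.shadow[gv]

    def host_evict(self, hv):
        """Host vm reclaims the page behind `hv`; returns GV pages whose TLB entries must go."""
        self.host_pt.pop(hv, None)
        stale = self.rmap.pop(hv, set())
        for gv in stale:
            del self.shadow[gv]
        return stale
```

Without the `rmap` table the host would have no way to find the shadow entries for a page it wants to evict, which is why the alternative is to keep those pages locked.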

b. we need to hide the userspace portion of the monitor from the guest physical address space

That depends on your trust model. You could simply say that you expect
the guest real mode to have the same privileges as the host application
(your monitor), and not care if a guest can shoot itself in the foot
by overwriting the monitor.

It can shoot not only its foot, but anything the monitor's uid has access to. Host files, the host network, other guests belonging to the user, etc.

  c.  we need to extend host tlb invalidations to invalidate tlbs on guests

I don't understand much about the x86 specific memory management,
but shouldn't a TLB invalidate of a given page do the right thing
on all CPUs, even if they are currently running a guest?
It's worse than I thought: tlb entries generated by guest accesses are tagged with the guest virtual address, so if you remove a guest physical/host virtual page you need to invalidate the entire guest tlb.


--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.
