Re: [announce] [patch] KVM paravirtualization for Linux

Ingo Molnar wrote:

* Avi Kivity <avi@qumranet.com> wrote:
but AFAICS rmap_write_protect() is only ever called if we write a newcr3 - hence a TLB flush will happen anyway, because we do avmcs_writel(GUEST_CR3, new_cr3). Am i missing something?
No, rmap_write_protect() is called whenever we shadow a new guest pagetable (the mechanism used to maintain coherency is write faults onpage tables).
hm. Where? I only see rmap_write_protect() called fromkvm_mmu_get_page(), which is only called from nonpaging_map() [but itdoesnt call the rmap function in that case, due to metaphysical], andmmu_alloc_roots(). mmu_alloc_roots() is only called from context init(init-only thing) or from paging_new_cr3().
ahh ... i missed paging_tmpl.h.

how about the patch below then?

Looks like a lot of complexity for very little gain. I'm not sure whatthe vmwrite cost is, cut it can't be that high compared to vmexit.

I think there are two better options:

1. Find out if tlb_flush() can be implemented as a no-op on intel --I'm fairly sure it can.2. If not, implement tlb_flush() as a counter increment like on amd.Then, successive tlb flushes and context switches are folded into one.

furthermore, shouldnt we pass in the flush area size from the pagefaulthandler? In most cases it would be 4096 bytes, that would mean an invlpgis enough, not a full cr3 flush. Although ... i dont know how to invlpga guest-side TLB. On VMX it probably doesnt matter at all. On SVM, isthere a way (or a need) to flush a single TLB of another context ID?

Yes (and yes), invlpga.

A complication is that a single shadow pte can be used to map multipleguest virtual addresses (by mapping the page table using multiple pdes),so the kvm_mmu_page object has to keep track of the page table gva, andwhether it is multiply mapped or not. I plan to do that later.

	Ingo

Index: linux/drivers/kvm/mmu.c
===================================================================
--- linux.orig/drivers/kvm/mmu.c
+++ linux/drivers/kvm/mmu.c

@@ -378,7 +378,7 @@ static void rmap_remove(struct kvm_vcpu}

-static void rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn)

+static void rmap_write_protect(struct kvm_vcpu *vcpu, u64 gfn, int flush_tlb)
 {
 	struct kvm *kvm = vcpu->kvm;
 	struct page *page;
@@ -404,7 +404,13 @@ static void rmap_write_protect(struct kv
 		BUG_ON(!(*spte & PT_WRITABLE_MASK));
 		rmap_printk("rmap_write_protect: spte %p %llx\n", spte, *spte);
 		rmap_remove(vcpu, spte);
-		kvm_arch_ops->tlb_flush(vcpu);
+		/*
+		 * While we removed a mapping there's no need to explicitly
+		 * flush the TLB here if we write a new cr3. It is needed
+		 * from the pagefault path though.
+		 */
+		if (flush_tlb)
+			kvm_arch_ops->tlb_flush(vcpu);
 		*spte &= ~(u64)PT_WRITABLE_MASK;
 	}
 }
@@ -561,7 +567,8 @@ static struct kvm_mmu_page *kvm_mmu_get_
 					     gva_t gaddr,
 					     unsigned level,
 					     int metaphysical,
-					     u64 *parent_pte)
+					     u64 *parent_pte,
+					     int flush_tlb)
 {
 	union kvm_mmu_page_role role;
 	unsigned index;
@@ -597,7 +604,7 @@ static struct kvm_mmu_page *kvm_mmu_get_
 	page->role = role;
 	hlist_add_head(&page->hash_link, bucket);
 	if (!metaphysical)
-		rmap_write_protect(vcpu, gfn);
+		rmap_write_protect(vcpu, gfn, flush_tlb);
 	return page;
 }

@@ -764,7 +771,7 @@ static int nonpaging_map(struct kvm_vcpu

 				>> PAGE_SHIFT;
 			new_table = kvm_mmu_get_page(vcpu, pseudo_gfn,
 						     v, level - 1,
-						     1, &table[index]);
+						     1, &table[index], 0);
 			if (!new_table) {
 				pgprintk("nonpaging_map: ENOMEM\n");
 				return -ENOMEM;
@@ -838,7 +845,7 @@ static void mmu_alloc_roots(struct kvm_v

ASSERT(!VALID_PAGE(root));

 		page = kvm_mmu_get_page(vcpu, root_gfn, 0,
-					PT64_ROOT_LEVEL, 0, NULL);
+					PT64_ROOT_LEVEL, 0, NULL, 0);
 		root = page->page_hpa;
 		++page->root_count;
 		vcpu->mmu.root_hpa = root;
@@ -857,7 +864,7 @@ static void mmu_alloc_roots(struct kvm_v
 			root_gfn = 0;
 		page = kvm_mmu_get_page(vcpu, root_gfn, i << 30,
 					PT32_ROOT_LEVEL, !is_paging(vcpu),
-					NULL);
+					NULL, 0);
 		root = page->page_hpa;
 		++page->root_count;
 		vcpu->mmu.pae_root[j][i] = root | PT_PRESENT_MASK;
Index: linux/drivers/kvm/paging_tmpl.h
===================================================================
--- linux.orig/drivers/kvm/paging_tmpl.h
+++ linux/drivers/kvm/paging_tmpl.h
@@ -245,7 +245,7 @@ static u64 *FNAME(fetch)(struct kvm_vcpu
 			table_gfn = walker->table_gfn[level - 2];
 		}
 		shadow_page = kvm_mmu_get_page(vcpu, table_gfn, addr, level-1,
-					       metaphysical, shadow_ent);
+					       metaphysical, shadow_ent, 1);
 		shadow_addr = shadow_page->page_hpa;
 		shadow_pte = shadow_addr | PT_PRESENT_MASK | PT_ACCESSED_MASK
 			| PT_WRITABLE_MASK | PT_USER_MASK;


--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: [announce] [patch] KVM paravirtualization for Linux
  - From: Ingo Molnar <mingo@elte.hu>

References:
- [announce] [patch] KVM paravirtualization for Linux
  - From: Ingo Molnar <mingo@elte.hu>
- Re: [announce] [patch] KVM paravirtualization for Linux
  - From: Avi Kivity <avi@qumranet.com>
- Re: [announce] [patch] KVM paravirtualization for Linux
  - From: Ingo Molnar <mingo@elte.hu>
- Re: [announce] [patch] KVM paravirtualization for Linux
  - From: Avi Kivity <avi@qumranet.com>
- Re: [announce] [patch] KVM paravirtualization for Linux
  - From: Ingo Molnar <mingo@elte.hu>
- Re: [announce] [patch] KVM paravirtualization for Linux
  - From: Avi Kivity <avi@qumranet.com>
- Re: [announce] [patch] KVM paravirtualization for Linux
  - From: Ingo Molnar <mingo@elte.hu>

Prev by Date: Re: [PATCH] include/linux/slab.h: new KFREE() macro.
Next by Date: [ANNOUNCE] Guilt 0.17
Previous by thread: Re: [announce] [patch] KVM paravirtualization for Linux
Next by thread: Re: [announce] [patch] KVM paravirtualization for Linux
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]