Re: [PATCH 1/4] Pte drop ptep_get_and_clear paravirt op.patch

Jeremy Fitzhardinge wrote:

Zachary Amsden wrote:

In shadow mode hypervisors, ptep_get_and_clear achieves the desired
purpose of keeping the shadows in sync by issuing a native_get_and_clear,
followed by a call to pte_update, which indicates the PTE has been
modified.

Direct mode hypervisors (Xen) have no need for this anyway, and will trap
the update using writable pagetables.
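
As a rough illustration of the two behaviours described above, here is a minimal user-space sketch. It is not the kernel code: `pte_t` is a hypothetical stand-in, `shadow_pte_update` and `nop_pte_update` are made-up names, the real pte_update takes more arguments, and the real clear is done atomically. It only shows that the shadow case is "native clear, then notify", while the direct case leaves the notification as a no-op.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's pte_t; not the real definition. */
typedef struct { uint64_t pte; } pte_t;

/* Counts notifications in this mock so the effect is observable. */
static int shadow_notifications;

/* Plain read-and-clear. The kernel does this atomically; this mock does not. */
static pte_t native_ptep_get_and_clear(pte_t *ptep)
{
    pte_t old = *ptep;
    ptep->pte = 0;
    return old;
}

/* Shadow-mode flavour (assumed name): tell the hypervisor the PTE changed. */
static void shadow_pte_update(pte_t *ptep)
{
    (void)ptep;
    shadow_notifications++;
}

/* Direct-mode flavour (assumed name): nothing to do. */
static void nop_pte_update(pte_t *ptep)
{
    (void)ptep;
}

/* Simplified hook; defaults to the no-op variant. */
static void (*pte_update)(pte_t *ptep) = nop_pte_update;

/* Native clear followed by the (possibly empty) update notification. */
static pte_t ptep_get_and_clear(pte_t *ptep)
{
    pte_t old = native_ptep_get_and_clear(ptep);
    pte_update(ptep);
    return old;
}
```

With `pte_update` left at `nop_pte_update`, the wrapper costs no more than the native operation plus one indirect call, which is the point made about direct mode.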

Jan Beulich just posted a patch for Xen/linux which specializes
ptep_get_and_clear, which apparently improves kernel-compile performance
by 25-30% on some configurations.  I think we'll want to keep this.

Are you sure that it still wins even with these patches? I can't see
ptep_get_and_clear getting much faster than a pure non-emulated
operation, and the stress case which wins 25-30% is fork/exit being
able to drop the pte updates in zap_pte_range from trapping and/or
going into a pte update queue. That is what my patches addressed for
shadow paging performance - the call to pte_update can be dropped - but
you should already be getting this benefit for Xen, since there
pte_update is a nop. Are you sure you are unpinning the page tables and
mapping them back as writable pages prior to address space destruction?

Or is there another case which exercises zap_pte_range via munmap? In
which case, you might be able to do things faster than trap and emulate
with a hypercall, but a 25-30% speedup sounds rather para-normal for
this type of change. I would like to see a patch and some basic data
showing the speedup if possible.

I certainly have no objection to adding back the ptep_get_and_clear
hook, but we need to fix the naming to be consistent (raw vs. native),
which is one thing the patch (dropping the hook) is trying to do. We can
always re-add specialization for ptep_get_and_clear; the question is
what level is appropriate. Do you allow ptep_get_and_clear itself to be
redefined, or allow redefinition of native_ptep_get_and_clear?

One line of reasoning - for consistency, it seems that
native_ptep_get_and_clear should be just that, a non-paravirtualized pte
update. Then it follows we should not allow native_ptep_get_and_clear
to be overridden. In this case, for cleanliness it is better to add a
mid-level indirection: use raw_ptep_get_and_clear as the paravirt hook,
which is called from ptep_get_and_clear, and allow access to the
non-paravirtualized native functions with native_ptep_get_and_clear. If
that sounds agreeable, then this patch should be backed out. I think
backing it out will damage the immediately following patches due to
sensitivity from code proximity.
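
To make the proposed layering concrete, here is a small user-space sketch of the three levels. Everything except the three function names from the paragraph above is an assumption for illustration: `pte_t` is a stand-in type, and `struct pv_hooks_sketch`, `pv_hooks`, `demo_hv_ptep_get_and_clear` and `layering_demo` are hypothetical names, not existing kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's pte_t; not the real definition. */
typedef struct { uint64_t pte; } pte_t;

/* Bottom level: never overridden, always the plain operation.
 * The kernel does this atomically; this mock does not. */
static pte_t native_ptep_get_and_clear(pte_t *ptep)
{
    pte_t old = *ptep;
    ptep->pte = 0;
    return old;
}

/* Middle level: the hook a hypervisor backend may replace (assumed layout). */
struct pv_hooks_sketch {
    pte_t (*raw_ptep_get_and_clear)(pte_t *ptep);
};

/* Defaults to the native operation when no backend registers itself. */
static struct pv_hooks_sketch pv_hooks = {
    .raw_ptep_get_and_clear = native_ptep_get_and_clear,
};

/* Top level: what generic mm code calls; it only dispatches to the hook. */
static pte_t ptep_get_and_clear(pte_t *ptep)
{
    return pv_hooks.raw_ptep_get_and_clear(ptep);
}

/* Hypothetical backend: counts calls in place of issuing a hypercall,
 * and can still reach the untouched native helper. */
static int demo_hypercalls;

static pte_t demo_hv_ptep_get_and_clear(pte_t *ptep)
{
    demo_hypercalls++;
    return native_ptep_get_and_clear(ptep);
}

/* Installs the backend, clears one entry, and returns the call count. */
static int layering_demo(void)
{
    pte_t p = { 0x1001 };

    pv_hooks.raw_ptep_get_and_clear = demo_hv_ptep_get_and_clear;
    pte_t old = ptep_get_and_clear(&p);
    assert(old.pte == 0x1001 && p.pte == 0);
    return demo_hypercalls;
}
```

The design point is that a backend overrides only the middle hook, so native_ptep_get_and_clear keeps meaning "the non-paravirtualized operation" wherever it is called directly.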

Maybe the best strategy is to just re-add ptep_get_and_clear as a
paravirt-op after the whole set of patches and before Jan's speedup
patch?

Zach

Follow-Ups:
- Re: [PATCH 1/4] Pte drop ptep_get_and_clear paravirt op.patch
  - From: Jeremy Fitzhardinge <jeremy@goop.org>

References:
- [PATCH 1/4] Pte drop ptep_get_and_clear paravirt op.patch
  - From: Zachary Amsden <zach@vmware.com>
- Re: [PATCH 1/4] Pte drop ptep_get_and_clear paravirt op.patch
  - From: Jeremy Fitzhardinge <jeremy@goop.org>
