Re: [patch -mm] i386: use pte_update_defer in ptep_test_and_clear_{dirty,young}

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hugh Dickins wrote:
You're right to want to defer your pte_updates, David is right to want
to batch his TLB flushes.  It bothers me that you have a surprising case,
and that unless you abandon your optimization, it imposes a new constraint
on how to proceed in common code (without #ifdef'ing around).

But perhaps in this case David might concede that the longer we delay
the TLB flush, the more likely a referenced bit is to be missed - that is,
it gets cleared from the pte, but if that page is accessed again before
the TLB is flushed, the processor may well omit to reinstate the accessed
bit, and our stats drift away from reality.

Compromise patch below: would that be satisfactory to you, David?

Although I appreciate the heroics, you needn't do this on our account; the win of a couple thousand cycles is not worth the cost in complexity, IMHO, and the penalty on native quite potentially overshadows this. If you still issue the flush inside the spinlock, as required for this paravirt optimization, you are taking the risk of holding the spinlock an extra long time while issuing a TLB shootdown - which means waiting for an IPI.

It might not matter that much on i386, but on big iron (or realtime) systems, this could have significant negative scaling effects for workloads where the page page table was hot on some set of CPUs (say, remapping file pages for database access). In time, the benefits of this optimization to the hypervisor will decrease, while the benefits of optimizing the other way for shorter spinlock time may increase, both in a VM and on native hardware.

So I would rather just drop the pte_update_defer down to a pte_update if the flush is not immediately following - as it is nice and simply correct without getting in the way. I see there is more in this thread that I haven't read yet, so I preemptively reserve the right to issue an invalidation of this opinion...

Thanks,

Zach
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux