Re: [RFC] x86-64: Use SSE for copy_page and clear_page

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wednesday 01 June 2005 10:22, [email protected] wrote:
> Andi Kleen <[email protected]> writes:
> 
> > > Thus with "normal" page clear and "nt" page copy routines
> > > both clear and copy benchmarks run faster than with
> > > stock kernel, both with small and large working set.
> > > 
> > > Am I wrong?
> > 
> > fork is only a corner case. The main case is a process allocating
> > memory using brk/mmap and then using it.
> 
> Key point: "using it". This normally involves writes to memory. Most
> applications don't commonly read memory that they haven't previously
> written to. (valgrind et al call that behaviour a "bug" :).
> 
> Given that, I'd say you really don't want the page zero routines
> touching the cache.

Heh, good point.

However, it is valid only if program writes in every byte in a cacheline.
Then sufficiently smart CPU may avoid reading from main RAM.
(I am not sure that today's CPUs are smart enough. K6s were not)

If you have even one uninitialized byte (struct padding, etc) 
between bytes you write, CPU will have to do reads from main memory
in order to have cachelines with fully valid data.

Kernel compile did finish faster with nt stores, tho...
--
vda

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux