On Mon, 30 May 2005, dean gaudet wrote:
> On Mon, 30 May 2005, Benjamin LaHaise wrote:
>
> > Below is a patch that uses 128 bit SSE instructions for copy_page and
> > clear_page. This is an improvement on P4 systems as can be seen by
> > running the test program at http://www.kvack.org/~bcrl/xmm64.c to get
> > results like:
>
> it looks like the patch uses SSE2 instructions (pxor, movdqa, movntdq)...
> if you use xorps, movaps, movntps then it works on SSE processors as well.
oh and btw... on x86-64 you might want to look at using movnti with 64-bit
registers... the memory datapath on these processors is actually 64-bits
wide, and the 128-bit stores are broken into two 64-bit pieces internally
anyhow. the advantage of using movnti over movntdq/movntps is that you
don't have to save/restore the xmm register set.
-dean
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]