On Wednesday 01 June 2005 09:22, Denis Vlasenko wrote:
> On Tuesday 31 May 2005 16:59, Benjamin LaHaise wrote:
> > On Tue, May 31, 2005 at 11:23:58AM +0200, Andi Kleen wrote:
> > > fork is only a corner case. The main case is a process allocating
> > > memory using brk/mmap and then using it.
>
> I did the tests. I confirm Andi's conclusion that
> if you are going to use cleared/copied page immediately,
> nt stores are a loss.

For anyone interested, here are the results and a tarball with a patch
which allows switching routines on the fly, plus test programs/scripts.
You need to apply the patch, compile the programs (see testing/mk)
and run them (see testing/m).

n - normal stores, N - nt stores, n/N - nt stores for copy only

Zero 1 page (4k), x300000 times:
./zero1 4 300000

method............. times.................. clears/copies
slow:               0m2.094 0m2.143 0m2.143 1800225/332
mmx_APn:            0m1.931 0m2.036 0m2.053 1800207/316
mmx_APN:            0m3.458 0m3.476 0m3.487 1800212/335
mmx_APn/APN:        0m2.023 0m2.057 0m2.225 1800205/297

Zero 100 pages, x3000 times (zero all pages, read back all pages):
./zero1 400 3000

slow:               0m2.861 0m2.865 0m2.958 909218/323
mmx_APn:            0m2.626 0m2.715 0m2.783 909208/307
mmx_APN:            0m2.240 0m2.249 0m2.251 909212/294
mmx_APn/APN:        0m2.715 0m2.752 0m2.761 909220/324

Zero 100 pages, x3000 times (zero page, read back page, repeat for all pages):
./zero2 400 3000

slow:               0m1.711 0m1.734 0m1.739 909211/306
mmx_APn:            0m1.548 0m1.585 0m1.616 909208/303
mmx_APN:            0m2.088 0m2.102 0m2.104 909203/287
mmx_APn/APN:        0m1.589 0m1.603 0m1.617 909202/288

Andi is right. If we read pages back immediately, nt stores lose.
If we defer the reads, nt stores win.
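For readers without the tarball, the two access patterns compared above can be sketched roughly as follows. This is only an illustration, not the code from t.tar.bz2: the function names are made up, the 64-byte cache line stride is an assumption, and the sketch assumes that the first write to a fresh anonymous mapping is what makes the kernel hand out a cleared page.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: read one byte per (assumed) 64-byte cache line. */
static unsigned long read_page(const volatile unsigned char *page, size_t pagesz)
{
    unsigned long sum = 0;
    for (size_t i = 0; i < pagesz; i += 64)
        sum += page[i];
    return sum;
}

/* Hypothetical helper: map npages of fresh anonymous memory, NULL on error. */
static unsigned char *map_pages(size_t npages, size_t pagesz)
{
    void *p = mmap(NULL, npages * pagesz, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (unsigned char *)p;
}

/* Deferred pattern: fault in all pages first, then read all of them back. */
unsigned long touch_all_then_read(size_t npages)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *buf = map_pages(npages, pagesz);
    unsigned long sum = 0;

    if (!buf)
        return (unsigned long)-1;
    for (size_t i = 0; i < npages; i++)
        buf[i * pagesz] = 1;            /* first write faults the page in */
    for (size_t i = 0; i < npages; i++)
        sum += read_page(buf + i * pagesz, pagesz);
    munmap(buf, npages * pagesz);
    return sum;
}

/* Immediate pattern: fault in one page, read it back, go to the next. */
unsigned long touch_then_read_each(size_t npages)
{
    size_t pagesz = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *buf = map_pages(npages, pagesz);
    unsigned long sum = 0;

    if (!buf)
        return (unsigned long)-1;
    for (size_t i = 0; i < npages; i++) {
        buf[i * pagesz] = 1;            /* first write faults the page in */
        sum += read_page(buf + i * pagesz, pagesz);
    }
    munmap(buf, npages * pagesz);
    return sum;
}
```

Both functions return the number of pages touched (one marker byte per page, everything else reads back as zero), so the result only serves as a sanity check. Timing each one in a loop, e.g. touch_all_then_read(100) versus touch_then_read_each(100), would be the rough analogue of the 100-page zero1 and zero2 runs.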

Fork and copy 1 page, x150000 times:
./copy1 4 150000

slow:               0m4.464 0m4.487 0m4.495 1350240/900353
mmx_APn:            0m3.979 0m4.090 0m4.191 1350229/900327
mmx_APN:            0m5.685 0m5.722 0m5.744 1350229/900348
mmx_APn/APN:        0m4.730 0m4.826 0m4.975 1350221/900340

Fork and copy 100 pages, x1500 times (copy all pages, read back all copied pages):
./copy1 400 1500

slow:               0m2.561 0m2.568 0m2.576 14019/454824
mmx_APn:            0m2.230 0m2.237 0m2.242 14011/454804
mmx_APN:            0m1.841 0m1.865 0m1.876 14008/454791
mmx_APn/APN:        0m1.906 0m1.909 0m1.922 14010/454788

Fork and copy 100 pages, x1500 times (copy page, read back copied page, repeat for each page):
./copy2 400 1500

slow:               0m2.097 0m2.107 0m2.121 14020/454821
mmx_APn:            0m1.769 0m1.776 0m1.788 14008/454787
mmx_APN:            0m1.975 0m1.978 0m2.015 14035/454791
mmx_APn/APN:        0m1.979 0m2.009 0m2.017 14013/454810
--
vda
Attachment:
t.tar.bz2
Description: application/tbz