Re: i386 spinlock fairness: bizarre test results

Chuck Ebbert a écrit :

  After seeing Kirill's message about spinlocks I decided to do my own
testing with the userspace program below; the results were very strange.

  When using the 'mov' instruction to do the unlock I was able to reproduce
hogging of the spinlock by a single CPU even on Pentium II under some
conditions, while using 'xchg' always allowed the other CPU to get the
lock:


parent CPU 1, child CPU 0, using mov instruction for unlock
parent did: 34534 of 10000001 iterations
CPU clocks:     2063591864

parent CPU 1, child CPU 0, using xchg instruction for unlock
parent did: 5169760 of 10000001 iterations
CPU clocks:     2164689784

Hi Chuck

Thats interesting, because on my machines I get opposite results :

On a Xeon 2.0 GHz, Hyper Threading, I get better results (in the sense offairness) with the MOV version


   parent CPU 1, child CPU 0, using mov instruction for unlock
   parent did: 5291346 of 10000001 iterations
   CPU clocks:     3.437.053.244

   parent CPU 1, child CPU 0, using xchg instruction for unlock
   parent did: 7732349 of 10000001 iterations
   CPU clocks:     3.408.285.308

On a dual Opteron 248 (2.2GHz) machine, I get bad fairness results regardlessof MOV/XCHG (but less cycles per iteration). A lot of variations in theresults (maybe because of NUMA effects ?), but xchg gives slightly betterfairness.


   parent CPU 1, child CPU 0, using mov instruction for unlock
   parent did: 256810 of 10000001 iterations
   CPU clocks:      772.838.640

   parent CPU 1, child CPU 0, using xchg instruction for unlock
   parent did: 438280 of 10000001 iterations
   CPU clocks:     1.115.653.346

   parent CPU 1, child CPU 0, using xchg instruction for unlock
   parent did: 574501 of 10000001 iterations
   CPU clocks:     1.200.129.428


On a dual core Opteron 275 (2.2GHz), xchg is faster but unfair.

(threads running on the same physical CPU)
   parent CPU 1, child CPU 0, using mov instruction for unlock
   parent did: 4822270 of 10000001 iterations
   CPU clocks:      738.438.383

   parent CPU 1, child CPU 0, using xchg instruction for unlock
   parent did: 9702075 of 10000001 iterations
   CPU clocks:      561.457.724

Totally different results if affinity changed so that threads run on differentphysical cpus : (XCHG is slower)


   parent CPU 0, child CPU 2, using mov instruction for unlock
   parent did: 1081522 of 10000001 iterations
   CPU clocks:      508.611.273

   parent CPU 0, child CPU 2, using xchg instruction for unlock
   parent did: 4310427 of 10000000 iterations
   CPU clocks:     1.074.170.246

For reference, if only one thread is running, the MOV version is also fasteron both platforms :


[Xeon 2GHz]
   one thread, using mov instruction for unlock
   10000000 iterations
   CPU clocks:     1.278.879.528

[Xeon 2GHz]
   one thread, using xchg instruction for unlock
   10000000 iterations
   CPU clocks:     2.486.912.752

Of course Opterons are faster :) (less cycles per iteration)

[Opteron 248]
   one thread, using mov instruction for unlock
   10000000 iterations
   CPU clocks:      212.514.637

[Opteron 248]
   one thread, using xchg instruction for unlock
   10000000 iterations
   CPU clocks:      383.306.420

[Opteron 275]
   one thread, using mov instruction for unlock
   10000000 iterations
   CPU clocks:      208.472.009

[Opteron 275]
   one thread, using xchg instruction for unlock
   10000000 iterations
   CPU clocks:      417.502.675

In conclusion, I would say that for uncontended locks, MOV is faster. Forcontended locks, XCHG might be faster on some platforms (dual core Opterons,only if on the same physical CPU)


Eric

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

References:
- i386 spinlock fairness: bizarre test results
  - From: Chuck Ebbert <[email protected]>

Prev by Date: Kconfig question
Next by Date: Re: [PATCH] ppc highmem fix
Previous by thread: i386 spinlock fairness: bizarre test results
Next by thread: Re: i386 spinlock fairness: bizarre test results
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]