Re: [patch 04/11] mutex subsystem, add include/asm-x86_64/mutex.h

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Ingo Molnar a écrit :
add the x86_64 version of mutex.h, optimized in assembly.
+/**
+ * __mutex_fastpath_lock - decrement and call function if negative
+ * @v: pointer of type atomic_t
+ * @fn: function to call if the result is negative
+ *
+ * Atomically decrements @v and calls <fn> if the result is negative.
+ */
+#define __mutex_fastpath_lock(v, fn_name)				\
+do {									\
+	/* type-check the function too: */				\
+	fastcall void (*__tmp)(atomic_t *) = fn_name;			\
+	unsigned long dummy;						\
+									\
+	(void)__tmp;							\
+	typecheck(atomic_t *, v);					\
+									\
+	__asm__ __volatile__(						\
+		LOCK	"   decl (%%rdi)	\n"			\
+			"   js 2f		\n"			\
+			"1:			\n"			\
+									\
+		LOCK_SECTION_START("")					\
+			"2: call "#fn_name"	\n"			\
+			"   jmp 1b		\n"			\
+		LOCK_SECTION_END					\
+									\
+		:"=D" (dummy)						\
+		: "D" (v)						\
+		: "rax", "rsi", "rdx", "rcx",				\
+		  "r8", "r9", "r10", "r11", "memory");			\
+} while (0)

Hi Ingo

I do think this assembly is not very fair.
It has an *insane* register pressure for the compiler :
The fast path is thus not so fast.

Compare with the include/asm-x86_64/semaphore.h
        __asm__ __volatile__(
                "# atomic down operation\n\t"
                LOCK "decl %0\n\t"     /* --sem->count */
                "js 2f\n"
                "1:\n"
                LOCK_SECTION_START("")
                "2:\tcall __down_failed\n\t"
                "jmp 1b\n"
                LOCK_SECTION_END
                :"=m" (sem->count)
                :"D" (sem)
                :"memory");

Only one register is mandatory (%rdi), instead of nine.

Two solutions :

(This one has no register constraint, but the slowpath is 'long')
It also requires the mutex is not on the stack.

#define PUSH_SCRATCH "push %%rdi; push %%rax; push %%rsi;"  \
		     "push %%rdx; push %%rcx; push %%r8;" \
		     "push %%r9; push %%r10; push %%r11\n"

#define POP_SCRATCH  "pop %%r11; pop %%r10; pop %%r9;"    \
	             "pop %%r8; pop %%rcx; pop %%rdx;"     \
                     "pop %%rsi; pop %%rax"; pop %%rdi\n"

 	__asm__ __volatile__(					\
 		LOCK	"   decl %0	\n"		\
 			"   js 2f		\n"		\
 			"1:			\n"		\
 								\
 		LOCK_SECTION_START("")				\
                       "2:" PUSH_SCRATCH                        \
			"lea %0,%%rdi            \n"            \
 			"call "#fn_name"	\n"             \
			POP_SCRATCH				\
 			"   jmp 1b		\n"		\
 		LOCK_SECTION_END				\
 								\
 		:"=m" (v->count)					\
 		: "m" (v->count)					\
 		: "memory");			\


Or call a wrapper that does the PUSH/POP thing.

Thank you
Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux