* Andrew Morton <[email protected]> wrote:
> > > spin_lock is still uninlined.
> >
> > yes, and that should stay so i believe, for text size reasons. The BTB
> > should eliminate most effects of the call+ret itself.
>
> The old
>
> lock; decb
> js <different section>
> ...
>
> was pretty good.
yes, but that's 4-7+6==10-13 bytes of inline footprint, compared to
fixed 5 bytes. That gives quite some icache footprint with thousands of
call sites.
> > hm, with my patch, write_unlock should be inlined too.
>
> So it is. foo_unlock_irq() isn't though.
yes, that one should probably be inlined too, it's just 1 byte longer,
still the network-effects on register allocations give a net win:
text data bss dec hex filename
4072031 858208 387196 5317435 51233b vmlinux-smp-uninlined
4060671 858212 387196 5306079 50f6df vmlinux-smp-inlined
4058543 858212 387196 5303951 50ee8f vmlinux-irqop-inlined-too
another 0.05% drop in text size. Add-on patch below, it is against -rc5
plus prev_spinlock_patch. Boot-tested on 4-way x86 SMP. The box crashed
and burned. Joking.
Ingo
include/linux/spinlock.h | 16 +++++++++++++---
1 files changed, 13 insertions(+), 3 deletions(-)
Index: linux/include/linux/spinlock.h
===================================================================
--- linux.orig/include/linux/spinlock.h
+++ linux/include/linux/spinlock.h
@@ -184,19 +184,29 @@ extern int __lockfunc generic__raw_read_
# define write_unlock(lock) __raw_write_unlock(&(lock)->raw_lock)
#endif
+#if defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
+# define spin_unlock_irq(lock) _spin_unlock_irq(lock)
+# define read_unlock_irq(lock) _read_unlock_irq(lock)
+# define write_unlock_irq(lock) _write_unlock_irq(lock)
+#else
+# define spin_unlock_irq(lock) \
+ do { __raw_spin_unlock(&(lock)->raw_lock); local_irq_enable(); } while (0)
+# define read_unlock_irq(lock) \
+ do { __raw_read_unlock(&(lock)->raw_lock); local_irq_enable(); } while (0)
+# define write_unlock_irq(lock) \
+ do { __raw_write_unlock(&(lock)->raw_lock); local_irq_enable(); } while (0)
+#endif
+
#define spin_unlock_irqrestore(lock, flags) \
_spin_unlock_irqrestore(lock, flags)
-#define spin_unlock_irq(lock) _spin_unlock_irq(lock)
#define spin_unlock_bh(lock) _spin_unlock_bh(lock)
#define read_unlock_irqrestore(lock, flags) \
_read_unlock_irqrestore(lock, flags)
-#define read_unlock_irq(lock) _read_unlock_irq(lock)
#define read_unlock_bh(lock) _read_unlock_bh(lock)
#define write_unlock_irqrestore(lock, flags) \
_write_unlock_irqrestore(lock, flags)
-#define write_unlock_irq(lock) _write_unlock_irq(lock)
#define write_unlock_bh(lock) _write_unlock_bh(lock)
#define spin_trylock_bh(lock) __cond_lock(_spin_trylock_bh(lock))
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
- References:
- Re: [PATCH] i386 spinlocks should use the full 32 bits, not only 8 bits
- Re: [PATCH] i386 spinlocks should use the full 32 bits, not only 8 bits
- Re: [PATCH] i386 spinlocks should use the full 32 bits, not only 8 bits
- Re: [PATCH] i386 spinlocks should use the full 32 bits, not only 8 bits
- Re: [PATCH] i386 spinlocks should use the full 32 bits, not only 8 bits
- Re: [PATCH] i386 spinlocks should use the full 32 bits, not only 8 bits
- Re: [PATCH] i386 spinlocks should use the full 32 bits, not only 8 bits
- Re: [PATCH] i386 spinlocks should use the full 32 bits, not only 8 bits
- Re: [PATCH] i386 spinlocks should use the full 32 bits, not only 8 bits
- Re: [PATCH] i386 spinlocks should use the full 32 bits, not only 8 bits
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]