[patch 00/10] [RFC] SLUB patches for more functionality, performance and maintenance

This series here contains a number of patches that need some discussion or
evaluation. They apply on top of 2.6.22-rc6-mm1 + slub patches already
in mm.

1. Page allocator pass through

SLOB does pass through all larger kmalloc requests directly to the page
allocator. The advantage is that the allocator overhead is eliminated
and that we do not need to provide kmalloc slabs >= PAGE_SIZE.
This patch does the same thing in SLUB.

For allocation whose size is know the call to the slab may be converted to
a call to the page allocator at compile time.

The result of doing so is also that the behavior of SLUB for pagesize
and higher kmalloc allocation will conform to SLAB. Meaning large
kmalloc allocations are no longer debuggable and are guaranteed to be
page aligned.

Do we want this?

2. A series of performance enhancements patches

The patches improve the producer / consumer scenario. If objects are
always allocated on one processor and released on another then both
will use distinct cachelines to store their information in order to
avoid a bouncing cacheline.

In order to do so we have to introduce a per cpu structure to keep
per cpu allocation lists in distinct cachelines from the remote free
information in the page struct. If we introduce a per cpu structure
then we also need to allocate that in a NUMA aware fashion from the
local node.

Having a per cpu structure allows to avoid the use of certain fields
in the page struct which in turn allows us to avoid using page->mapping
and increasing the maximum number of objects per slab. More optimizations
become possible by shifting information from the kmem_cache structure
that is used in the hotpath to the per cpu structure thereby minimizing
cacheline coverage.

Finally there is an implementation of slab_alloc/slab_free using a cmpxchg
instead of current interrupt enable disable approach. This was inspired by
LTTng's approach. A cmpxchg is less costly than interrupt enabe/disable
but means more complexity in managing the resulting race conditions.
The disadvantage is that the allocation / free paths become very
complex and fragile.

All of these patches need to be evaluated as to what impact they
have on a variety of loads.

3. Removal of SLOB and SLAB.

We would like to consolidate and only have one slab allocator in the future.
Two patches are included that remove SLOB and SLAB. There is only minimal
justification for retaining SLOB. So I think we could remove SLOB for 2.6.23.

SLAB is the reference that SLUB must be measured against to avoid regressions.
On the other hand it will be a problem to support new functionality like
Slab defragmentation since its design makes it difficult to implement comparable
features. So I think that we need to keep SLAB around for one more cycle and
then we may be able to get rid of it. Or we can keep it in the tree for
awhile and produce more and more shims for new slab functionality.

-- 
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: [patch 00/10] [RFC] SLUB patches for more functionality, performance and maintenance
  - From: Andi Kleen <[email protected]>
- Re: [patch 00/10] [RFC] SLUB patches for more functionality, performance and maintenance
  - From: David Miller <[email protected]>
- [patch 10/10] Remove slab in 2.6.24
  - From: Christoph Lameter <[email protected]>
- [patch 05/10] SLUB: Avoid touching page struct when freeing to per cpu slab
  - From: Christoph Lameter <[email protected]>
- [patch 09/10] Remove the SLOB allocator for 2.6.23
  - From: Christoph Lameter <[email protected]>
- [patch 08/10] SLUB: Single atomic instruction alloc/free using cmpxchg
  - From: Christoph Lameter <[email protected]>
- [patch 06/10] SLUB: Place kmem_cache_cpu structures in a NUMA aware way.
  - From: Christoph Lameter <[email protected]>
- [patch 07/10] SLUB: Optimize cacheline use for zeroing
  - From: Christoph Lameter <[email protected]>
- [patch 02/10] SLUB: Avoid page struct cacheline bouncing due to remote frees to cpu slab
  - From: Christoph Lameter <[email protected]>
- [patch 04/10] SLUB: Move page->offset to kmem_cache_cpu->offset
  - From: Christoph Lameter <[email protected]>
- [patch 01/10] SLUB: Direct pass through of page size or higher kmalloc requests
  - From: Christoph Lameter <[email protected]>
- [patch 03/10] SLUB: Do not use page->mapping
  - From: Christoph Lameter <[email protected]>

Prev by Date: Re: cpufreq 'choice' Kconfig oddness in 2.6.22-rc6-mm1..
Next by Date: [patch 03/10] SLUB: Do not use page->mapping
Previous by thread: [patch 00/12] Slab defragmentation V4
Next by thread: [patch 03/10] SLUB: Do not use page->mapping
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]