* Christoph Lameter <[email protected]> wrote:
> > I think we should make SLAB the default for v2.6.24 ...
>
> If you guarantee that all the regressions of SLAB vs. SLUB are
> addressed, then that's fine, but AFAICT that is not possible.
huh? You got the ordering wrong ;-) SLUB needs to resolve all
regressions relative to SLAB (or at least have a really good
explanation of why it regresses).
> Here is a list of some of the benefits of SLUB just in case we forgot:
>
> - SLUB is performance-wise much faster than SLAB. This can be more
>   than a factor of 10 (in the case of concurrent allocations/frees on
>   multiple processors). See http://lkml.org/lkml/2007/10/27/245
which is of little help if it regresses on other workloads. As we've
seen, SLUB can be more than 10 times slower on hackbench. You can tune
SLUB to use 2MB pages, but of course that's not a production-level
system. OTOH, have you tried to tune SLAB in the above benchmark?
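btw., SLAB's per-cpu queues are tunable at runtime via /proc/slabinfo,
and SLUB's page order via boot parameters. A rough sketch of what i'd
try first (the exact values here are guesses, not measured tunings):

  # SLAB: write "<cache> <limit> <batchcount> <shared>" to resize the
  # per-cpu queues of a cache at runtime:
  echo "size-256 512 256 8" > /proc/slabinfo

  # SLUB: allow 2MB (order-9) slabs via the boot command line:
  slub_max_order=9 slub_min_objects=32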
> - Single-threaded allocation speed is up to double that of SLAB
link?
> - Remote freeing of objects on NUMA systems is typically 30% faster.
>
> - Debugging on SLAB is difficult. It requires a recompile of the
>   kernel, and the resulting output is difficult to interpret. SLUB can
>   apply debugging options to a subset of the slab caches in order to
>   allow the system to run at maximum speed. This is necessary to
>   detect difficult-to-reproduce race conditions.
that's not a fundamental property of SLAB. It would be about a 10-line
hack to make SLAB debugging switchable at runtime, with the boot flag
defaulting to 'off' - something like the sketch below.
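untested sketch (the flag name is made up, the check_poison_obj() call
is just one example of an existing mm/slab.c debug check to gate):

  static int slab_debug_enabled __read_mostly;

  static int __init slab_debug_setup(char *str)
  {
          slab_debug_enabled = 1;
          return 1;
  }
  __setup("slab_debug", slab_debug_setup);

  /* ... and in the mm/slab.c debug paths: */
  if (unlikely(slab_debug_enabled))
          check_poison_obj(cachep, objp);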
> - SLAB can capture huge amounts of memory in its queues. The problem
>   gets worse the more processors and NUMA nodes are in the system.
>   This memory overhead limits the number of per-cpu objects one can
>   configure.
well, that's the nature of caches, but it could be improved: restrict
alien caches along cpusets and demand-allocate them.
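i'm thinking of something like this (rough and untested - the
alloc_alien_cache_for_node() helper is made up):

  static struct array_cache *get_alien(struct kmem_cache *cachep, int node)
  {
          struct kmem_list3 *l3 = cachep->nodelists[numa_node_id()];

          /* demand-allocate: no alien cache until the first remote free */
          if (!l3->alien[node]) {
                  /* skip nodes outside the task's cpuset entirely */
                  if (!node_isset(node, cpuset_current_mems_allowed))
                          return NULL;
                  l3->alien[node] = alloc_alien_cache_for_node(cachep, node);
          }
          return l3->alien[node];
  }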
> - SLAB requires a pass through all slab caches every 2 seconds to
> expire objects. This is a problem both for realtime and MPI jobs
> that cannot take such a processor outage.
the moment you start capturing more memory in SLUB's per cpu queues
(which do exist), you will have the same sort of problem.
> - SLAB does not have a sophisticated slabinfo tool to report the
>   state of slab objects on the system. SLUB's tool can provide
>   details of object use.
again, not a fundamental property of SLAB.
> - SLAB requires the update of two words for freeing and allocation.
>   SLUB can do that by updating a single word, which makes it possible
>   to avoid enabling and disabling interrupts if the processor supports
>   an atomic instruction for that purpose. This is important for
>   realtime kernels, where special measures may have to be implemented
>   if one wants to disable interrupts.
i do appreciate that :-) SLUB was rather easy to "port" to PREEMPT_RT:
it did not need a single line of change. The SLAB portion is a lot
scarier:
dione:~linux-rt.q> diffstat patches/rt-slab-new.patch
1 file changed, 319 insertions(+), 177 deletions(-)
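the single-word update Christoph describes above is also roughly what
makes this work so well on -rt. A sketch of the idea (modeled on the
proposed cmpxchg_local patches, not on what mainline mm/slub.c does
today; 'c' is the per-cpu kmem_cache_cpu, 'object' a void **):

  /* move the per-cpu freelist head one object forward - a single
   * word update, so no local_irq_save()/local_irq_restore() pair
   * is needed on CPUs that have a local atomic op: */
  do {
          object = c->freelist;
          if (unlikely(!object))
                  return __slab_alloc(s, gfpflags, node, addr, c);
  } while (cmpxchg_local(&c->freelist, object,
                         object[c->offset]) != object);
  return object;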
> - SLAB requires memory to be set aside for queues (processors
> times number of slabs times queue size). SLUB requires none of that.
>
> - SLUB merges slab caches with similar characteristics to
> reduce the memory footprint even further.
>
> - SLAB performs object-level NUMA management, which adds significant
>   allocator complexity. SLUB manages NUMA on the level of slab pages,
>   reducing object management overhead.
>
> - SLUB allows remote node defragmentation to avoid the buildup
> of large partial lists on a single node.
>
> - SLUB can actively reduce the fragmentation of slabs through
> slab cache specific callbacks (not merged yet)
>
> - SLUB has resiliency features that allow it to isolate a problem
> object and continue after diagnostics have been performed.
all of these are neat.
How about renaming it to SLAB2 instead of SLUB? The "unqueued" bit is
just stupid NIH syndrome. It's _of course_ queued, because it has to
be. "It does not have _THAT_ queue that SLAB used to have" is just a
silly excuse.
> - SLUB creates rarely used DMA caches on demand instead of creating
> them all on bootup (SLAB).
actually, this might be a bug. the DMA caches should be created right
away and filled with a small number of objects, due to stupid 16MB
limitations with certain hardware. Later on, a GFP_DMA request might
not be fulfillable (because that zone fills up pretty quickly).
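the fix could be as simple as poking the DMA kmalloc sizes once at
boot, while ZONE_DMA still has room (untested sketch, sizes and counts
picked arbitrarily):

  static void __init prefill_dma_caches(void)
  {
          int size, i;
          void *objs[4];

          for (size = 32; size <= 4096; size <<= 1) {
                  /* force on-demand creation of the size-N(DMA) cache */
                  for (i = 0; i < 4; i++)
                          objs[i] = kmalloc(size, GFP_KERNEL | GFP_DMA);
                  for (i = 0; i < 4; i++)
                          kfree(objs[i]);
                  /* the caches themselves persist after the frees */
          }
  }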
Ingo