On 27/12/06, Ingo Molnar <[email protected]> wrote:
* Ingo Molnar <[email protected]> wrote:
> The pure per-CPU design would have to embedd the CPU ID the object is
> attached to into the allocated object. If that is not feasible then
> only the global hash remains i think.
embedding the info shouldnt be /that/ hard in case of the SLAB: if the
memleak info is at a negative offset from the allocated pointer. I.e.
that if kmalloc() returns 'ptr', the memleak info could be at
ptr-sizeof(memleak_info). That way you dont have to know the size of the
object beforehand and there's absolutely no need for a global hash of
It would probably need to be just a pointer embedded in the allocated
block. With the current design, the memleak objects have a lifetime
longer than the tracked block. This is mainly to avoid long locking
during memory scanning and reporting.
(it gets a bit more complex for page aligned allocations for the buddy
and for vmalloc - but that could be solved by adding one extra pointer
into struct page. [...]
This still leaves the issue of marking objects as not being leaks or
being of a different type. This is done by calling memleak_* functions
at the allocation point (outside allocator) where only the pointer is
known. In the vmalloc case, it would need to call find_vm_area. This
might not be a big problem, only that memory resources are no longer
treated in a unified way by kmemleak (and might not be trivial to add
support for new allocators).
[...] That is a far more preferable cost than the
locking/cache overhead of a global hash.)
A global hash would need to be re-built for every scan (and destroyed
afterwards), making this operation longer since the pointer values
together with their aliases (resulted from using container_of) are
added to the hash.
I understand the benefits but I personally favor simplicity over
performance, especially for code used as a debugging tool. DEBUG_SLAB
already introduces an overhead by poisoning the allocated blocks.
Generating the backtrace and filling in the memleak objects for every
allocation is another overhead. Global structures are indeed a
scalability problem but for a reasonable number of CPUs their overhead
might not be that big. Anyway, I can't be sure without some
benchmarking and this is probably highly dependent on the system
(caches, snoop control unit etc.).
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Video 4 Linux]
[Linux for the blind]