* Catalin Marinas <[email protected]> wrote:
> On 13/06/06, Ingo Molnar <[email protected]> wrote:
> >
> >you should think of this in terms of a 'graph of data', where each node
> >is a block of memory. The edges between nodes are represented by
> >pointers. The graph roots from .data/bss, but it may go indefinitely
> >into dynamically allocated blocks as well - just think of a hash-list
> >where the hash list table is in .data, but all the chain entries are in
> >allocated blocks and the chaining can be arbitrarily deep.
> [...]
>
> Nice description, I should add it to the kmemleak doc :-)
feel free :-)
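
To make the 'graph of data' picture concrete, here is a minimal userspace sketch (not kmemleak's actual code) that marks every block reachable from a root and counts the rest as leak candidates. The names struct block, mark(), count_leaks() and demo_leaks() are made up for illustration, and edges are explicit here instead of being discovered by scanning memory.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EDGES 4

/* One graph node: a memory block plus its outgoing pointers. */
struct block {
	const char *name;
	struct block *out[MAX_EDGES];	/* edges; NULL marks an unused slot */
	int reachable;			/* set by mark() */
};

/* Depth-first walk: mark everything reachable from 'b'. */
static void mark(struct block *b)
{
	if (!b || b->reachable)
		return;
	b->reachable = 1;
	for (size_t i = 0; i < MAX_EDGES; i++)
		mark(b->out[i]);
}

/* Mark from every root, then count the unmarked blocks. */
static int count_leaks(struct block **roots, size_t nroots,
		       struct block **all, size_t nall)
{
	int leaks = 0;

	assert(roots && all);
	for (size_t i = 0; i < nall; i++)
		all[i]->reachable = 0;
	for (size_t i = 0; i < nroots; i++)
		mark(roots[i]);
	for (size_t i = 0; i < nall; i++)
		if (!all[i]->reachable)
			leaks++;
	return leaks;
}

/* A hash table root with a two-entry chain, plus one orphaned block. */
static int demo_leaks(void)
{
	struct block orphan = { .name = "orphan" };
	struct block e2 = { .name = "entry2" };
	struct block e1 = { .name = "entry1", .out = { &e2 } };
	struct block table = { .name = "hash_table", .out = { &e1 } };
	struct block *roots[] = { &table };
	struct block *all[] = { &table, &e1, &e2, &orphan };

	return count_leaks(roots, 1, all, 4);
}
```

The chain entries are reached through the table even though only the table itself is a root, and only the orphan is left unmarked.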
> >Currently kmemleak does not track the per-block position of 'outgoing
> >pointers': it assumes that all fields within a block may be an outgoing
> >pointer. This is a source of false negatives. (fields that do not
> >contain a real pointer might accidentally contain a value that is
> >interpreted as a false edge - falsely connecting a leaked block to the
> >graph.)
>
> That's correct but it might not be a real issue in practice. Many
> people use the Boehm's GC and haven't complained about the amount of
> false negatives (AFAIK, there is even a proposal to include it in the
> next C++ standard).
For a GC a false negative is no big problem - it will reduce the
efficiency of the GC a bit, but that's all. For leak detection, if we
happen to have a persistent false pointer in .data (or any other
persistently allocated memory), it may prevent the detection of a leak
permanently - at least for that bootup. Statistically it could still be
found on other systems, but it would be better to have a design that
will eventually lead to having no false negatives.
But it's not just about the number of false negatives, but also about
the overhead of scanning. You are concentrating on embedded systems with
small RAM - but most of the testers will be running this with at least
1GB of RAM - which is _a lot_ of memory to scan.
(But, if it's not possible to implement it in a sane manner then that's
not an issue either - it's rather the false positives that must be
avoided.)
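
A small sketch of how such a false edge can arise, assuming a scanner that treats every word as a potential pointer. struct region, points_into(), scan_finds_ref() and scan_demo() are hypothetical names, and the 'leaked' buffer merely stands in for a leaked allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One tracked allocation. */
struct region {
	uintptr_t start;
	size_t size;
};

/* Does 'word', read as an address, fall inside the allocation? */
static int points_into(uintptr_t word, const struct region *r)
{
	return word >= r->start && word - r->start < r->size;
}

/* Conservative scan: every word counts as a possible pointer. */
static int scan_finds_ref(const uintptr_t *words, size_t n,
			  const struct region *r)
{
	assert(words && r);
	for (size_t i = 0; i < n; i++)
		if (points_into(words[i], r))
			return 1;
	return 0;
}

/*
 * 'data' plays the role of .data. It holds no real pointer, only
 * integers. With 'stray' set, one integer happens to equal an address
 * inside the leaked block, so the leak is hidden from the scan.
 */
static int scan_demo(int stray)
{
	static char leaked[64];
	struct region r = { (uintptr_t)leaked, sizeof(leaked) };
	uintptr_t data[2] = { 42, stray ? (uintptr_t)leaked + 8 : 0 };

	return scan_finds_ref(data, 2, &r);
}
```

As long as that stray value stays in place, the block looks referenced on every scan, which is the 'permanent for that bootup' case.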
> >Kmemleak does recognize 'incoming pointers' via the offsetof tracking
> >method, but it's limited in that it is not a type-accurate method
> >either: it tracks per-size offsets, so two types accidentally having the
> >same size merges their 'possible incoming pointer offset' lists, which
> >introduces false negatives. (a pointer may be considered an incoming
> >edge while in reality the pointer is not validly pointing into this
> >structure)
>
> The number of collisions would need to be investigated. On my system,
> there are 158 distinct sizeof values generated by container_of. Of
> this, 90 have at least two aliases (incoming pointer offsets). I'm not
> sure how many different structures are in the kernel but I can't find
> an easy (gcc magic) way to get a unique id for a type (apart from
> modifying all the container_of calls and add a 4th argument - maybe a
> string with the name of the type).
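
The aliasing can be sketched as follows, assuming a registry keyed only by sizeof. struct foo, struct bar, struct size_entry and record_offset() are hypothetical; this is not how kmemleak stores its offsets, just an illustration of the merge.

```c
#include <assert.h>
#include <stddef.h>

struct list_head { struct list_head *next, *prev; };

/* Two unrelated types that happen to have the same size. */
struct foo { struct list_head node; long a; long b; };
struct bar { long a; long b; struct list_head node; };

#define MAX_OFFSETS 8

/* Per-size list of 'possible incoming pointer' offsets. */
struct size_entry {
	size_t size;
	size_t offsets[MAX_OFFSETS];
	size_t n;
};

/* Add 'off' to the list unless it is already present. */
static void record_offset(struct size_entry *e, size_t off)
{
	assert(e);
	for (size_t i = 0; i < e->n; i++)
		if (e->offsets[i] == off)
			return;
	if (e->n < MAX_OFFSETS)
		e->offsets[e->n++] = off;
}

/* Both types land in the same entry because only the size is the key. */
static size_t demo_merged_offsets(void)
{
	struct size_entry e = { .size = sizeof(struct foo) };

	record_offset(&e, offsetof(struct foo, node));
	if (sizeof(struct bar) == e.size)
		record_offset(&e, offsetof(struct bar, node));
	return e.n;
}
```

After the merge, a pointer to the middle of a struct foo at bar's offset would be accepted as a valid incoming edge, although no container_of() on a struct foo ever uses that offset.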
there are a couple of possibilities.
If the ID is string based then you don't even have to touch the
container_of() calls - just generate the typename string via the "#y"
stringification preprocessor operator, where 'y' is the second parameter
of container_of().
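
A minimal sketch of the stringification idea, in userspace C. TYPE_NAME() is a hypothetical helper, the container_of() here is a simplified stand-in for the kernel macro, and the fields of struct skb_head are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical helper: '#' turns the type argument into a string. */
#define TYPE_NAME(type) #type

struct list_head { struct list_head *next, *prev; };

/* Fields are made up; only the type name matters here. */
struct skb_head {
	int len;
	struct list_head list;
};

/* The name comes from the same token sequence container_of() receives. */
static const char *demo_type_name(void)
{
	return TYPE_NAME(struct skb_head);
}

/* container_of() still recovers the enclosing object as usual. */
static int demo_roundtrip(void)
{
	struct skb_head s = { 0 };

	assert(strlen(demo_type_name()) > 0);
	return container_of(&s.list, struct skb_head, list) == &s;
}
```

Since the string is built from the macro argument, existing callers would not need to change; the cost is a string compare (or a hash) at runtime instead of an integer ID.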
there's another, much faster solution as well that assigns IDs at
build time for globally visible types: use the
__builtin_types_compatible_p gcc extension to match the type against a
global registry of types. I.e. here's what i use in PREEMPT_RT:
#undef TYPE_EQUAL
#define TYPE_EQUAL(lock, type) \
		__builtin_types_compatible_p(typeof(lock), type *)
#define PICK_OP(type, optype, op, lock)				\
do {									\
	if (TYPE_EQUAL((lock), type))					\
		_raw_##optype##op((type *)(lock));			\
	else if (TYPE_EQUAL(lock, spinlock_t))				\
		_spin##op((spinlock_t *)(lock));			\
	else __bad_spinlock_type();					\
} while (0)
so you can generate a (really) long branch that checks every known type
and assigns an ID to it at build time:
	if (TYPE_EQUAL(type, struct skb_head))
		type_id = 1;
	else if (TYPE_EQUAL(type, struct ))
		type_id = 2;
	...
	else
		type_id = UNKNOWN_TYPE;
despite this branch having hundreds of checks, the compiler will
eliminate it at build time and only a single static type ID assignment
remains.
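
A compilable sketch of that constant folding, assuming gcc or clang. The TYPE_EQUAL() here compares two types directly (unlike the lock variant above), TYPE_ID() and the ID_* constants are hypothetical, and the struct bodies are placeholders. The _Static_assert lines only build because the whole chain is an integer constant expression.

```c
#include <assert.h>

struct list_head { struct list_head *next, *prev; };
struct skb_head { int len; struct list_head list; };	/* placeholder body */

/* Compare two types; evaluates to a compile-time 0 or 1. */
#define TYPE_EQUAL(a, b) __builtin_types_compatible_p(a, b)

enum { UNKNOWN_TYPE = 0, ID_SKB_HEAD = 1, ID_LIST_HEAD = 2 };

/* The 'long branch', written as a chain of constant conditionals. */
#define TYPE_ID(type)						\
	(TYPE_EQUAL(type, struct skb_head)  ? ID_SKB_HEAD :	\
	 TYPE_EQUAL(type, struct list_head) ? ID_LIST_HEAD :	\
					      UNKNOWN_TYPE)

/* These compile only if the chain folds to a constant. */
_Static_assert(TYPE_ID(struct skb_head) == ID_SKB_HEAD, "skb_head id");
_Static_assert(TYPE_ID(int) == UNKNOWN_TYPE, "unregistered type");

/* At runtime nothing is left but the constant itself. */
static int demo_list_head_id(void)
{
	int id = TYPE_ID(struct list_head);

	assert(id != UNKNOWN_TYPE);
	return id;
}
```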
this long branch could be auto-generated build-time (just like
asm-offsets.c) in a maintainable way by putting a single "register type"
line after every structure definition in global .h files:
REGISTER_TYPE(struct skb_head)
where REGISTER_TYPE(x) maps to nothing during normal kernel builds, but
if a special flag is set it generates the type string into a special
section:
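#ifdef GENERATE_TYPE_REGISTRY
# define __REG_NAME(n)	__type_reg_##n	/* illustrative per-line unique name */
# define __REG_TYPE(x, n)	static const char __REG_NAME(n)[] \
	__attribute__((section(".type.registry"), used)) = #x;
# define REGISTER_TYPE(x)	__REG_TYPE(x, __LINE__)
#else
# define REGISTER_TYPE(x)	/* nothing during normal kernel builds */
#endif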
so you can build and execute a special utility early during kernel build
that prints out the generated code. (again, like asm-offsets.c)
it needs some thought, but this way it's quite possible to build-time
map types to IDs.
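
A rough sketch of what such a generator utility could do once it has the registered type strings. Reading them out of the special section is left out here; emit_type_chain() and demo_chain() are hypothetical names, and the two type names are hardcoded for the demo.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the if/else chain for 'n' type names into 'buf'.
 * IDs start at 1. Returns 0 on success, -1 if 'buf' is too small.
 */
static int emit_type_chain(char *buf, size_t len,
			   const char *const *names, size_t n)
{
	size_t used = 0;
	int w;

	assert(buf && names);
	for (size_t i = 0; i < n; i++) {
		w = snprintf(buf + used, len - used,
			     "%sif (TYPE_EQUAL(type, %s))\n\ttype_id = %zu;\n",
			     i ? "else " : "", names[i], i + 1);
		if (w < 0 || (size_t)w >= len - used)
			return -1;
		used += (size_t)w;
	}
	/* Fallback for every type that was never registered. */
	w = snprintf(buf + used, len - used, "%stype_id = UNKNOWN_TYPE;\n",
		     n ? "else\n\t" : "");
	if (w < 0 || (size_t)w >= len - used)
		return -1;
	return 0;
}

/* Generate the chain for two example type names. */
static const char *demo_chain(void)
{
	static char buf[256];
	static const char *const names[] = {
		"struct skb_head", "struct list_head",
	};

	return emit_type_chain(buf, sizeof(buf), names, 2) ? NULL : buf;
}
```

The output would be written to a generated header during the build, in the same spirit as the asm-offsets.c output.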
> >but that there is a capable annotation method to reduce the amount of
> >false negatives, in a gradual and manageable way - down to zero if
> >everything is annotated.
>
> I'm not sure this could be achieved in a maintainable way, at least
> not without support from the compiler.
it's possible with gcc (for global types), just hidden a bit :-)
Ingo