Ingo,
After searching all over to find out where the slowness is, I finally
discovered it. It's the SLOB!
I noticed a that a make install of the kernel over NFS took on
2.6.15-rc5 ~26 seconds to complete, and on 2.6.15-rc5-rt2 it took almost
2 minutes for the same operation on the same machine.
I added my logdev device to record lots of output, and it found the
place that is taking the longest:
In 2.6.15-rc5-rt2:
[ 789.171773] cpu:0 kfree_skbmem:291 in
[ 789.171873] cpu:0 kfree_skbmem:295 1
[ 789.172357] cpu:0 kfree_skbmem:320 out
in 2.6.15-rc5:
[ 343.253988] cpu:0 kfree_skbmem:291 in
[ 343.253990] cpu:0 kfree_skbmem:295 1
[ 343.253991] cpu:0 kfree_skbmem:320 out
Here's the code for both systems (they are identical here):
void kfree_skbmem(struct sk_buff *skb)
{
struct sk_buff *other;
atomic_t *fclone_ref;
edprint("in");
skb_release_data(skb);
switch (skb->fclone) {
case SKB_FCLONE_UNAVAILABLE:
edprint("1");
kmem_cache_free(skbuff_head_cache, skb);
break;
case SKB_FCLONE_ORIG:
edprint("2");
fclone_ref = (atomic_t *) (skb + 2);
if (atomic_dec_and_test(fclone_ref))
kmem_cache_free(skbuff_fclone_cache, skb);
break;
case SKB_FCLONE_CLONE:
edprint("3");
fclone_ref = (atomic_t *) (skb + 1);
other = skb - 1;
/* The clone portion is available for
* fast-cloning again.
*/
skb->fclone = SKB_FCLONE_UNAVAILABLE;
if (atomic_dec_and_test(fclone_ref))
kmem_cache_free(skbuff_fclone_cache, other);
break;
};
edprint("out");
}
My edprint records in a ring buffer (much like relayfs), and produces
the above output. The time in brackets is in seconds. We see the
difference between "1" and "out" is greatly different. (Note, I have a
edprint in all interrupts, so I would know if one was taken, and these
times are not a one time deal, but show up like this every time).
So for 2.6.15-rc5 the time difference is 343.253991 - 343.253990 or
1 usec, where as the time for 2.6.15-rc5-rt2 is 789.172357 - 789.171873
or 484 usecs! We're talking about a 48,400% increase here!
The difference here is that the kmem_cache_free in rt is a SLOB where as
the vanilla kernel still uses SLABs, What's the rational for SLOB now?
The patches used are here:
For the logdev device:
http://www.kihontech.com/logdev/logdev-2.6.15-rc5-rt2.patch
http://www.kihontech.com/logdev/logdev-2.6.15-rc5.patch
For debugging (on top of logdev):
http://www.kihontech.com/logdev/debug-2.6.15-rc5-rt2.patch
http://www.kihontech.com/logdev/debug-2.6.15-rc5.patch
(I also added my patches previously posted to get it to compile and
handle the softirq hrtimer problems).
Thanks,
-- Steve
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]