Re: Kernel panic: Route cache, RCU, possibly FIB trie.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Thu, 23 Mar 2006, Jesper Dangaard Brouer wrote:

On Tue, 21 Mar 2006, David S. Miller wrote:

From: Jesper Dangaard Brouer <[email protected]>
Date: Tue, 21 Mar 2006 15:51:34 +0100 (CET)

You guessed right... I did enable IP_ROUTE_MULTIPATH_CACHED, I have
now disabled it and equal multi path routing in general
(CONFIG_IP_ROUTE_MULTIPATH).

It is almost certainly the cause of your crashes, that code
is still extremely raw and that's why it is listed as "EXPERIMENTAL".

It seems your are right :-) (and I'll take more care of using experimental code on production again). The machine, has now been running for 34 hours without crashing. The strange thing is that I'm running the same kernel on 30 other (similar) machines, which have not crashed. (I do suspect the specific traffic load pattern to influence this)

Argh!! -- nemesis!!! The machine, just died again...
The machine did not crash it just ran out of memory, and killed too many important processes. I had to power recycle it... :-((( Could ping it...

I can see that, the traffic pattern have changed and the route cache is growing rapitly...


BUT, I do think I have noticed another problem in the garbage collection code (route.c), that causes the garbage collector (almost) never to garbage collect.

This is caused by the value "ip_rt_max_size" (/proc/sys/net/ipv4/route/max_size) being set too large. It is set to 16 times the gc_thresh value (this size dependend on the memory size). In the garbage collection function (rt_garbage_collect) garbage collecting entries are ignored (gc_ignored) if the number of entries are below "ip_rt_max_size".

With 1Gb memory, gc_thresh=65536 times 16 is 1048576. Which means that we only start to garbage collect when there is more than 1 million entries. This seems wrong... (the reason it does not grow this large is the 600 second periodic flushes).


Hilsen
Jesper Brouer

--
-------------------------------------------------------------------
Cand. scient datalog
Dept. of Computer Science, University of Copenhagen
-------------------------------------------------------------------


grep . /proc/sys/net/ipv4/route/*
/proc/sys/net/ipv4/route/error_burst:5000
/proc/sys/net/ipv4/route/error_cost:1000
grep: /proc/sys/net/ipv4/route/flush: Operation not permitted
/proc/sys/net/ipv4/route/gc_elasticity:8
/proc/sys/net/ipv4/route/gc_interval:60
/proc/sys/net/ipv4/route/gc_min_interval:0
/proc/sys/net/ipv4/route/gc_min_interval_ms:500
/proc/sys/net/ipv4/route/gc_thresh:65536
/proc/sys/net/ipv4/route/gc_timeout:300
/proc/sys/net/ipv4/route/max_delay:10
/proc/sys/net/ipv4/route/max_size:1048576
/proc/sys/net/ipv4/route/min_adv_mss:256
/proc/sys/net/ipv4/route/min_delay:2
/proc/sys/net/ipv4/route/min_pmtu:552
/proc/sys/net/ipv4/route/mtu_expires:600
/proc/sys/net/ipv4/route/redirect_load:20
/proc/sys/net/ipv4/route/redirect_number:9
/proc/sys/net/ipv4/route/redirect_silence:20480
/proc/sys/net/ipv4/route/secret_interval:600



Hilsen
  Jesper Brouer

--
-------------------------------------------------------------------
Cand. scient datalog
Dept. of Computer Science, University of Copenhagen
-------------------------------------------------------------------
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux