Re: RCU latency regression in 2.6.16-rc1

Dipankar Sarma a écrit :

On Sat, Jan 28, 2006 at 01:51:23PM -0500, Lee Revell wrote:

On Sat, 2006-01-28 at 13:00 -0500, Lee Revell wrote:

OK, now we are making progress.

I spoke too soon, it's not fixed:

preemption latency trace v1.1.5 on 2.6.16-rc1
--------------------------------------------------------------------
 latency: 4183 us, #3676/3676, CPU#0 | (M:rt VP:0, KP:0, SP:0 HP:0)
    -----------------
evolutio-2877  0d.s.   97us : local_bh_enable (rt_run_flush)
evolutio-2877  0d.s.   98us : local_bh_enable (rt_run_flush)
evolutio-2877  0d.s.   99us : local_bh_enable (rt_run_flush)
evolutio-2877  0d.s.  100us : local_bh_enable (rt_run_flush)
evolutio-2877  0d.s.  101us : local_bh_enable (rt_run_flush)

[ etc ]

evolutio-2877 0d.s. 4079us : local_bh_enable (rt_run_flush)

evolutio-2877  0d.s. 4080us : local_bh_enable (rt_run_flush)

I am not sure if I am interpreting the latency trace right,
but it seems that there is a difference between the problem
you were seeing earlier and now.

In one of your earlier traces, I saw  -

  <idle>-0     0d.s.  182us : dst_destroy (dst_rcu_free)
  <idle>-0     0d.s.  183us : ipv4_dst_destroy (dst_destroy)

[ etc - zillions of dst_rcu_free()s deleted ]

  <idle>-0     0d.s. 13403us : dst_rcu_free (__rcu_process_callbacks)
  <idle>-0     0d.s. 13403us : dst_destroy (dst_rcu_free)

This points to latency increase caused by lots and lots of
RCU callbacks doing dst_rcu_free(). Do you still see those ?

Your new trace shows that we are held up in in rt_run_flush().I guess we need to investigate why we spend so much time in rt_run_flush(),

because of a big route table or the lock acquisitions.

Some machines have millions of entries in their route cache.

I suspect we cannot queue all them (or only hash heads as your previous patch)by RCU. Latencies and/or OOM can occur.

What can be done is :

in rt_run_flush(), allocate a new empty hash table, and exchange the hash tables.

Then wait a quiescent/grace RCU period (may be the exact term is not this one,sorry, I'm not RCU expert)

Then free all the entries from the old hash table (direclty of course, no needfor RCU grace period), and free the hash table.

As the hash table can be huge, we might need allocate it at boot time, just incase a flush is needed (it usually is :) ). If we choose dynamic allocationand this allocation fails, then fallback to what is done today.

Eric
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Follow-Ups:
- Re: RCU latency regression in 2.6.16-rc1
  - From: "Paul E. McKenney" <paulmck@us.ibm.com>
- Re: RCU latency regression in 2.6.16-rc1
  - From: Lee Revell <rlrevell@joe-job.com>

References:
- Re: RCU latency regression in 2.6.16-rc1
  - From: Ingo Molnar <mingo@elte.hu>
- Re: RCU latency regression in 2.6.16-rc1
  - From: Lee Revell <rlrevell@joe-job.com>
- Re: RCU latency regression in 2.6.16-rc1
  - From: Dipankar Sarma <dipankar@in.ibm.com>
- Re: RCU latency regression in 2.6.16-rc1
  - From: Dipankar Sarma <dipankar@in.ibm.com>
- Re: RCU latency regression in 2.6.16-rc1
  - From: Lee Revell <rlrevell@joe-job.com>
- Re: RCU latency regression in 2.6.16-rc1
  - From: "Paul E. McKenney" <paulmck@us.ibm.com>
- Re: RCU latency regression in 2.6.16-rc1
  - From: Lee Revell <rlrevell@joe-job.com>
- Re: RCU latency regression in 2.6.16-rc1
  - From: Dipankar Sarma <dipankar@in.ibm.com>
- Re: RCU latency regression in 2.6.16-rc1
  - From: Lee Revell <rlrevell@joe-job.com>
- Re: RCU latency regression in 2.6.16-rc1
  - From: Lee Revell <rlrevell@joe-job.com>
- Re: RCU latency regression in 2.6.16-rc1
  - From: Dipankar Sarma <dipankar@in.ibm.com>

Prev by Date: Re: memory leak in scsi_cmd_cache 2.6.15
Next by Date: Re: e100 oops on resume
Previous by thread: Re: RCU latency regression in 2.6.16-rc1
Next by thread: Re: RCU latency regression in 2.6.16-rc1
Index(es):
- Date
- Thread

[Index of Archives] [Kernel Newbies] [Netfilter] [Bugtraq] [Photo] [Stuff] [Gimp] [Yosemite News] [MIPS Linux] [ARM Linux] [Linux Security] [Linux RAID] [Video 4 Linux] [Linux for the blind] [Linux Resources]