PREEMPT_RT and I-PIPE: the numbers, take 3

This is the 3rd run of our tests.

Here are the changes since last time:

- Modified the IRQ latency measurement code on the logger to busy-wait
on the target's reply instead of responding to an interrupt triggered
by that reply. As Ingo had suggested, this closely replicates what
lpptest.c does; in fact, we copied Ingo's loop. (A sketch of the kind
of target-side reply handler involved follows this list.)

- LMbench runs are now conducted 5 times instead of just 3.

- The software versions used were:
  2.6.12 - final
  RT-V0.7.50-35
  I-pipe v0.7
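
For reference, here is a minimal sketch of the kind of target-side
"reply" driver this setup relies on, in the spirit of lpptest.c: the
logger raises an interrupt on the target's parallel port and the
target's handler answers by toggling a line the logger watches. The
port address, IRQ number and module glue below are illustrative
assumptions, not our actual test code.

/* Hypothetical target-side responder, in the spirit of lpptest.c.
 * Port address (0x378) and IRQ (7) are assumptions for illustration. */
#include <linux/module.h>
#include <linux/init.h>
#include <linux/interrupt.h>
#include <asm/io.h>

#define LPT_BASE 0x378          /* parallel port data register */
#define LPT_CTRL (LPT_BASE + 2) /* control register            */
#define LPT_IRQ  7

static irqreturn_t lpt_reply(int irq, void *dev_id, struct pt_regs *regs)
{
        /* Toggle a data line; the logger busy-waits for this edge. */
        outb(inb(LPT_BASE) ^ 0x01, LPT_BASE);
        return IRQ_HANDLED;
}

static int __init lpt_reply_init(void)
{
        /* Enable parallel port interrupt generation (bit 4 of control). */
        outb(0x10, LPT_CTRL);
        return request_irq(LPT_IRQ, lpt_reply, SA_INTERRUPT,
                           "lpt_reply", NULL);
}

static void __exit lpt_reply_exit(void)
{
        free_irq(LPT_IRQ, NULL);
}

module_init(lpt_reply_init);
module_exit(lpt_reply_exit);
MODULE_LICENSE("GPL");

This is written against the 2.6.12-era API (pt_regs in the handler,
SA_INTERRUPT); the only point it illustrates is that the reply path is
a single outb() done directly in the handler.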


System load:
------------

As in the first two runs, total system load is gauged by LMbench's
execution time under various system loads. This time, however, each test
was run 5 times instead of 3. While we would like to consider these
tests more representative of overall system behavior than the previously
published results, measuring the total running time of something like
LMbench still only gives a rough idea of overall system overhead. In
fact, the raw results show that the running times often vary quite a
lot. For a more trustworthy measurement, one must look at the results
reported by LMbench itself.

Note that the total running time of all configurations under the
"plain" and "IRQ test" loads has gone down significantly (by about
20 to 25 seconds). This is likely the result of an optimization within
2.6.12-final. Most of the results for the other loads are nevertheless
in the same range as those found previously, except the "IRQ & hd"
results for PREEMPT_RT, which show an improvement.

LMbench running times:
+--------------------+-------+-------+-------+-------+-------+
| Kernel             | plain | IRQ   | ping  | IRQ & | IRQ & |
|                    |       | test  | flood | ping  |  hd   |
+====================+=======+=======+=======+=======+=======+
| Vanilla-2.6.12     | 150 s | 153 s | 183 s | 187 s | 242 s |
+====================+=======+=======+=======+=======+=======+
| with RT-V0.7.50-35 | 155 s | 155 s | 205 s | 206 s | 250 s |
+--------------------+-------+-------+-------+-------+-------+
| %                  |  3.3  |  1.3  | 12.0  | 10.2  |  3.3  |
+====================+=======+=======+=======+=======+=======+
| with Ipipe-0.7     | 153 s | 154 s | 193 s | 198 s | 259 s |
+--------------------+-------+-------+-------+-------+-------+
| %                  |  2.0  |  0.6  |  5.5  |  5.9  |  7.0  |
+--------------------+-------+-------+-------+-------+-------+

Legend:
plain          = nothing special
IRQ test       = on logger: triggering target every 1ms
ping flood     = on host: "sudo ping -f $TARGET_IP_ADDR"
IRQ & ping     = combination of the previous two
IRQ & hd       = IRQ test with the following being done on the target:
                 "while [ true ]
                     do dd if=/dev/zero of=/tmp/dummy count=512 bs=1m
                  done"
%              = slowdown relative to Vanilla-2.6.12

Looking at the LMbench output itself, which is clearly a more adequate
benchmark, here are some highlights (each figure is an average over the
5 runs, computed using lmbsum):

"plain" run:

Measurements   |   Vanilla   |  preempt_rt    |   ipipe
---------------+-------------+----------------+-------------
fork           |      93us   |  157us (+69%)  |   95us (+2%)
open/close     |     2.3us   |  3.7us (+43%)  |  2.4us (+4%)
execve         |     351us   |  446us (+27%)  |  363us (+3%)
select 500fd   |    12.7us   | 25.8us (+103%) | 12.8us (+1%)
mmap           |     660us   | 2867us (+334%) |  677us (+3%)
pipe           |     7.1us   | 11.6us (+63%)  |  7.4us (+4%)


"IRQ test" run:
Measurements   |   Vanilla   |  preempt_rt    |   ipipe
---------------+-------------+----------------+-------------
fork           |      96us   |  158us (+65%)  |   97us (+1%)
open/close     |     2.4us   |  3.7us (+54%)  |  2.4us (~)
execve         |     355us   |  453us (+28%)  |  365us (+3%)
select 500fd   |    12.8us   | 26.0us (+103%) | 12.8us (~)
mmap           |     662us   | 2893us (+337%) |  679us (+3%)
pipe           |     7.1us   | 13.2us (+86%)  |  7.5us (+6%)


"ping flood" run:
Measurements   |   Vanilla   |  preempt_rt    |   ipipe
---------------+-------------+----------------+-------------
fork           |     137us   |  288us (+110%) |  162us (+18%)
open/close     |     3.9us   |  7.0us (+79%)  |  4.0us (+3%)
execve         |     562us   |  865us (+54%)  |  657us (+17%)
select 500fd   |    19.3us   | 47.4us (+146%) | 21.0us (+4%)
mmap           |     987us   | 4921us (+399%) | 1056us (+7%)
pipe           |    11.0us   | 23.7us (+115%) | 13.3us (+20%)


"IRQ & ping" run:
Measurements   |   Vanilla   |  preempt_rt    |   ipipe
---------------+-------------+----------------+-------------
fork           |     143us   |  291us (+103%) |  163us (+14%)
open/close     |     3.9us   |  7.1us (+82%)  |  4.0us (+3%)
execve         |     567us   |  859us (+51%)  |  648us (+14%)
select 500fd   |    19.6us   | 52.2us (+166%) | 21.4us (+9%)
mmap           |     983us   | 5061us (+415%) | 1110us (+13%)
pipe           |    12.2us   | 28.0us (+130%) | 12.7us (+4%)


"IRQ & hd" run:
Measurements   |   Vanilla   |  preempt_rt    |   ipipe
---------------+-------------+----------------+-------------
fork           |      96us   |  164us (+71%)  |  100us (+4%)
open/close     |     2.5us   |  3.8us (+52%)  |  2.5us (~)
execve         |     373us   |  479us (+28%)  |  382us (+2%)
select 500fd   |    13.3us   | 27.2us (+105%) | 13.4us (+1%)
mmap           |     683us   | 3013us (+341%) |  712us (+4%)
pipe           |     9.9us   | 23.0us (+132%) | 10.6us (+7%)

These results are consistent with those highlighted during the discussion
following the publication of the last test run.


Interrupt response time:
------------------------

Unlike the first two runs, these times are measured using a busy-wait
on the logger. Basically, we disable interrupts, fire the interrupt to
the target, log the time, spin (while(1)) until the input on the
parallel port changes, and then log the time again to obtain the
response time. Each of these runs accumulates over 1,000,000 interrupt
latency measurements.
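
To make this concrete, here is a minimal sketch of that logger-side
loop, assuming an x86 logger with its parallel port at 0x378 and
timestamps taken with get_cycles(). The names and the exact port wiring
are illustrative assumptions; only the overall structure of the
busy-wait reflects the loop we borrowed.

/* Hypothetical logger-side measurement of one interrupt latency.
 * Port addresses and helpers are assumptions for illustration only. */
#include <linux/kernel.h>
#include <asm/io.h>
#include <asm/timex.h>      /* get_cycles(), cycles_t          */
#include <asm/processor.h>  /* cpu_relax()                     */
#include <asm/system.h>     /* local_irq_save/restore (2.6.12) */

#define LPT_DATA   0x378            /* logger's parallel port data reg */
#define LPT_STATUS (LPT_DATA + 1)   /* status reg, wired to the target */

/* Returns the target's response time in CPU cycles for one pulse. */
static cycles_t measure_one_latency(void)
{
        unsigned long flags;
        unsigned char before;
        cycles_t t0, t1;

        local_irq_save(flags);       /* keep logger-side latencies out */

        before = inb(LPT_STATUS);

        t0 = get_cycles();
        outb(0x01, LPT_DATA);        /* raise the line: IRQ on the target */

        /* Busy-wait until the target's reply flips a status line.
         * A production version would bound this loop with a timeout. */
        while (inb(LPT_STATUS) == before)
                cpu_relax();

        t1 = get_cycles();
        outb(0x00, LPT_DATA);        /* drop the line for the next round */

        local_irq_restore(flags);

        return t1 - t0;              /* convert to us via the CPU clock rate */
}

A driver on the logger calls something like this repeatedly, converting
cycles to microseconds using the CPU clock rate, until well over a
million samples have been gathered.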

We stand corrected as to the method used to collect interrupt latency
measurements. Ingo's suggestion to disable all interrupts on the logger
while collecting the target's response does indeed mostly eliminate
logger-side latencies. However, we sporadically ran into situations
where the logger locked up, which never happened when we measured the
response using another interrupt. This occurred about 3 times in total
over all of our test runs (and that's a lot of test runs), so it isn't
systematic, but it did happen.
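
For each run, the logger boils those million-plus samples down to the
Aver/Max/Min/StdDev figures shown below. A minimal sketch of one way to
accumulate such statistics on the fly, without storing every sample
(the sum-of-squares approach and the names here are illustrative
assumptions, not necessarily what our logging code does):

/* Hypothetical running accumulation of Aver/Max/Min/StdDev over a
 * large number of latency samples, without storing them individually. */
#include <math.h>

struct lat_stats {
        unsigned long count;
        double        sum;       /* sum of samples         */
        double        sum_sq;    /* sum of squared samples */
        double        min, max;  /* extremes seen so far   */
};

static void lat_stats_init(struct lat_stats *s)
{
        s->count = 0;
        s->sum = s->sum_sq = 0.0;
        s->min = 1e9;
        s->max = 0.0;
}

static void lat_stats_add(struct lat_stats *s, double us)
{
        s->count++;
        s->sum    += us;
        s->sum_sq += us * us;
        if (us < s->min) s->min = us;
        if (us > s->max) s->max = us;
}

static double lat_stats_avg(const struct lat_stats *s)
{
        return s->sum / s->count;
}

static double lat_stats_stddev(const struct lat_stats *s)
{
        double mean = lat_stats_avg(s);
        /* population standard deviation */
        return sqrt(s->sum_sq / s->count - mean * mean);
}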

+--------------------+------------+------+-------+------+--------+
| Kernel             | sys load   | Aver | Max   | Min  | StdDev |
+====================+============+======+=======+======+========+
|                    | None       |  5.7 |  51.8 |  5.6 |  0.3   |
|                    | Ping       |  5.8 |  51.8 |  5.6 |  0.5   |
| Vanilla-2.6.12     | lm. + ping |  6.2 |  83.5 |  5.6 |  1.0   |
|                    | lmbench    |  6.0 |  57.6 |  5.6 |  0.9   |
|                    | lm. + hd   |  6.5 | 177.4 |  5.6 |  4.1   |
|                    | DoHell     |  6.9 | 525.4 |  5.6 |  5.2   |
+--------------------+------------+------+-------+------+--------+
|                    | None       |  5.7 |  47.5 |  5.7 |  0.2   |
|                    | Ping       |  7.0 |  63.4 |  5.7 |  1.6   |
| with RT-V0.7.50-35 | lm. + ping |  7.9 |  66.2 |  5.7 |  1.9   |
|                    | lmbench    |  7.4 |  51.8 |  5.7 |  1.4   |
|                    | lm. + hd   |  7.3 |  53.4 |  5.7 |  1.9   |
|                    | DoHell     |  7.9 |  59.1 |  5.7 |  1.8   |
+--------------------+------------+------+-------+------+--------+
|                    | None       |  7.1 |  50.4 |  5.7 |  0.2   |
|                    | Ping       |  7.3 |  47.6 |  5.7 |  0.4   |
|                    | lm. + ping |  7.7 |  50.4 |  5.7 |  0.8   |
|                    | lmbench    |  7.5 |  50.5 |  5.7 |  0.7   |
|                    | lm. + hd   |  7.5 |  51.8 |  5.7 |  0.7   |
|                    | DoHell     |  7.6 |  50.5 |  5.7 |  0.7   |
+--------------------+------------+------+-------+------+--------+

Legend:
None           = nothing special
Ping           = on host: "sudo ping -f $TARGET_IP_ADDR"
lm. + ping     = previous test and "make rerun" in lmbench-2.0.4/src/ on
                 target
lmbench        = "make rerun" in lmbench-2.0.4/src/ on target
lm. + hd       = previous test with the following being done on the target:
                 "while [ true ]
                     do dd if=/dev/zero of=/tmp/dummy count=512 bs=1m
                  done"
DoHell         = see:
http://marc.theaimsgroup.com/?l=linux-kernel&m=111947618802722&w=2

All latency figures (Aver, Max, Min, StdDev) are in microseconds.

We don't know whether we've hit the maximums Ingo alluded to, but we
did integrate his dohell script, and the only noticeable difference was
with vanilla Linux, where the maximum jumped to 525.4 microseconds.
Neither PREEMPT_RT nor I-PIPE exhibited such maximums under the same
load.


Overall analysis:
-----------------

We had not intended to do a 3rd run so soon, but we're glad we did,
given the doubts expressed by some on the LKML. As we suspected, these
new results very much corroborate what we had found earlier. As such,
our conclusions remain mostly unchanged:

- Both approaches yield similar results in terms of interrupt response
times. On most runs I-pipe seems to do slightly better on the maximums,
but there are cases where PREEMPT_RT does better. The results obtained
in this 3rd round corroborate the average latency extrapolated from
analyzing previous runs, but they contradict our earlier analysis of
the true maximum delays. Paul McKenney was right: the maximums are
indeed around 50us. Ingo was also right insofar as we found a maximum
for vanilla Linux that went all the way up to 525us. We stand corrected
on both these issues. Nevertheless, the above results are consistent
with previously published ones.

- In terms of system load, PREEMPT_RT is typically still costlier than
I-pipe, as was seen in the two previous runs. Please note that we've
done our best to use the very latest version of PREEMPT_RT available at
this time. We deliberately finished all the other tests before doing
the PREEMPT_RT runs, in order to make sure we were using the latest
possible release, as this was one of the main grievances expressed by
those supporting PREEMPT_RT.

For a more complete discussion of our conclusions please see our previous
test results publication:
http://marc.theaimsgroup.com/?l=linux-kernel&m=111928813818151&w=2

Again, we hope these results will help those interested in enhancing Linux
for use in real-time applications to better direct their efforts.

Kristian Benoit
Karim Yaghmour
--
Pushing Embedded and Real-Time Linux Systems Beyond the Limits
http://www.opersys.com || 1-866-677-4546
