Re: Help with the high res timers

The issue is not its length, but whether we can interrupt at its end. The key to high res is being on time. If we want a 100 usec resolution timer, we need to have the low res timer hit its mark with something real close to this.
Indeed.
As an aside, one nice thing with Nish's code is that the maximum soft-
timer latency is only the maximum interval between ticks. This avoids
accumulating error caused by lost ticks.

I think you are confusing the x86 problems with lost ticks with the wider world. Most archs have reasonable time resources and, while they may suffer from late ticks, do not therefore lose time. Inasmuch as you can find a way to eliminate time loss in the x86 due to lost ticks of the time base, you can also do so with the current time keeping system. The current code is just too dependent on catching every timer tick. It should use the other resources available to it to cover any missing ticks. (We do this rather nicely in the HRT patch, by the way.)

Now when the new code expires timers it does so against the timeofday
subsystem's notion of time instead of jiffies. It simply goes through
all the timer bucket entries between now and the last time we expired
timers.
This is not unlike what happens now. I would hope that the number of buckets visited averages to something real close to 1 per run_timers entry.
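To make the "walk every bucket between the last run and now" idea concrete, here is a minimal sketch. It is not Nish's code: the names (soft_timers, run_timers), the 1 ms interval, the bucket count, and the use of a per-bucket counter in place of a real timer list are all assumptions for illustration.

```c
#include <stdint.h>

#define INTERVAL_NS 1000000ULL  /* assumed timer-interval unit: 1 ms */
#define NBUCKETS    64u         /* assumed number of buckets */

struct soft_timers {
    uint64_t last_interval;     /* last interval index already expired */
    unsigned pending[NBUCKETS]; /* timers queued per bucket (stand-in for lists) */
};

/*
 * Expire against the time of day rather than a tick count: visit every
 * bucket between the last expired interval and the interval containing
 * now_ns. Returns the number of timers fired and stores the number of
 * buckets visited in *visited.
 */
static unsigned run_timers(struct soft_timers *st, uint64_t now_ns,
                           unsigned *visited)
{
    uint64_t now_interval = now_ns / INTERVAL_NS;
    unsigned fired = 0;

    *visited = 0;
    while (st->last_interval < now_interval) {
        unsigned *bucket;

        st->last_interval++;
        bucket = &st->pending[st->last_interval % NBUCKETS];
        fired += *bucket;       /* "run" everything in this bucket */
        *bucket = 0;
        (*visited)++;
    }
    return fired;
}
```

If run_timers is entered once per interval, each call visits one bucket. A late or lost tick shows up as several buckets visited in one call, and a bucket is never visited before its interval has been reached.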



Well, as long as the HZ period is close to the timer-interval unit
length, this is true. However if the timer-interval unit is smaller,
multiple bucket entries would be expired. The performance considerations
here are being looked at and this may be an area where the concepts in
HRT might help (having a HRT specific sub-bucket).

This is where we get in trouble with HR timers. For a HR timer, we need to know how to get a timer to expire (i.e. appear in the call back) at a well defined and precise time (leaving aside latency issues). The above description allows timers to be put in buckets without (as near as I can tell) making transparent exactly when the bucket will be emptied, only saying that it will be after the latest timer in the bucket is due.

Now an interesting point you bring up above is when to schedule timer
interrupts. One could just have a fine-grained timer-interval unit and
crank up HZ, but clearly we don't want to schedule interrupts to go off
too frequently or the overhead won't be worth it. Instead to get high-
res timers, the idea is to look at the timer list and schedule
interrupts appropriately. Then we only take an interrupt when there is
work to do, rather than at regular periodic intervals when there may or
may not be anything to expire.


This would suggest that all timers are high res.  I don't think this makes sense:

1) Most users just don't care that much about resolution, and high-res carries some overhead.

2) Not all platform hardware is able to handle high res.



Indeed, in our system all timers could be high res, but don't
necessarily need to be by default. It ends up being a function of how
finely-grained the timer-interval units are set to be and how
efficiently the hardware can be scheduled.

We currently ship HRT with a resolution of 1 usec. Let's agree that you don't want to even try to do this by adjusting the timer-interval.

I think high res timers should only be used when the user asks for them. This keeps the overhead under control.
Abstractly that makes sense, but I'm not sure how you mean "when the
user asks for them". Is this a runtime consideration, or is it compile
time?

Run time. The user goes through the POSIX clocks and timers interface and uses a separate high res clock. Only timers on high res clocks are high res.
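As a rough illustration of the run-time side, a program can ask a POSIX clock what resolution it advertises before deciding which clock to arm timers on. The sketch below uses only the standard clock_getres() call. The helper name clock_resolution_ns is made up, and CLOCK_MONOTONIC is only a stand-in, since the id of the separate high res clock is not given here.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/*
 * Return the resolution the given POSIX clock advertises, in
 * nanoseconds, or -1 if the clock id is not supported.
 */
static int64_t clock_resolution_ns(clockid_t clk)
{
    struct timespec res;

    if (clock_getres(clk, &res) != 0)
        return -1;
    return (int64_t)res.tv_sec * 1000000000LL + res.tv_nsec;
}
```

A caller would compare, say, clock_resolution_ns(CLOCK_MONOTONIC) against the resolution it needs, and only pay the high res overhead when it picks a clock that provides it.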

A small, measurable latency is ok and need not be backed out by the software. If you go this route you risk expiring a timer early, and the standard says bad things about this.
Since we expire based upon time instead of ticks, we can never expire
early.

Think of it this way. Decompose a HR timer into coarse and fine units (you choose, but here let's say jiffies and nanoseconds). Now we want the normal timer system to handle the jiffies part of the time and to turn the timer over to the HR timer code to take care of the nanosecond remainder. If the jiffy part is late, depending on the nanosecond part, it could make the timer late (i.e. for low values of the nanosecond part). For high values of the nanosecond part, we can compensate...

This decomposition makes a lot of sense, by the way, for at least the following reasons:

1) it keeps most of the HR issues out of the normal timer code,

2) it keeps high res and low res timers in the correct time order, i.e. a low res timer for jiffy X will expire prior to a high res timer for jiffy X + Y nanoseconds.

3) handling the high res timer list is made vastly easier as it will only need to have a rather small number of timers in it at any given time (i.e. those that are to expire prior to the next coarse timer tick).
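The decomposition above can be sketched in a few lines. This is only an illustration under assumed values, not the HRT patch code: HZ is taken to be 1000, and the names hr_expiry, hr_split and hr_before are invented for the example.

```c
#include <stdint.h>

#define HZ             1000ULL        /* assumed tick rate */
#define NSEC_PER_SEC   1000000000ULL
#define NSEC_PER_JIFFY (NSEC_PER_SEC / HZ)

struct hr_expiry {
    uint64_t jiffies;   /* coarse part, handled by the normal timer code */
    uint32_t sub_ns;    /* fine remainder, handled by the HR timer code */
};

/* Split an absolute expiry time in nanoseconds into coarse and fine parts. */
static struct hr_expiry hr_split(uint64_t expires_ns)
{
    struct hr_expiry e;

    e.jiffies = expires_ns / NSEC_PER_JIFFY;
    e.sub_ns  = (uint32_t)(expires_ns % NSEC_PER_JIFFY);
    return e;
}

/*
 * Ordering used in point 2: compare the jiffies first, then the
 * remainder. A low res timer is simply one with sub_ns == 0, so it
 * sorts before any high res timer in the same jiffy.
 */
static int hr_before(struct hr_expiry a, struct hr_expiry b)
{
    if (a.jiffies != b.jiffies)
        return a.jiffies < b.jiffies;
    return a.sub_ns < b.sub_ns;
}
```

With these assumed values, an expiry 2,500,000 ns out splits into 2 jiffies plus a 500,000 ns remainder. Only timers whose jiffies part has already come due ever reach the high res list, which is why that list stays short.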

What is missing in this is that the flagship arch (x86) has piss poor capability to schedule timer interrupts. I.e. there really is no commonly available timer to generate interrupts at some arbitrary time. There is, however, the PIT, which, if you don't mess with it, is very low overhead but periodic.
Then there is the issue of what the time standard is. In the x86, the PIT is it. The pm_timer is close, but the read problem adds so much overhead as to make it hard to justify. Possibly the HPET does a better job, but it is still an I/O access (read SLOW) and is not commonly available as yet.
With my rework, the time standard issue is a separate problem domain from
HRT. If the timeofday code cannot give correct time, it's a bug in that
subsystem. While your point is a fair critique of my timeofday work
(which I am trying to address), the clear interface between soft-timers
and timeofday allows for the soft-timer subsystem to not worry about
that.

Leaving aside timers, one issue I am trying to get you to address is that on most x86 machines the "clock" rock is only expressed via PIT interrupts. While we must express time with more resolution than this, we must also "use" that rock (i.e. PIT) to keep decent long term time.

You also have to do something reasonable for the accounting subsystems. Currently this is done by sampling the system at a periodic rate. Whatever is running gets charged with the last period. If you go non-periodic, either you have to charge a variable period when the sample is taken or you have to set up timers as part of context switching. The latter is not wise, as the context switch overhead then gets larger. It is rather easy to show that the accounting overhead in such a system with a modest load is higher than in a periodic sampled system. This system is also charged with doing the time slicing. Without a periodic tick, a timer needs to be set up on each context switch.
I've not looked at the accounting subsystem yet, but I'll try to dig in
and see what we can do here. Thanks for the heads up.


This is the main reason a tickless system is unwise. It is overload prone.

In addition to all of this, there is the issue of how to organize the timer list so that you can find the next timer. With the current organization this is something you just don't want to do. On the other hand, I am convinced that, for periodic timers, it is about as good as it gets.
You might have to go a bit more into detail on this last point.


First, let's agree that we need not be in love with any given timer list structure.

The issues to be addressed are:
1) Fast insert.
2) Fast removal prior to expiry (almost all timers never expire).
3) Fast look up and removal of timers to expire NOW.

If you want to find the next timer to expire at some time in the future:
4) Fast look up of the next timer.

The current cascade timer list does a very good job of 1, 2, and 3 but is not set up to make 4 easy or fast.
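To see why 1 through 3 are cheap and 4 is not in a bucketed design, here is a deliberately simplified single-level wheel. It is not the kernel's cascade code: the names, the 256-bucket size, and the restriction that every timer lies within one wheel revolution are assumptions made to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WHEEL_SIZE 256u         /* assumed bucket count */

struct timer {
    struct timer *prev, *next;
    uint64_t expires;           /* expiry time in ticks */
};

struct wheel {
    struct timer bucket[WHEEL_SIZE];  /* circular list heads */
    uint64_t now;                     /* current tick */
};

static void wheel_init(struct wheel *w, uint64_t now)
{
    for (unsigned i = 0; i < WHEEL_SIZE; i++)
        w->bucket[i].prev = w->bucket[i].next = &w->bucket[i];
    w->now = now;
}

/* 1) Insert: O(1). The timer must lie within one wheel revolution. */
static void wheel_add(struct wheel *w, struct timer *t)
{
    struct timer *head = &w->bucket[t->expires % WHEEL_SIZE];

    assert(t->expires >= w->now && t->expires - w->now < WHEEL_SIZE);
    t->prev = head->prev;
    t->next = head;
    head->prev->next = t;
    head->prev = t;
}

/* 2) Removal before expiry: O(1), no search needed. */
static void wheel_del(struct timer *t)
{
    t->prev->next = t->next;
    t->next->prev = t->prev;
    t->prev = t->next = t;
}

/* 3) Take one timer due at the current tick, or NULL if none: O(1). */
static struct timer *wheel_pop_due(struct wheel *w)
{
    struct timer *head = &w->bucket[w->now % WHEEL_SIZE];
    struct timer *t = head->next;

    if (t == head)
        return NULL;
    wheel_del(t);
    return t;
}

/* 4) Next expiry: has to scan buckets, O(WHEEL_SIZE) in the worst case. */
static int wheel_next_expiry(const struct wheel *w, uint64_t *out)
{
    for (uint64_t i = 0; i < WHEEL_SIZE; i++) {
        const struct timer *head = &w->bucket[(w->now + i) % WHEEL_SIZE];

        if (head->next != head) {
            *out = head->next->expires;
            return 1;
        }
    }
    return 0;
}
```

The first three operations only ever touch one bucket. Finding the next timer means walking empty buckets until one has an entry, and with a multi-level cascade the higher levels would have to be searched as well.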


One thing I'd like to emphasize is that while Nish's and my work does change
a large amount of code that collides with your code, we want to make
sure that what the HRT patch achieves is still possible. As I get more
familiar with the HRT code needs and you get more familiar with what
we're providing I hope things will work smoothly.

As do I.
--
George Anzinger   [email protected]
High-res-timers:  http://sourceforge.net/projects/high-res-timers/