[PATCH 0/4] new timeofday-based soft-timer subsystem

On 08.06.2005 [20:11:42 -0700], john stultz wrote:
> Hey Everyone,
> 	I'm heading out on vacation until Monday, so I'm just re-spinning my
> current tree for testing. If there's no major issues on Monday, I'll re-
> diff against Andrew's tree and re-submit the patches for inclusion.

Here is an update of my soft-timer rework to John's latest patches. I
have made some major changes in this revision. I would still greatly
appreciate any comments.

Changes:

	The timerinterval value of a soft-timer's expiry is never
	stored. Instead, we store the nanosecond request in a new member
	of struct timer_list, expires_nsecs. This makes the structure
	64 bits larger for now, but expires is deprecated along with the
	expires-style interfaces (add_timer(), mod_timer()), so once it
	is removed the long-term addition is 32 bits. Whenever we need
	the timerinterval value, we calculate it via a shift/and
	operation, which should be pretty quick.
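
	For illustration only, a minimal sketch of what such a shift/and
	calculation could look like. The helper name
	nsecs_to_timerintervals() and the 32-bit mask are assumptions
	made for this sketch, not necessarily what the patch uses;
	TIMERINTERVAL_BITS is the default of 19 from the design points.

```c
#include <stdint.h>

#define TIMERINTERVAL_BITS 19            /* default value in this patch set */
#define TIMERINTERVAL_MASK 0xffffffffULL /* assumed: interval kept to 32 bits */

/*
 * Hypothetical helper: derive the timerinterval from the stored
 * nanosecond value on demand, instead of storing it in the timer.
 */
static inline uint32_t nsecs_to_timerintervals(uint64_t nsecs)
{
	return (uint32_t)((nsecs >> TIMERINTERVAL_BITS) & TIMERINTERVAL_MASK);
}
```

	With these assumptions, one second (1,000,000,000 ns) maps to
	timerinterval 1907.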

Notes / Blocking Issues:

	No non-NEWTOD support yet, which means this patch isn't
	appropriate for any mainline inclusion yet. It should be a small
	patch to timeofday.h to emulate do_monotonic_clock() [the only
	soft-timer dependence on the timeofday code] with
	jiffies_to_nsecs(jiffies - INITIAL_JIFFIES). But I ran into some
	build issues with that change; I'll try to get a patch out soon,
	though. I was able to test my patch with this emulation by
	moving do_monotonic_clock() in ppc64, which is where I got the
	ppc64 benchmark numbers below.
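
	The emulation described above might look roughly like the
	following user-space sketch. Everything here is a stand-in: HZ
	is assumed to be 1000, the jiffies variable and
	jiffies_to_nsecs() are local mock-ups, and the function name
	emulated_monotonic_clock() is hypothetical.

```c
#include <stdint.h>

#define HZ 1000                     /* assumed tick rate for this sketch */
#define NSEC_PER_SEC 1000000000ULL
/* Same shape as the kernel's definition: start 5 minutes before wrap. */
#define INITIAL_JIFFIES ((unsigned long)(unsigned int)(-300 * HZ))

/* Stand-in for the kernel's tick counter. */
static unsigned long jiffies = INITIAL_JIFFIES;

/* Local mock-up: convert a tick count to nanoseconds. */
static uint64_t jiffies_to_nsecs(unsigned long j)
{
	return (uint64_t)j * (NSEC_PER_SEC / HZ);
}

/*
 * Hypothetical emulation of do_monotonic_clock(): nanoseconds since
 * boot, at jiffy resolution. The unsigned subtraction stays correct
 * across a jiffies wrap.
 */
static uint64_t emulated_monotonic_clock(void)
{
	return jiffies_to_nsecs(jiffies - INITIAL_JIFFIES);
}
```

	Note that such an emulation can only ever be as precise as one
	jiffy.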

	NUMA-Q is definitely broken with my patch, but not NUMA itself.
	Honestly not sure why, but timeofday seems to also be broken on
	NUMA-Q -- it sets up the TSC as the timesource, even though it
	shouldn't be (booting with notsc on NUMA-Qs seems to fix the
	problem for John's patches, at least).

Some design points:

1) The patches are small but do a lot.
a) Renames timer_jiffies to last_timer_time (now that we are not
	jiffies-based).
b) Converts the entries of the soft-timer time-vectors/buckets to
	timerinterval (a new unit) width, instead of jiffy width.
c) Defines timerintervals to be the current time as reported by the new
	timeofday-subsystem shifted down by TIMERINTERVAL_BITS bits.
	Thus, various pseudo-'human time' units can be emulated. The
	default value for TIMERINTERVAL_BITS is 19.
d) Uses do_monotonic_clock() (converted to timerintervals) as the basis
	for addition and expiration of timers instead of jiffies.
e) Adds some new helper functions for dealing with nanosecond values.
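
With TIMERINTERVAL_BITS at 19, one timerinterval is 2^19 ns = 524288 ns,
or roughly 0.52 ms. A rough sketch of how addition and expiration could
work on top of that follows. The helper names and the choice to round a
deadline up to the next interval boundary (so a timer never fires early)
are assumptions for illustration, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define TIMERINTERVAL_BITS 19
#define NSECS_PER_TIMERINTERVAL (1ULL << TIMERINTERVAL_BITS) /* 524288 ns */

/*
 * Hypothetical: map a nanosecond deadline to the first timerinterval
 * that starts at or after it (assumed round-up policy).
 */
static uint64_t deadline_to_timerinterval(uint64_t expires_nsecs)
{
	return (expires_nsecs + NSECS_PER_TIMERINTERVAL - 1) >> TIMERINTERVAL_BITS;
}

/*
 * Hypothetical expiry check: now_nsecs would come from
 * do_monotonic_clock(); the timer has expired once the current
 * interval has reached the deadline's interval.
 */
static bool timer_expired(uint64_t now_nsecs, uint64_t expires_nsecs)
{
	return (now_nsecs >> TIMERINTERVAL_BITS) >=
	       deadline_to_timerinterval(expires_nsecs);
}
```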

2) The reason for the re-work? Many people complain about all of the
places where 1 jiffy is added here or there to fix bugs. This new system
is fundamentally human-time oriented and deals with those issues
correctly and, more importantly, sanely :)

The code is reasonably well commented, but does expect readers to
understand the current soft-timer subsystem.

This is still an early version of this patch, so I expect criticism, and
am happy to make changes.

Benchmark differentials follow in this mail [1].

Overview:

1/4: Converts the soft-timer subsystem to use timerinterval as the units
of addition and expiration.

2/4: Converts, as an example, sys_nanosleep() to use the new interfaces
provided by patch 1. Example latency values are also below [2],
demonstrating improvements in the min, max and average latency for
sys_nanosleep() (which uses schedule_timeout_nsecs() internally with my
patch).

Thanks,
Nish

[1] Benchmark Differentials on various machines

x86_64, 4-way 1.7 GHz Opteron, 8GB RAM
			Elapsed	User	System	CPU
2.6.12-rc6		100%	100%	100%	100%
2.6.12-rc6-tod		99.67%	99.77%	99.44%	99.79%
2.6.12-rc6-tod-timer	99.8%	99.97%	99.73%	99.79%

non-numaq big x86, 16-way 1.4 GHz Xeon, 15GB RAM
			Elapsed	User	System	CPU
2.6.12-rc6		100%	100%	100%	100%
2.6.12-rc6-tod		99.95%	99.75%	99.51%	99.76%
2.6.12-rc6-tod-timer	100.88%	100.04%	99.55%	99.11%

small x86, 1-way 2.66 GHz P4, 512MB RAM
			Elapsed	User	System	CPU
2.6.12-rc6		100%	100%	100%	100%
2.6.12-rc6-tod		99.4%	102.87%	95.72%	100%
2.6.12-rc6-tod-timer	99.33%	102.69%	97.33%	100%


ppc64, 8-way 1.2GHz Power4, 12GB RAM
			Elapsed	User	System	CPU
2.6.12-rc6		100%	100%	100%	100%
2.6.12-rc6-tod		95.59%	100.04%	101.28%	104.81%
2.6.12-rc6-tod-timer	97.26%	100.04%	100.58%	102.91%

[2] Latency measured via sys_nanosleep(), 1,000,000 calls

			latencies in us, min/max/avg
Request		stock		tod		tod-timer
1000 us		2017/4004/3001	2011/4024/3001	1020/2531/1771
100 us		1022/3001/2000	1028/2995/2001	111/1619/866
1 us		1021/3004/2000	1032/3000/1997	11/1524/764
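
A measurement of this kind can be approximated from user space along the
following lines. This is only a sketch under assumptions: the actual
test program is not shown in this mail, the names latency_stats and
measure_nanosleep() are made up here, and interrupted sleeps are not
restarted.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

struct latency_stats {
	uint64_t min_us, max_us, avg_us;
};

/* Current CLOCK_MONOTONIC time in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Call nanosleep() 'calls' times with a request of 'request_us'
 * microseconds and record how long each call actually took.
 */
static struct latency_stats measure_nanosleep(long request_us,
					      unsigned long calls)
{
	struct timespec req = {
		.tv_sec = request_us / 1000000,
		.tv_nsec = (request_us % 1000000) * 1000,
	};
	struct latency_stats s = { UINT64_MAX, 0, 0 };
	uint64_t total_us = 0;

	for (unsigned long i = 0; i < calls; i++) {
		uint64_t start = now_ns();

		nanosleep(&req, NULL);
		uint64_t elapsed_us = (now_ns() - start) / 1000;

		if (elapsed_us < s.min_us)
			s.min_us = elapsed_us;
		if (elapsed_us > s.max_us)
			s.max_us = elapsed_us;
		total_us += elapsed_us;
	}
	if (calls)
		s.avg_us = total_us / calls;
	else
		s.min_us = 0;
	return s;
}
```

Calling measure_nanosleep(1000, 1000000) would correspond to the
"1000 us" row above, with the results printed as min/max/avg.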