This is a 2nd try with correct email address, sorry for the duplicates.
Am Sonntag, 12. August 2007 schrieb Ingo Molnar:
> Linus, please pull the latest scheduler git tree from:
Hello Ingo,
this is a followup to the discussion in
http://lkml.org/lkml/2007/7/19/538
Since 2.6.12, s390 already does precise accouting for system and user time.
Depending on CONFIG_VIRT_CPU_ACCOUNTING, we use two 64bit hardware timers on
s390: the first returns the wall clock time and is stepped even if the
virtual cpu is not backed by a physical cpu. The second timer is only
stepped, when the virtual cpu is backed by a physical cpu. The timers have a
very high accurancy, and the architecture guarantees that bit 51 is increased
by one/microsecond. We store both timers on each context switch, irq,
syscall, and machinecheck in entry.S. The calculation are made in
arch/s390/kernel/vtime.c in accouting_system_vtime and friends with
microsecond accurracy. This is also used for irq accouting (see the
definition of irq_enter). It basically boils down to precise numbers in the
cpu stat and the utime/stime for processes as well as knowledge about time
stolen by the hypervisor.
With CFS the accounting was changed, and everything is now based on
sum_exec_runtime. There is now an accounting regression on s390 (and maybe
ppc64), as the default jiffy implemenation does not know anything about
virtual cpus.
While looking for a solution, I started with a very quick hack and reverted
b27f03d4bdc145a09fb7b0c0e004b29f1ee555fa for the procfs related changes. If I
revert that commit, it seems that I get the old behaviour - but of course
this is just a hack.
I see some options now:
1. Jan could finish his sched_clock implementation for s390 and we would get
close to the precise numbers. This would also let CFS make better decisions.
Downside: its not as precise as before as we do some math on the numbers and
it will burn cycles to compute numbers we already have
(utime=sum*utime/stime).
2. set sum_exec_runtime based on the precise utime and stime. Dont know enough
about CFS if this would show different scheduling behaviour than 1
3. ifdef fs/proc/array.c depending on CONFIG_VIRT_CPU_ACCOUNTING. This will
save some cycles, and the numbers are precise to a microsecond. Downside: the
scheduler gets no information about virtual cpus and steal time so its
probably not completely fair
4. implement sched_clock AND reuse the exisiting utime and stime numbers.
5. other clever solutions I cannot see
Any suggestions?
Christian
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[Index of Archives]
[Kernel Newbies]
[Netfilter]
[Bugtraq]
[Photo]
[Stuff]
[Gimp]
[Yosemite News]
[MIPS Linux]
[ARM Linux]
[Linux Security]
[Linux RAID]
[Video 4 Linux]
[Linux for the blind]
[Linux Resources]