Re: PowerOP 0/3: System power operating point management API



Dominik Brodowski wrote:

> First, the table interface you suggest is ugly. If there's indeed the need for
> such an abstraction, I'd favour something like

I'm planning to adopt the previous suggestions of an opaque data
structure and stop trying to have any generic structure to it. I'll try
to leave dependency checking etc. to the upper layers as much as
possible, since platforms vary greatly in this and so do the needs of
different PM s/w stacks.
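
A minimal sketch of what an opaque operating point at this layer might look like. Every name here (struct powerop_point, powerop_set_point, powerop_get_point, and the two fields) is hypothetical and chosen only for illustration; none of it is taken from the patch. Upper layers would see only the forward declaration of the struct, while the platform code owns its contents. A static variable stands in for the hardware.

```c
#include <stddef.h>

/*
 * Hypothetical platform-private layout. Generic code would only see
 * "struct powerop_point;" and pass pointers around without looking inside.
 */
struct powerop_point {
    unsigned int cpu_khz;  /* assumed field, illustration only */
    unsigned int cpu_mv;   /* assumed field, illustration only */
};

/* Stand-in for the real hardware state in this sketch. */
static struct powerop_point current_point = { 600000, 844 };

/* Program the "hardware" with the given point; no dependency checking. */
int powerop_set_point(const struct powerop_point *point)
{
    if (point == NULL)
        return -1;
    current_point = *point;
    return 0;
}

/* Read back the operating point currently in effect. */
int powerop_get_point(struct powerop_point *point)
{
    if (point == NULL)
        return -1;
    *point = current_point;
    return 0;
}
```

Validation of whether a given point is sensible for the board is deliberately absent here, matching the idea that such checks belong to the upper layers.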

> Secondly, you do not address the cross-relationships between operation points
> correctly. If you change the CPU frequency, you may have to switch other
> (memory, video) settings; you might even have to validate the frequency
> settings for these or even additional reasons (thermal and battery reasons -

This lowest layer basically assumes that upper-layer software has
created an appropriate operating point (for example, in DPM we pretty
much require a system designer to create operating points that match the
h/w specs and don't go to great lengths to encode rules about this),
and/or will call driver notifiers etc. as needed to adapt to the
changes. Although there may be some sanity checking appropriate at the
PowerOP level, cpufreq, DPM, etc. can for the most part continue to
handle the larger issues of how valid operating points are constructed,
driver callbacks, etc. If you do want to handle various dependencies at
the PowerOP layer then there's nothing that prevents that, but PM
frameworks tend to embody assumptions about how frequently operating
points will change and in what contexts (interrupt, idle...), and this
can influence the code for such things.
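
One way to picture this split, as a rough sketch under assumptions: the upper layer owns the list of driver callbacks and brackets each hardware change with pre-change and post-change notifications, while the low-level call only touches the hardware. All names below (upper_register_notifier, upper_change_point, lowlevel_set_khz, count_events) are hypothetical, and a static variable stands in for the hardware.

```c
#include <stddef.h>

enum op_event { OP_PRECHANGE, OP_POSTCHANGE };
typedef void (*op_notifier_fn)(enum op_event event, void *data);

#define MAX_NOTIFIERS 8
static op_notifier_fn notifiers[MAX_NOTIFIERS];
static void *notifier_data[MAX_NOTIFIERS];
static size_t notifier_count;

/* Upper layer: drivers register interest in operating point changes. */
int upper_register_notifier(op_notifier_fn fn, void *data)
{
    if (fn == NULL || notifier_count >= MAX_NOTIFIERS)
        return -1;
    notifiers[notifier_count] = fn;
    notifier_data[notifier_count] = data;
    notifier_count++;
    return 0;
}

static void notify_all(enum op_event event)
{
    size_t i;

    for (i = 0; i < notifier_count; i++)
        notifiers[i](event, notifier_data[i]);
}

/* Stand-in for the lowest layer: it sets the hardware and nothing else. */
static unsigned int hw_cpu_khz = 600000;

static int lowlevel_set_khz(unsigned int khz)
{
    hw_cpu_khz = khz;
    return 0;
}

/* Upper layer: wrap the low-level change with driver notifications. */
int upper_change_point(unsigned int khz)
{
    int ret;

    notify_all(OP_PRECHANGE);
    ret = lowlevel_set_khz(khz);
    notify_all(OP_POSTCHANGE);
    return ret;
}

/* Example driver callback: counts how often each event was seen. */
static void count_events(enum op_event event, void *data)
{
    int *counts = data;

    counts[event]++;
}
```

Whether notifiers may be called at all depends on the context of the change (interrupt, idle, process), which is one reason to keep this policy out of the lowest layer.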

> Thirdly, who is to decide on the power management settings? The first and
> intuitive answer is the kernel. Therefore, kernel-space cpufreq governors
> exist. Only under rare circumstances, you want full userspace control --
> that's what the userspace cpufreq governor is for.

Also something left to the existing upper layers; PowerOP isn't intended
to handle any of that. In the embedded space we usually let the system
designer choose operating points supported by their h/w vendor and that
match their particular system states (hardware enabled at any point in
time, type and power/performance needs of software currently running).
We do recommend that a userspace power policy manager be the component
in charge of PM settings, based on messages from drivers and other apps
on the state of the system. And so that userspace component activates
the operating point (or set of operating points in the case of DPM)
appropriate for current state.

> Fourthly, the code duplication which your implementation leads to is obvious
> for the speedstep-centrino case.

We could move the tables of valid cpu speeds and corresponding voltages
down to the PowerOP level, and there would probably be little
duplication at that point (in fact, with the current patch there's not a
lot of duplication since the actual MSR access was moved to PowerOP and
PowerOP contains little else, but both levels know how to understand the
MSR format, and a more aggressive port to PowerOP could do away with that).
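
To illustrate what a single home for the table and the MSR format could look like, here is a small sketch. The encoding is an assumption based on a reading of the table macro in the speedstep-centrino driver for Banias parts: bus ratio in bits 15:8 with a 100 MHz bus clock, voltage ID in bits 7:0 with voltage = 700 mV + 16 mV * VID. It should be checked against the driver and the processor documentation. The struct, the function names, and the table entries are hypothetical and for illustration only.

```c
#include <stddef.h>

/* One CPU speed/voltage pair, in MHz and millivolts. */
struct cpu_op {
    unsigned int mhz;
    unsigned int mv;
};

/* Illustrative entries only; a real table comes from the h/w vendor. */
static const struct cpu_op example_ops[] = {
    { 600,  844 },
    { 800,  988 },
    { 900, 1004 },
};

#define NUM_EXAMPLE_OPS (sizeof(example_ops) / sizeof(example_ops[0]))

/* Encode: ratio (mhz / 100) in bits 15:8, VID ((mv - 700) / 16) in bits 7:0. */
unsigned int cpu_op_to_msr(const struct cpu_op *op)
{
    return ((op->mhz / 100) << 8) | ((op->mv - 700) / 16);
}

/* Decode the same assumed format back into MHz and millivolts. */
struct cpu_op msr_to_cpu_op(unsigned int msr)
{
    struct cpu_op op;

    op.mhz = ((msr >> 8) & 0xff) * 100;
    op.mv = (msr & 0xff) * 16 + 700;
    return op;
}

/* Look up a table entry by speed; returns NULL if the speed is not listed. */
const struct cpu_op *find_cpu_op(unsigned int mhz)
{
    size_t i;

    for (i = 0; i < NUM_EXAMPLE_OPS; i++)
        if (example_ops[i].mhz == mhz)
            return &example_ops[i];
    return NULL;
}
```

With the encode and decode helpers in one place, an upper layer such as cpufreq would only need to ask for a speed and would not have to understand the register layout itself.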
Your suggestions of changes to cpufreq governors and policies to handle
governance of non-cpu-speed parameters sound interesting, and I'd be
happy to help figure out what to do about those vs. the lower machine
access layer I've discussed up until now. I'll think more about this
real soon now. Thanks,