Re: Dependent CPU core speed reporting not updated withCPUFREQ_SHARED_TYPE_HW?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Sat, Jun 02, 2007 at 07:19:03AM -0700, Pallipadi, Venkatesh wrote:

> The problem here is that with hardware coordination, kernel need not
> do what we do for affected_cpus today. Kernel can manage each CPU
> independently in terms of setting freq as underlying hardware
> guarantees to do the coordination (picking up the highest freq
> among a group of dependent cpus). So ideally we can just manage cpu
> frequencies as we do today without affected_cpus. But, in this case
> there is a fyi from hardware which says even though OS is thinking that
> CPUs are independent, hardware is doing the coordination across these
> CPUs.

Yes, and ... it appears that (at least on Intel CPUs), writing
IA32_PERF_CTL on any core in the package causes the speed of all CPUs to
be set to the max of all CPU cores.  Here's what I see when
reading/writing the performance control MSRs:

This is with all CPUs set to 2.6GHz.  0x198 = IA32_PERF_STATUS, 0x199 =
IA32_PERF_CTL.

root@elm3a188:/mnt/src/msr/msr-20060208# ./msr -n 0x198 0x199
CPU 0:
   0x00000198 = 0x0828082806000828
   0x00000199 = 0x0000000000000828
CPU 1:
   0x00000198 = 0x0828082806000828
   0x00000199 = 0x0000000000000828

Now, we set scaling_max_freq of CPU 0 to 2GHz:
root@elm3a188:/mnt/src/msr/msr-20060208# ./msr -n 0x198 0x199
CPU 0:
   0x00000198 = 0x0828082806000828
   0x00000199 = 0x000000000000061a
CPU 1:
   0x00000198 = 0x0828082806000828
   0x00000199 = 0x0000000000000828

And now likewise for CPU 1:
root@elm3a188:/mnt/src/msr/msr-20060208# ./msr -n 0x198 0x199
CPU 0:
   0x00000198 = 0x082808280600061a
   0x00000199 = 0x000000000000061a
CPU 1:
   0x00000198 = 0x082808280600061a
   0x00000199 = 0x000000000000061a

Notice that we've written the slower speed to the control register but
the status register says that we're still running at the higher speed.
This seems to corroborate my finding that the power use does not drop
until _both_ cores are lowered.  Unfortunately, in that middle step we
are reporting an incorrect frequency for CPU 0--sysfs says 2GHz but the
hardware itself says 2.6.  This is clearly a bad thing, because I just
set scaling_max_cpufreq on CPU0 to 2GHz, yet it is running 667MHz faster
than it ought to be because CPU1 wants to go faster.

How about this: When hardware coordination is specified in _PSD,
scaling_{min,max}_freq between CPUs in a cpufreq domain are tied
together so that we can be sure that our caps are being followed.
Requests to change speed can be done as they always have, but
afterwards the value of scaling_cur_freq for all CPUs in the cpufreq
domain  will be determined by reading the speed value from hardware
since we can't really be sure how the hardware decided to coordinate
things anyway.  When it becomes the case that individual cores on a
package can run at different speeds, we can drop the _PSD entries.  Does
this scheme sound reasonable?

We might, however, want another sysfs file to tell userspace what kind
of coordination is taking place.

--D

Attachment: signature.asc
Description: Digital signature


[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux