Re: Industry db benchmark result on recent 2.6 kernels

* Paul Jackson <[email protected]> wrote:

> Nick wrote:
> > Ingo had a cool patch to estimate dirty => dirty cacheline transfer latency
> > ... Unfortunately ... and it is an O(cpus^2) operation.
> 
> Yes - a cool patch.
> 
> If we had an arch-specific bit of code, that for any two cpus, could 
> give a 'pseudo-distance' between them, where the only real 
> requirements were that (1) if two pairs of cpus had the same 
> pseudo-distance, then that meant they had the same size, layout, kind 
> and speed of bus and cache hardware between them (*), and (2) it was 
> cheap - hardly more than a few lines of code and a subroutine call to 
> obtain, then Ingo's code could be:

yeah. The search can be limited quite drastically if all duplicate 
constellations of CPUs (which is a function of the topology) are only 
measured once.
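
A minimal sketch of that 'measure each constellation once' idea, in Python rather than kernel C for brevity. The topology, `toy_pseudo_distance()` and `fake_measure()` are all made up for illustration; the real patch and any arch hook would look different. The pair loop is still O(cpus^2), but it only makes the cheap pseudo-distance call for every pair, while the expensive measurement runs once per distinct distance:

```python
from itertools import combinations

def toy_pseudo_distance(a, b):
    # Hypothetical topology: CPUs paired into cores, four CPUs per node.
    if a // 2 == b // 2:
        return 1   # same core
    if a // 4 == b // 4:
        return 2   # same node, different core
    return 3       # different node

def measure_once_per_class(cpus, pseudo_distance, measure):
    """Run the expensive measurement on one representative pair per distance."""
    costs = {}
    for a, b in combinations(cpus, 2):
        d = pseudo_distance(a, b)      # assumed cheap
        if d not in costs:
            costs[d] = measure(a, b)   # expensive, done once per class
    return costs

calls = []

def fake_measure(a, b):
    # Stand-in for the real cacheline-transfer measurement.
    calls.append((a, b))
    return 0.0

costs = measure_once_per_class(range(8), toy_pseudo_distance, fake_measure)
print(len(calls), "measurements instead of", 8 * 7 // 2)
```

On this toy 8-CPU layout, three pairs get measured instead of all 28.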

but can the 'pseudo-distance' be calculated accurately enough? If it's a 
scalar, how do you make sure that unique paths for data to flow have 
different distances? The danger is 'false sharing' in the following 
scenario: let's say CPUs #1 and #2 are connected via hardware H1,H2,H3, 
CPUs #3 and #4 are connected via H4,H5,H6. Each hardware component is 
unique and has different characteristics. (e.g. this scenario can happen 
when different speed CPUs are mixed into the same system - or if there 
is some bus asymmetry)

It has to be made sure that H1+H2+H3 != H4+H5+H6, otherwise false 
sharing will happen. For that 'uniqueness of sum' to be guaranteed, one 
has to assign power-of-two values to each separate type of hardware 
component.

[ or one has to assign very accurate 'distance' values to hardware 
components. (adding another source of errors - i.e. false sharing of 
the migration value) ]
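
To make the collision concrete, here is a small sketch with invented per-component values (the numbers are assumptions, chosen only so that the sums coincide). It first shows two different paths ending up with the same scalar distance, then the power-of-two assignment, where each component gets its own bit and the sum becomes a unique mask:

```python
# Hypothetical scalar 'distance' per hardware component.
scalar = {"H1": 1, "H2": 2, "H3": 6, "H4": 2, "H5": 3, "H6": 4}

path_a = ["H1", "H2", "H3"]   # CPUs #1 <-> #2
path_b = ["H4", "H5", "H6"]   # CPUs #3 <-> #4

sum_a = sum(scalar[h] for h in path_a)
sum_b = sum(scalar[h] for h in path_b)
print("scalar sums:", sum_a, sum_b)   # equal, so the paths get falsely shared

# Power-of-two assignment: one bit per distinct component.
bit = {name: 1 << i for i, name in enumerate(sorted(scalar))}

def path_id(path):
    # With every component appearing at most once, the sum is a bitmask.
    return sum(bit[h] for h in path)

id_a = path_id(path_a)
id_b = path_id(path_b)
print("bitmask ids:", bin(id_a), bin(id_b))   # distinct
```

Note that the bitmask only identifies the path. It says nothing about how 'far apart' the CPUs are, so it is useful as a key for the measured value, not as a metric.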

and even the power-of-two assignment method has its limitations: it 
obviously runs out at 32/64 components (i'm not sure we can do that), 
and if a given component type can be present in the same path _twice_, 
that component will have to take two bits.
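
A sketch of both limitations, assuming a 64-bit word and a fixed number of bits per component type (`encode_path()` and its parameters are hypothetical names, not anything from the patch). With one bit per type, a component that shows up twice carries over into its neighbour's bit; giving each type two bits fixes that but halves how many types fit:

```python
WORD_BITS = 64

def encode_path(path, types, bits_per_type):
    """Sum one fixed-width field increment per component on the path."""
    if bits_per_type * len(types) > WORD_BITS:
        raise OverflowError("too many component types for one word")
    value = 0
    for h in path:
        value += 1 << (types.index(h) * bits_per_type)
    return value

types = ["H1", "H2", "H3"]

# One bit per type: H1 twice carries into H2's bit and collides with it.
one_bit_twice = encode_path(["H1", "H1"], types, 1)
one_bit_other = encode_path(["H2"], types, 1)

# Two bits per type: H1 can be counted twice without touching H2's field.
two_bit_twice = encode_path(["H1", "H1"], types, 2)
two_bit_other = encode_path(["H2"], types, 2)

max_types_two_bits = WORD_BITS // 2
print(one_bit_twice == one_bit_other, two_bit_twice == two_bit_other,
      max_types_two_bits)
```

With two bits per type only 32 types fit into the word, which is the kind of limit that might be regretted later.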

or is the 'at most 64 different hardware component types' limit ok? (it 
feels like a limit we might regret later.)

	Ingo