Re: Why preallocate pmd in x86 32-bit PAE?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Linus Torvalds wrote:
> On Fri, 16 Nov 2007, Jeremy Fitzhardinge wrote:
>   
>>> IIRC, the present bit is ignored in the magic 4-entry PGD.  All entries 
>>> have to be present.
>>>       
>> Hm, do you recall what processors that might affect?  As far as I know,
>> current processors will ignore non-present top-level entries.
>>     
>
> Are you sure?
>   

3.8.5 in vol 3a "Page-Directory and Page-Table Entries With Extended
Addressing Enabled":

    The present flag (bit 0) in the page-directory-pointer-table entries
    can be set to 0 or 1. If the present flag is clear, the remaining
    bits in the page-directory-pointer-table entry are available to the
    operating system. If the present flag is set, the fields of the
    page-directory-pointer-table entry are defined in Figures 3-20 for
    4-KByte pages and Figures 3-21 for 2-MByte pages.

So I would assume this works on all current CPUs, but I can imagine that
some older/off-brand processors might get it wrong.

> Anyway, this is not worth making a distinction for. Just pre-allocate all 
> of them. There really is just 4 PGD entries, and it really *is* different 
> from having a full three-level page table, and of the four PGD entries:
>
>  - one is used for the kernel mapping (assuming the regular 1:3 layout)
>  - AT LEAST two are required by user space anyway
>
> so pre-allocating is never going to waste more than one page.
>   

Yeah, I'm not so concerned about memory saving; I don't think there
would be any in practice.

> And you may feel that pre-allocating is a special case, but it's an 
> *easier* special case than the one that you are apparently thinking about 
> (which is to special-case according to CPU version).
>   

I'm hoping to avoid special-casing anything, if I can help it, aside
from the normal 32/64-bit 2/3/4-level parameterising of the various
pagetable accessors.

> So don't do it. Just preallocate for the magic 4-entry PGD. You can make 
> the special case just be something like
>
> 	/* Preallocate for small PGD's */
> 	#if PTRS_PER_PGD == 4
> 		for (i = 0; i < USER_PTRS_PER_PGD; i++) {
> 			pmd_t *pmd = pmd_alloc();
> 			set_pgd(pgd+i, __pgd(PAGE_PRESENT | __pa(pmd));
> 		}	
> 	#endif
>
> or similar. 
>
> There is absolutely *zero* reason not to do this, and there is also zero 
> reason to make this be a "32-bit vs 64-bit" issue. The code can be there 
> in both, and the #if could even be all in C code (ie there may be reasons 
> to prefer writing it as
>
> 	/* The old-style PAE PGD needs to be preallocated */
> 	if (USER_PTRS_PER_PGD <= 4) {
> 		...
> 	}
>
> and the compiler should even compile it away entirely for all practical 
> cases even without using the preprocessor.
>   

Perhaps.  And there's the corresponding difference between 32 and 64 bit
on freeing a pagetable; 32-bit assumes the pgd destructor will free the
pmd, whereas 64-bit does it separately.  Even in the current 32-bit
code, there's separate handling for PAE and non-PAE.  I think it can all
be collapsed down in a reasonable way.

>> That just means we need to reload cr3 after populating the pgd with a
>> new pmd, right?
>>     
>
> BUT ONLY FOR THIS CASE!
>
> And if you preallocate it, you make *that* special case go away. 
>   

Yes, that is a bit awkward; it means that 32-bit PAE would need a
speparate pgd_populate.  But that seems like a smaller change than 1)
making 32-bit PAE pgd-alloc preallocate the pmd, and 2) making pmd_free
noop on 32-bit PAE, and 3) making pgd_free free the preallocated pmd. 
Perhaps 2 & 3 aren't necessary and can be the same as 64-bit.

I'll need to look into it more carefully.

> So you're going to have special cases regardless. Do the simple and 
> really straightforward one, please! Nothing subtle.
>   

Yep, absolutely.

    J
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[Index of Archives]     [Kernel Newbies]     [Netfilter]     [Bugtraq]     [Photo]     [Stuff]     [Gimp]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Video 4 Linux]     [Linux for the blind]     [Linux Resources]
  Powered by Linux