Andrew Morton wrote:
> Doh. Too many Zachs.
>
> Begin forwarded message:
>
> Date: Sat, 4 Mar 2006 18:43:19 -0800
> From: Andrew Morton <[email protected]>
> To: Edgar Hucek <[email protected]>
> Cc: [email protected], "Zach, Yoav" <[email protected]>, Matt Domsch <[email protected]>
> Subject: Re: [PATCH 1/1] EFI: Fix gdt load
>
> Edgar Hucek <[email protected]> wrote:
> > This patch makes the kernel bootable again on ia32 EFI systems.
>
> Argh, thanks. I'll move the per_cpu() call inside the lock, just in case
> we happen to be running preemptibly there.
>
> Zach, Matt: please review, test and ack asap?
Ok, that was subtle. It took me 10 minutes of staring at this code to
notice the extra __pa and __va in the load_gdt call. By sheer
coincidence, the first one was still correct. Normally, this code would
just totally blow up, but at that point you've just identity mapped
virtual and physical addresses. The second one will blow up after the
EFI call without the fix.
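
To make the __pa/__va point concrete, here is a minimal userspace sketch
of why one conversion too many goes unnoticed while the identity mapping
is in place and fails once it is gone. It is not the code from the
patch. PAGE_OFFSET is assumed to be the usual 3G/1G split, and
sketch_pa(), sketch_va() and check_round_trip() are hypothetical
stand-ins for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: classic ia32 3G/1G split, lowmem mapped at PAGE_OFFSET. */
#define PAGE_OFFSET 0xC0000000u

/* Simplified stand-ins for the kernel's __pa() and __va() macros. */
uint32_t sketch_pa(uint32_t vaddr) { return vaddr - PAGE_OFFSET; }
uint32_t sketch_va(uint32_t paddr) { return paddr + PAGE_OFFSET; }

/* Converting to physical and back must return the original address. */
void check_round_trip(uint32_t vaddr)
{
    assert(sketch_va(sketch_pa(vaddr)) == vaddr);
}
```

With a GDT at kernel virtual address 0xC0123000, sketch_pa() yields
0x00123000, which is only reachable while low memory is identity mapped.
Applying sketch_va() to an address that is already virtual wraps around
to 0x80123000, which points at nothing useful.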
Unfortunately, I can't test EFI; I have no machines here that are EFI
capable.
This code has always confused me, though. Why do we do this crazy hack
to begin with? The crazy hack is not remapping the GDT in physical
space, or simulating non-paging memory with paging enabled - that is
completely normal. But why do we muck with the GDT for CPU zero instead
of the current CPU? If the EFI code decides to reload FS or GS, we have
now leaked the user FS or GS from CPU zero onto the current CPU, and I
see no code here which restricts EFI to run on the BSP. This will break
userspace TLS programs. Of course, I have no evidence that EFI will
reload FS or GS, but it must be doing something with segmentation, or
you would not have needed to reload the GDT.
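
The leak described above can be modelled with a small sketch. Everything
here is hypothetical and heavily simplified: struct sketch_gdt keeps
only the base of one TLS slot, and the helper names are invented for
illustration rather than taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 2

/* Hypothetical per-CPU GDT reduced to the base of a single TLS slot. */
struct sketch_gdt { uint32_t tls_base; };

struct sketch_gdt gdt_of[NR_CPUS];

/* A segment reload picks up the base from whichever GDT is loaded. */
uint32_t reload_fs(const struct sketch_gdt *loaded)
{
    return loaded->tls_base;
}

/* Pattern questioned in the mail: always hand out CPU 0's table. */
const struct sketch_gdt *gdt_for_efi_cpu0(int cpu)
{
    (void)cpu;              /* the calling CPU is ignored */
    return &gdt_of[0];
}

/* Alternative: hand out the table of the CPU making the call. */
const struct sketch_gdt *gdt_for_efi_current(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return &gdt_of[cpu];
}
```

If CPU 1 makes the EFI call with the table from gdt_for_efi_cpu0()
loaded, any reload of FS or GS during the call sees CPU 0's TLS base
instead of its own.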
Second, there is another bug in this code. Why do we care whether PSE
is enabled when identity mapping virtual to physical space? PSE has
_nothing_ to do with this. You are copying top level page ranges, which
are the same size, with or without PSE. We should be checking whether
PAE is enabled, and we shouldn't even need to check at runtime, since
PAE will either be compiled in or not. This code is just lucky, scarily
so, that the kernel is small enough to fit.
For PAE mode, PSE is always going to be enabled (I believe), so you end
up remapping 1GB of virtual space into physical space. For non-PAE, PSE
may or may not be enabled, in which case, you end up remapping either
4MB or 8MB of the kernel virtual address space back at zero.
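
The sizes quoted above follow from the ia32 page-table geometry with
4KB pages. A minimal sketch of the arithmetic, with hypothetical
function names, might look like this. Note that PSE does not appear
anywhere in the calculation; only PAE and the number of copied entries
matter.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL

/* Address space covered by one top-level page-table entry on ia32. */
uint64_t top_level_entry_span(int pae)
{
    if (pae)
        /* PAE: 512 PTEs per table, 512 tables per top-level entry. */
        return SKETCH_PAGE_SIZE * 512 * 512;
    /* non-PAE: each page-directory entry covers 1024 PTEs. */
    return SKETCH_PAGE_SIZE * 1024;
}

/* Span aliased at address zero when `entries` entries are copied. */
uint64_t identity_mapped_span(int pae, unsigned entries)
{
    assert(entries > 0);
    return entries * top_level_entry_span(pae);
}
```

One copied entry gives 4MB without PAE and 1GB with PAE; two copied
entries without PAE give the 8MB case.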
I don't believe 4MB is enough to make sure all of the per-cpu variables
can be safely referenced, although I could be wrong. So if there are
EFI machines out there with processors installed that have no PSE
support, and the kernel gets large enough, this code blows up again. I
actually think that is quite likely as EFI becomes more prevalent and
older core processors continue to be made for the embedded market.
Zach