Re: [Fastboot] [PATCH 03/03] kexec: Avoid overwriting the current pgd (V2, x86_64)

Magnus Damm <[email protected]> writes:

> Let me explain the theory behind my kexec patch:

Please, the next time you submit a patch, try to keep what
you are changing sufficiently clear that a followup email
is not needed to explain the code.

> During prepare, page_table_a is initialized. For x86_64 page_table_b is
> also initialized.
>
> When it is time for machine_kexec() the following sequence takes place:
>
> 1. The C-code in machine_kexec() jumps to the assembly routine.
> To avoid overwriting the page table and still work with NX bits we jump
> to the original location of relocate_new_kernel. This is different from
> the unpatched code which jumps to the physical address (ie the identity
> mapped location).
>
> 2. The assembly code switches to page_table_a.
> page_table_a has mapped the control page at two virtual addresses:
> - The same virtual address as relocate_new_kernel is located at.
>   (This explains the extra alignment in the assembly file)
> - An identity mapping, ie virtual address == physical address.
> After the switch the code runs at the same virtual address, but the
> physical page is now the control page.

Sorry, this is broken.  The code location you are running from when you
perform the switch from one page table to another must be mapped at
the same location in both page tables.  Otherwise it is undefined what
the processor will do.

So you need at least one additional entry in your page table.
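
To make the requirement concrete, here is a toy model in C.  It is
only a sketch and not kernel code: the flat mapping tables, the names
(toy_translate, switch_is_safe) and the example addresses are all made
up for illustration.  It checks the rule stated above, that the address
you are executing from must translate to the same physical location
before and after the page table switch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define NOT_MAPPED UINT64_MAX

/* Hypothetical toy page table: a flat list of vpage -> ppage entries. */
struct toy_mapping {
    uint64_t vpage;   /* virtual page number */
    uint64_t ppage;   /* physical page number */
};

struct toy_page_table {
    const struct toy_mapping *entries;
    size_t count;
};

/* Translate a virtual address, or return NOT_MAPPED. */
static uint64_t toy_translate(const struct toy_page_table *pt, uint64_t vaddr)
{
    uint64_t vpage = vaddr >> PAGE_SHIFT;
    uint64_t offset = vaddr & ((1ULL << PAGE_SHIFT) - 1);

    for (size_t i = 0; i < pt->count; i++) {
        if (pt->entries[i].vpage == vpage)
            return (pt->entries[i].ppage << PAGE_SHIFT) | offset;
    }
    return NOT_MAPPED;
}

/* The switch is safe only if the executing address is mapped in both
 * tables and lands on the same physical address in each. */
static bool switch_is_safe(const struct toy_page_table *old_pt,
                           const struct toy_page_table *new_pt,
                           uint64_t switch_vaddr)
{
    uint64_t old_pa = toy_translate(old_pt, switch_vaddr);
    uint64_t new_pa = toy_translate(new_pt, switch_vaddr);

    return old_pa != NOT_MAPPED && old_pa == new_pa;
}

/* Example addresses (assumptions): the switching code lives at
 * SWITCH_VADDR, backed by physical page 0x100; the control page is
 * physical page 0x5000. */
#define SWITCH_VADDR 0xffffffff80100000ULL

static const struct toy_mapping kernel_entries[] = {
    { 0xffffffff80100ULL, 0x100ULL },
};
static const struct toy_page_table kernel_pt = { kernel_entries, 1 };

/* page_table_a as described: same virtual address, but now backed by
 * the control page, plus an identity mapping of the control page. */
static const struct toy_mapping described_entries[] = {
    { 0xffffffff80100ULL, 0x5000ULL },
    { 0x5000ULL, 0x5000ULL },
};
static const struct toy_page_table table_a_as_described = { described_entries, 2 };

/* A table that keeps the switching page mapped where it already was. */
static const struct toy_mapping same_entries[] = {
    { 0xffffffff80100ULL, 0x100ULL },
    { 0x5000ULL, 0x5000ULL },
};
static const struct toy_page_table table_a_same_mapping = { same_entries, 2 };
```

With these example tables, switch_is_safe(&kernel_pt,
&table_a_as_described, SWITCH_VADDR) is false, while the same call with
table_a_same_mapping is true.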

The fact that this bug did not jump out is a clear sign you were
changing too many things at once, and did not have an adequate
explanation in your change log.

> 3. Setup idt, gdt, segment registers and stack pointer.
> The stack pointer should point to the identity mapped page.
>
> 4. Jump to the identity mapped address.
> When the jump is performed we will be running at a virtual address which
> is the same as the physical address.

After the jump is performed?

> 5. Turn off MMU (i386) / switch to page_table_b (x86_64).
> We are able to turn off the MMU or switch to page_table_b because we are
> already running at the physical address.
>
> 6. Proceed with the page copying business as usual...
>
> Ok, so far so good.
>
> The fun part begins when we throw in Xen into the mix. Linux under xen
> runs with pseudophysical addresses, ie what Linux thinks are physical
> addresses are not physical. Xen uses the term machine addresses for
> addresses that are called physical addresses under "regular" Linux. On top
> of that, Xen uses a different memory map than Linux.
>
> After prepare, all pages in page_table_a are passed to the hypervisor
> that overwrites the contents filled in by machine_prepare(). 
> (this explains the "ridiculous" array of struct page *)

Actually it doesn't really, because I don't have enough information
to infer which Xen call you are using or why it is sane.  Certainly
Xen does not need struct page * pointers; if it does, Xen is too
tightly coupled with Linux to be sane.

> A similar two-page mapping is used here too, but in the xen case we
> use a different virtual address (the non-identity mapped address)
> compared to "regular" Linux. All to fit the address space used by xen.
>
> The xen port which is based on my patches is using a sequence similar to
> "regular" Linux:
>
> 1a. The C-code in xen_machine_kexec() performs a hypercall.
>
> 1b. The hypervisor jumps to the assembly code.
> After prepare we've created a NX-safe mapping for the control page. We
> jump to that NX-safe address to transfer control to the assembly
> code.

I assume this is a Xen call with the semantics: 
switch page tables and jump to location X in the new page tables.

I assume Xen is still running at this point?

> Goto 2 above.
>
> So, to answer your question regarding two page table copies. You may be
> right that it can be made to work with just one page table, but I feel my
> two table approach is nice because it will always work - regardless of
> the memory map used.

Except that the memory map in Linux is fixed.  The x86_64 kernel will
run with negative addresses and physical addresses will remain
positive until a decade or two from now when we get 64-bit physical
addresses.

Unless Linux runs with a different memory map when running under Xen.
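
A small C sketch of the "negative versus positive" point, under the
assumption of the native (non-Xen) layout.  The function names are
hypothetical, and 0xffffffff80000000 is used only as an example of a
kernel-half address.  "Negative" here just means bit 63 is set when
the address is read as a signed 64-bit value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the address has bit 63 set, i.e. it is negative when viewed
 * as a signed 64-bit two's-complement value. */
static bool in_kernel_half(uint64_t addr)
{
    return (addr >> 63) != 0;
}

/* An identity mapping places a physical address at the equal virtual
 * address.  It can only collide with kernel virtual addresses if the
 * physical address itself has bit 63 set. */
static bool identity_map_collides_with_kernel(uint64_t phys_addr)
{
    return in_kernel_half(phys_addr);
}
```

For example, in_kernel_half(0xffffffff80000000ULL) is true, while
identity_map_collides_with_kernel() is false for any physical address
below 2^63, so in this model an identity mapping and the kernel
mappings can share one page table.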

Eric
