Re: [linux-pm] Re: Hibernation considerations

On Jul 19, 2007, at 12:31 PM, [email protected] wrote:

On Thu, 19 Jul 2007, Milton Miller wrote:
(2) Upon start-up (by which I mean what happens after the user has pressed
     the power button or something like that):
   * check if the image is present (and valid) _without_ enabling ACPI (we
     don't do that now, but I see no reason for not doing it in the new
     framework)
   * if the image is present (and valid), load it
   * turn on ACPI (unless already turned on by the BIOS, that is)
   * execute the _BFS global control method
   * execute the _WAK global control method
   * continue
   Here, the first two things should be done by the image-loading kernel, but
   the remaining operations have to be carried out by the restored kernel.
Here I agree.
Here is my proposal. Instead of trying to both write the image and suspend, I think this all becomes much simpler if we limit the scope of the work of the second kernel. Its purpose is to write the image. After that it's done. The platform can be powered off if we are going to S5. However, to support suspend to ram and suspend to disk, we return to the first kernel.
This means that the first kernel will need to know why it got resumed. Was the system powered off, and this is the resume from the user? Or was it restarted because the image has been saved, and it's now time to actually suspend until woken up? If you look at it, this is the same interface we have with the magic arch_suspend hook -- did we just suspend and it's time to write the image, or did we just resume and it's time to wake everything up?
I think this can be easily solved by giving the image saving kernel two resume points: one for "the image has been written", and one for "we rebooted and have restored the image". I'm not familiar with ACPI. Perhaps we need a third to differentiate "we read the image from S4" from "we read the image from S5", but that information must be available to the OS because it needs that to know if it should resume from hibernate.
are we sure that there are only 2-3 possible actions? or should this be made into a simple jump table so that it's extendable?

At 2 I don't think we need a jump table. Even if we had a table, we have to identify what each entry means. If we start getting more, then we can change from command line to table.
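
To make the "two resume points passed on the command line" idea concrete, here is a minimal sketch in plain userspace C. Everything in it is an assumption made for illustration: the `resume_reason=` token, the values `written` and `restored`, and the enum names are hypothetical and do not come from any posted patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical reasons for re-entering the first kernel. */
enum resume_reason {
    RESUME_UNKNOWN = 0,
    RESUME_IMAGE_WRITTEN,   /* image saved; now actually suspend or power off */
    RESUME_IMAGE_RESTORED   /* rebooted and image read back; wake devices up */
};

/*
 * Look for a hypothetical "resume_reason=<value>" token in a
 * space-separated command line. Unrecognized or missing values
 * map to RESUME_UNKNOWN so the caller can refuse to resume.
 * The substring search is deliberately naive for a sketch.
 */
enum resume_reason parse_resume_reason(const char *cmdline)
{
    static const char key[] = "resume_reason=";
    const char *p;
    size_t len;

    assert(cmdline != NULL);

    p = strstr(cmdline, key);
    if (p == NULL)
        return RESUME_UNKNOWN;

    p += sizeof(key) - 1;       /* skip past "resume_reason=" */
    len = strcspn(p, " ");      /* length of the value token */

    if (len == 7 && strncmp(p, "written", 7) == 0)
        return RESUME_IMAGE_WRITTEN;
    if (len == 8 && strncmp(p, "restored", 8) == 0)
        return RESUME_IMAGE_RESTORED;
    return RESUME_UNKNOWN;
}
```

With only two or three values a switch on the returned enum is enough; a jump table would only pay off if the list of reasons kept growing.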

As noted in the thread

Message-ID: <[email protected]>
Subject: [linux-pm] Re: hibernation/snapshot design
on Mon Jul  9 08:23:53 2007, Jeremy Maitin-Shepard wrote:
 (3) how to communicate where to save the memory
This is an interesting topic. The suspended kernel has most IO and disk space. It also knows how much space is to be occupied by the kernel. So communicating a block map to the second kernel would be the obvious choice. But the second kernel must be able to find the image to restore it, and it must have drivers for the media. Also, this is not feasible for storing to nfs.
I think we will end up with several methods.
One would be to supply a list of blocks, and implement a file system that reads the file by reading the scatter list from media. The restore kernel then only needs to read an anchor, and can build upon that until the image is read into memory. Or do this in userspace.
I don't know how this compares to the current restore path. I wasn't able to identify the code that creates the on disk structure in my 10 minute perusal of kernel/power/.
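
As a rough illustration of the first method (an anchor that leads to a chained block list), here is a sketch in userspace C. The layout is invented for this example and is not the existing swsusp on-disk format: `struct anchor`, `struct list_node`, `EXTENTS_PER_NODE` and `NO_NEXT` are all hypothetical, and the "media" is modelled as an in-memory array indexed by block number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXTENTS_PER_NODE 4          /* hypothetical; real nodes would fill a block */
#define NO_NEXT UINT32_MAX          /* marks the last node in the chain */

struct extent {
    uint32_t start;                 /* first media block of this run */
    uint32_t count;                 /* number of consecutive blocks */
};

struct list_node {
    uint32_t next;                  /* block number of the next node, or NO_NEXT */
    uint32_t used;                  /* valid entries in ext[] */
    struct extent ext[EXTENTS_PER_NODE];
};

struct anchor {
    uint32_t first_node;            /* the only thing the restore kernel must find */
};

/*
 * Walk the chain starting at the anchor and return the total number
 * of image blocks described, or -1 if the chain is corrupt (index out
 * of range, over-full node, or a loop).
 */
int64_t image_block_count(const struct list_node *media, size_t media_len,
                          const struct anchor *a)
{
    int64_t total = 0;
    size_t hops = 0;
    uint32_t idx;

    assert(media != NULL && a != NULL);

    for (idx = a->first_node; idx != NO_NEXT; idx = media[idx].next) {
        if (idx >= media_len || hops++ >= media_len)
            return -1;
        if (media[idx].used > EXTENTS_PER_NODE)
            return -1;
        for (uint32_t i = 0; i < media[idx].used; i++)
            total += media[idx].ext[i].count;
    }
    return total;
}
```

A real reader would issue one read per extent where this sketch only adds up counts, but the traversal is the same: starting from the anchor, each node tells you where the next piece of the list lives.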
A second method will be to supply a device and file that will be mounted by the save kernel, then unmounted and restored. This would require a partition that is not mounted or open by the suspended kernel (or use nfs or a similar protocol that is designed for multiple client concurrent access).
A third method would be to allocate a file with the first kernel, and make sure the blocks are flushed to disk. The save and restore kernels map the file system using a snapshot device. Writing would map the blocks and use the block offset to write to the real device using the method from the first option; reading could be done directly from the snapshot device.
The first and third option are dead on log based file systems (where the data is stored in the log).
remember that the save and restore kernel can access the memory of the suspending kernel, so as long as the data is in a known format and there is a pointer to the data in a known location, the save and restore kernel can retrieve the data from memory, there's no need to involve media.

I agree that the save kernel can read the list from the being-saved kernel.

However, when restoring, the being-saved (being-restored) kernel is not accessible, so the save list has to be stored as part of the image.

Simplifying kjump: the proposal for v3.
The current code is trying to use crash dump area as a safe, reserved area to run the second kernel. However, that means that the kernel has to be linked specially to run in the reserved area. I think we need to finish separating kexec_jump from the other code paths.
on x86 at least it's possible to compile a relocatable kernel, so it doesn't need to be compiled specifically for a particular reserved area. This would allow you to use the same kernel build as the suspending kernel if you wanted to (I think that the config of the save and restore kernel is going to be trivial enough to consider auto-configuring and building a specific kernel for each box a real possibility)

Yes, one *can* build x86 relocatable. But there are funny restrictions like it has to be a bzImage or be loaded by kexec or something. And not all architectures have relocatable support. I think making the lists for the existing code to swap memory will not be that difficult and it will make the solution have fewer restrictions. Maybe I should shut up and write some code this weekend.

Actually, I think we can have the dedicated area as an option. If you suspend frequently, keep a relocated kernel booted. If you need more ram or suspend infrequently, allocate the pages on the fly.

As a first stage of suspend and resume, we can save to dedicated partitions all memory (as supplied to crash_dump) that is not marked nosave and not part of the save kernel's image. The fancy block lists and memory lists can be added later.
if the suspending kernel needs to tell the save and restore kernel what memory is not marked nosave have it do so using a memory list of some kind. you need to setup a mechanism for communicating the data anyway, setup a mechanism that's usable in the long term.

I'm saying we can have people start to test by the simple save all ram to dedicated while we figure out what the long term list looks like.
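
For discussion, the first-stage rule ("all RAM that is not nosave and not part of the save kernel's image") can be sketched as a pair of page-frame range lists. This is userspace C written only to illustrate the rule; `struct pfn_range` and both function names are hypothetical, the RAM ranges are assumed not to overlap each other, and the page-by-page scan is chosen for clarity rather than speed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A run of page frames: [start, start + count). Hypothetical layout. */
struct pfn_range {
    uint64_t start;
    uint64_t count;
};

/* Return 1 if pfn falls inside any range of the list, else 0. */
static int pfn_in_list(uint64_t pfn, const struct pfn_range *list, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (pfn >= list[i].start && pfn - list[i].start < list[i].count)
            return 1;
    return 0;
}

/*
 * Count the pages the save kernel would write: every page in the RAM
 * list that is not covered by the skip list. The skip list stands for
 * the union of nosave pages and the save kernel's own image.
 */
uint64_t count_pages_to_save(const struct pfn_range *ram, size_t n_ram,
                             const struct pfn_range *skip, size_t n_skip)
{
    uint64_t total = 0;

    assert(ram != NULL || n_ram == 0);
    assert(skip != NULL || n_skip == 0);

    for (size_t i = 0; i < n_ram; i++)
        for (uint64_t k = 0; k < ram[i].count; k++)
            if (!pfn_in_list(ram[i].start + k, skip, n_skip))
                total++;
    return total;
}
```

The same two lists would work whether the save kernel reads them out of the suspended kernel's memory or finds them stored in the image on restore; only the way the lists are located differs.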

If we want to keep the second kernel booted, then we need to add a save area for the booted jump target. Note that the save and restore lists to relocate_new_kernel can be computed once and saved. Longer term we could implement sys_kexec_load(UNLOAD) that would retrieve the saved list back to application space to save to disk in a file. This means you could save the booted save kernel, it just couldn't have any shared storage open.
since the kexec to the second kernel needs to handle the device initialization, do you really save much by doing this? from a reliability point of view it would seem simpler (and therefore more reliable) to initialize the save and restore kernel each time it's used, so that it always does the same thing (as opposed to carrying state from one use to the next)

You can save a bit of run time initialization, at the cost of saving the whole image with the initialized pages instead of zeroing uninitialized pages. The code to restore the devices is the same code path as the code for the main kernel to restore the devices (as implemented in the current patch), so we get more testing of that path.

David Lang

milton

