Hi.

On Tue, 2007-05-29 at 14:15 +0200, Rafael J. Wysocki wrote:
> Please have a look at the current version of the patch (appended).
>
> I have followed the Nigel's suggestion not to change the current behavior
> in this patch (I'll add a couple of patches removing the freezability from
> some kernel threads), with one exception: I couldn't figure out any reason
> to have try_to_freeze() called in net/sunrpc/svcsock.c:svc_recv() .

Thanks. IIRC, svcsock is related to the NFS server code.

> I've also added a piece of documentation, freezing-of-tasks.txt . Please
> see if it's not missing anything (I'd like it to be quite complete).

[...]

Mostly just grammar and the odd typo. On the whole, it's really well
written and perfectly readable - great job!

> Index: linux-2.6.22-rc3/Documentation/power/freezing-of-tasks.txt
> ===================================================================
> --- /dev/null
> +++ linux-2.6.22-rc3/Documentation/power/freezing-of-tasks.txt
> @@ -0,0 +1,160 @@
> +Freezing of tasks
> +	(C) 2007 Rafael J. Wysocki <[email protected]>, GPL
> +
> +I. What is the freezing of tasks?
> +
> +The freezing of tasks is a mechanism by which user space processes and some
> +kernel threads are controlled during hibernation or system-wide suspend (on some
> +architectures).
> +
> +II. How it works?

How does it work?

> +
> +There are four per-task flags used for that, PF_NOFREEZE, PF_FROZEN, TIF_FREEZE
> +and PF_FREEZER_SKIP (the last one is auxiliary). The tasks that have
> +PF_NOFREEZE unset (all user space processes and some kernel threads) are
> +regarded as 'freezable' and treated in a special way before the system enters a
> +suspend state as well as before a hibernation image is created (in what follows
> +we only consider hibernation, but the description also applies to suspend).
> +
> +Namely, as the first step of the hibernation procedure the function
> +freeze_processes() (defined in kernel/power/process.c) is called.
> +It executes
> +try_to_freeze_tasks() that sets TIF_FREEZE for all of the freezable tasks and
> +sends a fake signal to each of them. A task that receives such a signal and has
> +TIF_FREEZE set, should react to it by calling the refrigerator() function
> +(defined in kernel/power/process.c), which sets the task's PF_FROZEN flag,
> +changes its state to TASK_UNINTERRUPTIBLE and makes it loop until PF_FROZEN is
> +cleared for it. Then, we say that the task is 'frozen' and therefore the set of
> +functions handling this mechanism is called 'the freezer' (these functions are
> +defined in kernel/power/process.c and include/linux/freezer.h). User space
> +processes are generally frozen before kernel threads.
> +
> +It is not recommended to call refrigerator() directly. Instead, it is
> +recommended to use the try_to_freeze() function (defined in
> +include/linux/freezer.h), that checks the task's TIF_FREEZE flag and makes the
> +task enter refrigerator() if the flag is set.
> +
> +For user space processes try_to_freeze() is called automatically from the
> +signal-handling code, but the freezable kernel threads need to call it
> +explicitly in suitable places. The code to do this may look like the following:
> +
> +	do {
> +		hub_events();
> +		wait_event_interruptible(khubd_wait,
> +					!list_empty(&hub_event_list));
> +		try_to_freeze();
> +	} while (!signal_pending(current));
> +
> +(from drivers/usb/core/hub.c::hub_thread()).
> +
> +If a freezable kernel thread fails to call try_to_freeze() after the freezer has
> +set TIF_FREEZE for it, the freezing of tasks will fail and the entire
> +hibernation operation will be cancelled. For this reason, freezable kernel
> +threads must call try_to_freeze() somewhere.
> +
> +After the system memory state has been restored from a hibernation image and
> +devices have been reinitialized, the function thaw_processes() is called in
> +order to clear the PF_FROZEN flag for each frozen task.
> +Then, the tasks that
> +have been frozen leave refrigerator() and continue running.
> +
> +III. Which kernel threads are freezable?
> +
> +Kernel threads are not freezable by default. However, a kernel thread may clear
> +PF_NOFREEZE for itself by calling set_freezable() (the resetting of PF_NOFREEZE
> +directly is strongly discouraged). From this point it is regarded as freezable
> +and must call try_to_freeze() in a suitable place.
> +
> +IV. Why do we do that?
> +
> +Generally speaking, there is a couple of reasons to use the freezing of tasks:
> +
> +1. The principal reason is to prevent filesystems from being damaged after
> +hibernation. Namely, for now we have no simple means of checkpointing

s/Namely, for now/At the moment/

No simple means or no means at all? Are you thinking of bdev freezing?

> +filesystems, so if there are any modifications made to filesystem data and/or
> +metadata on disks, we usually cannot bring them back to the state from before

If the above is changed, I'd remove 'usually' here.

> +the modifications. At the same time each hibernation image contains some
> +filesystem-related information that must be consistent with the state of the
> +on-disk data and metadata after the system memory state has been restored from
> +the image (otherwise the filesystems will be damaged in a nasty way, usually
> +making them almost impossible to repair). Therefore we freeze tasks that might

s/Therefore we/We therefore/

> +cause the on-disk filesystems' data and metadata to be modified after the
> +hibernation image has been created and before the system is finally powered off.
> +The majority of them is user space processes, but if any of kernel threads may

s/them is/these are/
s/of kernel/of the kernel/

> +cause something like this to happen, they have to be freezable.
> +
> +2. The second reason is to prevent user space processes and some kernel threads
> +from interfering with the suspending and resuming of devices.
> +For example, a
> +user space process running on a second CPU while we are suspending devices may

I'd shift the "For example" to after "may", giving "...may, for example,
be troublesome..."

> +be troublesome and without the freezing of tasks we would need some safeguards
> +against race conditions that might occur in such a case.
> +
> +Although Linus Torvalds doesn't like the freezing of tasks, he said this in one
> +of the discussions on LKML (http://lkml.org/lkml/2007/4/27/608):
> +
> +'> Why we freeze tasks at all or why we freeze kernel threads?
> +
> +In many ways, "at all".

I found these first two lines confusing - I thought the "Why we freeze..."
was Linus, rather than a quotation he was responding to. I'd suggest
starting the quote at what follows this point... but then as I read
further, I can see the quote is necessary to make sense of the second
paragraph below. Perhaps the best way would be to put a line before the
"Why we freeze..." indicating that you're being quoted there.

> +I _do_ realize the IO request queue issues, and that we cannot actually do
> +s2ram with some devices in the middle of a DMA. So we want to be able to
> +avoid *that*, there's no question about that. And I suspect that stopping
> +user threads and then waiting for a sync is practically one of the easier
> +ways to do so.
> +
> +So in practice, the "at all" may become a "why freeze kernel threads?" and
> +freezing user threads I don't find really objectionable.'

Oh, and double quotes should surround the whole quote, with single
quotes replacing the double quotes in the quotation. Hope all those
'quote's aren't confusing! :)

> +Still, there are kernel threads that may want to be freezable. For example, if
> +a kernel that belongs to a device driver accesses the device directly, it in
> +principle needs to know when the device is suspended, so that it doesn't try to
> +access it at that time.
> +However, if the kernel thread is freezable, it will be
> +frozen before the driver's .suspend() callback is executed and it will be
> +thawed after the driver's .resume() callback has run, so it won't be accessing
> +the device while it's suspended.
> +
> +3. Another reason for freezing tasks is to prevent user space processes from
> +realizing that hibernation (or suspend) operation takes place. Ideally, user
> +space processes should not notice that such a system-wide operation has occured

s/occured/occurred/. That word gets me too.

> +and should continue running without any problems after the restore (or resume
> +from suspend). Unfortunately, in the most general case this is quite difficult
> +to achieve without the freezing of tasks. Consider, for example, a process
> +that depends on the number of CPUs being online while it's running. Since we

s/the number of/all/ (or secondary)

> +need to disable nonboot CPUs during the hibernation, if this process is not
> +frozen, it may notice that the number of CPUs has changed and may start to work
> +incorrectly because of that.
> +
> +V. Are there any problems related to the freezing of tasks?
> +
> +Yes, there are.
> +
> +First of all, the freezing of kernel threads may be tricky if they depend one
> +on another. For example, if kernel thread A waits for a completion (in the
> +TASK_UNINTERRUPTIBLE state) that needs to be done by freezable kernel thread B
> +and B is frozen in the meantime, then A will be blocked until B is thawed, which
> +may be undesirable. That's why kernel threads are not freezable by default.
> +
> +Second, there are the following two problems related to the freezing of user
> +space processes:
> +1. Putting processes into an uninterruptible sleep stuffs up the load average.

s/stuffs up/distorts/ ('Stuffs up' is accurate as a colloquialism, but
I'm suggesting the change because the language in the remainder of the
file is more formal - this seems out of place).

> +2. Now that we have FUSE, plus the framework for doing device drivers in
> +userspace, it gets even more complicated because some userspace processes are
> +now doing the sorts of things that kernel threads do
> +(https://lists.linux-foundation.org/pipermail/linux-pm/2007-May/012309.html).

Death to them all, I say! :)

> +The problem 1. seems to be fixable, although it hasn't been fixed so far. The
> +other one is more serious, but it seems that we can work around it by using
> +hibernation (and suspend) notifiers (in that case, though, we won't be able to
> +avoid the realization by the user space processes that the hibernation is taking
> +place).
> +
> +There also are problems that the freezing of tasks tends to expose, although

s/also are/are also/

> +they are not directly related to it. For example, if request_firmware() is
> +called from a device driver's .resume() routine, it will timeout and eventually
> +fail, because the user land process that should respond to the request is frozen
> +at this point. So, seemingly, the failure is due to the freezing of tasks.
> +Suppose, however, that the firmware file is located on a filesystem accessible
> +only through the device that needs the firmware. In that case, the system won't
> +be able to work normally after the restore regardless of whether or not the
> +freezing of tasks is used. Consequently, the problem is not really related to
> +the freezing of tasks, since it generally exists regardless. [The solution to
> +this particular problem is to keep the firmware in memory after it's loaded for
> +the first time and upload if from memory to the device whenever necessary.]

I understand the logic and agree with what you're trying to say in this
last example, but think the example is faulty. If the firmware is on a
filesystem accessible only through the device that needs the firmware,
then you wouldn't be able to bring it up in the first place.

Regards,

Nigel
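
For anyone following the flag descriptions in sections II and III of the
quoted document, the handshake can be modelled in user space roughly as
below. This is only a toy sketch and not the kernel's implementation: the
model_* function names, struct model_task and the bit values are all made
up for illustration, and dropping TIF_FREEZE once the task is frozen is an
assumption of the model. Only the flag names and the order of the steps
come from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Flag names follow the quoted document; the bit values are arbitrary
 * placeholders for this model, not the kernel's real values. */
enum {
	PF_NOFREEZE = 1 << 0,	/* task must not be frozen */
	PF_FROZEN   = 1 << 1,	/* task is sitting in the refrigerator */
	TIF_FREEZE  = 1 << 2	/* freezer has asked the task to freeze */
};

/* Hypothetical stand-in for a task. */
struct model_task {
	unsigned int flags;
};

/* Models set_freezable(): a kernel thread opts in by clearing PF_NOFREEZE. */
void model_set_freezable(struct model_task *t)
{
	assert(t != NULL);
	t->flags &= ~PF_NOFREEZE;
}

/* Models the request step: set TIF_FREEZE on every freezable task.
 * Returns how many tasks were asked to freeze. */
size_t model_freeze_request(struct model_task *tasks, size_t n)
{
	size_t asked = 0;

	for (size_t i = 0; i < n; i++) {
		if (!(tasks[i].flags & PF_NOFREEZE)) {
			tasks[i].flags |= TIF_FREEZE;
			asked++;
		}
	}
	return asked;
}

/* Models try_to_freeze(): enter the "refrigerator" only if asked to.
 * The real refrigerator() loops until PF_FROZEN is cleared; the model
 * just records the state. Returns 1 if the task froze, 0 otherwise. */
int model_try_to_freeze(struct model_task *t)
{
	assert(t != NULL);
	if (!(t->flags & TIF_FREEZE))
		return 0;
	t->flags &= ~TIF_FREEZE;	/* model assumption: request is consumed */
	t->flags |= PF_FROZEN;
	return 1;
}

/* Counts tasks that were asked to freeze but have not done so yet.
 * A non-zero result corresponds to the "freezing of tasks will fail" case. */
size_t model_freeze_pending(const struct model_task *tasks, size_t n)
{
	size_t pending = 0;

	for (size_t i = 0; i < n; i++)
		if (tasks[i].flags & TIF_FREEZE)
			pending++;
	return pending;
}

/* Models thaw_processes(): clear PF_FROZEN for every frozen task. */
void model_thaw(struct model_task *tasks, size_t n)
{
	for (size_t i = 0; i < n; i++)
		tasks[i].flags &= ~PF_FROZEN;
}
```

With three tasks (one user space process, one ordinary kernel thread and
one kernel thread that called model_set_freezable()), the request step
marks two of them, and the freeze only counts as complete once both have
called model_try_to_freeze(), which is the failure mode the document
warns about for freezable kernel threads that never call try_to_freeze().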