Re: [PATCH] Remove process freezer from suspend to RAM pathway

On Jul 06, 2007, at 11:42:33, Alan Stern wrote:

On Thu, 5 Jul 2007, Kyle Moffett wrote:
Umm, this thread is NOT ABOUT HIBERNATING!!! Please go back and read the subject, specifically the "suspend to RAM" parts :-D.
But it _is_ about the freezer; see the "Remove process freezer" part. :-) Since the freezer is used during hibernation, hibernation is a legitimate topic.

Except Linus already decreed (and I heartily agree) that hibernation and suspend-to-RAM were fundamentally completely different operations and therefore any attempts to share code were basically just making a big muddy mess of things. Would a thread "Remove phase-of-the-moon calculations from network-recv code" be relevant to lunar observation just because the two had to do with the phases of the moon? No!

When your hardware can put itself to sleep and atomically preserve memory as it does so, you don't need an atomic copy. For Real Suspend(TM) (IE: Suspend-to-RAM), the list of things to do is short and simple:
1)  Stop DMA and put most hardware into low-power states (stops all interrupt sources)
2)  Ensure that the other CPUs have finished any trailing interrupt handlers and put them to sleep
3)  Put the interrupt-controllers into low-power state
4)  Go to sleep
Your short and simple list omits a few crucial items:

A)  Decide what to do about remote wakeup requests.

Why do we care? If the wakeup request arrives before we go to sleep, we obviously aren't asleep and so can't wake up. If it arrives after we go to sleep then it will wake us up. Anything that depends on a wakeup arriving mid-sequence is 100% masochistic race condition.

B)  Prevent I/O requests from resuming devices that have been suspended.

(1a) As I describe below, step (1) includes setting NO_BIND and NO_IO flags on devices as they are processed. Anybody who wants to do IO while those flags are set should just go sleep on a waitqueue.
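
A minimal userspace sketch of the NO_IO gating in (1a), in plain C. Every name here (struct io_dev, submit_io, set_no_io, clear_no_io) is hypothetical and exists only for illustration. This single-threaded model parks a request in an array where a real driver would put the caller to sleep on a waitqueue (for example with the kernel's wait_event() and wake_up()).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PARKED 8

/* Hypothetical model of a device with a NO_IO flag; not a kernel structure. */
struct io_dev {
	bool no_io;               /* set while the device is suspending or suspended */
	int parked[MAX_PARKED];   /* requests waiting for the flag to clear */
	size_t nparked;
	int completed;            /* requests that actually reached the "hardware" */
};

/* Stand-in for touching the hardware. */
void do_io(struct io_dev *dev, int req)
{
	(void)req;
	assert(!dev->no_io);      /* hardware must never be touched while NO_IO is set */
	dev->completed++;
}

/*
 * Submit a request. Returns 1 if it ran immediately, 0 if it was parked
 * (this model's stand-in for sleeping on a waitqueue), -1 if no room is left.
 */
int submit_io(struct io_dev *dev, int req)
{
	if (!dev->no_io) {
		do_io(dev, req);
		return 1;
	}
	if (dev->nparked == MAX_PARKED)
		return -1;
	dev->parked[dev->nparked++] = req;
	return 0;
}

/* Suspend path: from now on, new requests are parked instead of issued. */
void set_no_io(struct io_dev *dev)
{
	dev->no_io = true;
}

/* Resume path: clear the flag, then "wake" every parked request in order. */
void clear_no_io(struct io_dev *dev)
{
	dev->no_io = false;
	for (size_t i = 0; i < dev->nparked; i++)
		do_io(dev, dev->parked[i]);
	dev->nparked = 0;
}
```

In this model a request submitted between set_no_io() and clear_no_io() never reaches do_io() until the flag is cleared, which is the only property the argument above relies on.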

C)  Prevent devices and drivers from being registered or unregistered; in particular decide what to do about hot-plug or hot-unplug events.

(1b) Again, that's where the NO_BIND flag comes in. If it's set then any device probe events must sleep, otherwise they can go through.

D)  Block driver bind or unbind calls.


See points (1a) and (1b) above.

Any of these things is capable of screwing up the course of events. (In fact A _should_ be allowed to abort a suspend.)

If any of those things screw up suspend-to-RAM then it is 100% the driver's fault and no "process freezer" is going to fix it, end of story. And "A" cannot be made reliable. At some point you shut off interrupts right before going to sleep, and at that point any remote wakeup event is just going to get dropped until you actually enter sleep mode and the hardware takes over again. If you miss a wakeup event then whatever sent it should just retry, just as with *every* other kind of network packet.

How about a freezer whose job it is to "wait for pending hard interrupts to complete when we have already guaranteed that we won't get any more"? That part should be really *REALLY* easy. You don't need to care about either userspace processes or kernel threads at all. Specifically, Step 1 consists of:
suspend_device(dev)
{
	set_no_bind_flag(dev);
	for (each subdev in dev->subdevices)
		suspend_device(subdev);
	set_no_io_flag(dev);
	wait_for_in_progress_dma(dev);
	turn_off_interrupts(dev);
	go_to_low_power_state(dev);
}
After you've set the "no_bind" flag, you won't get any *new* subdevices trying to bind,
So what happens if a new subdevice arrives at the wrong time? Do you block instead of binding it? While holding a mutex needed to suspend the parent device?

That would be a driver bug. If you have asynchronous probing then proper suspend handling includes being able to postpone driver probe events until after resume. If you have synchronous probing then the problem doesn't exist because "set_no_bind_flag" is just telling the device not to raise any more device probe interrupts.

What about drivers trying to bind to existing devices?

While binding it will clearly be holding a mutex/spinlock on the parent device, so the suspend process will wait for it. When binding is done the suspend_device() code will take the device lock and tell everything else to postpone further bind requests as above.

What happens to I/O requests submitted after the "no_io" flag is set? The driver will have to block them, effectively creating its own little "freezer".

Oh, so you're calling every waitqueue in the kernel a "freezer" now? We do these things at the driver level *all* *the* *time*. For instance, you can't submit new IOs to an ATA controller while it's renegotiating the bus speed, but that's never been a problem before.

When all the leaf devices are off, the parent device can be turned off because everything waiting on the leaf devices is blocked on them and won't unblock until the parent device *AND* the leaf device are turned on again, in that order.
This is a lot like what we already do.  The differences are:

There is nothing corresponding to your "no-bind" flag.
Most drivers don't have anything like your "no_io" flag; they assume that nobody will be around to submit an I/O request.

Most drivers have an implicit NO_BIND flag: The device's interrupts are off and/or it's in a low-power state. USB is already terribly buggy with regard to suspend: If you hotplug a device during suspend (like the touchpad in my powerbook powering down/up), then the USB stack will basically hang that controller. The device is off and the hotplug triggers interrupts and IO, *EVEN* *WITHOUT* *USERSPACE*.

So if your driver doesn't already have a proper way of blocking IO during suspend then it probably doesn't suspend 50% of the time anyways. A bug which bites *every* *time* is easy to fix, one which only bites when things hit a race condition is much harder.

Resuming is basically running the whole process in reverse. Runtime-suspend is achieved by not setting the 'no_io' or 'no_bind' flags and putting selective device-subtrees to sleep without doing anything to the rest of the system.
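
The ordering claimed here (leaves off before parents, and resume as the mirror image) can be checked with a small userspace model in C. This is only a sketch: struct tree_dev, struct power_log and resume_device() are hypothetical names, the flags are plain booleans, and the DMA wait and interrupt masking from the pseudocode are reduced to a comment. It models the system-wide path only, not runtime-suspend.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 4
#define MAX_LOG 16

/* Hypothetical device-tree node; fields mirror the flags named in the thread. */
struct tree_dev {
	const char *name;
	bool no_bind, no_io, powered;
	struct tree_dev *children[MAX_CHILDREN];
	size_t nchildren;
};

/* Records the order in which devices changed power state. */
struct power_log {
	const char *names[MAX_LOG];
	size_t n;
};

static void log_dev(struct power_log *log, const struct tree_dev *dev)
{
	assert(log->n < MAX_LOG);
	log->names[log->n++] = dev->name;
}

/* Suspend the children first, then the device itself (post-order walk). */
void suspend_device(struct tree_dev *dev, struct power_log *log)
{
	dev->no_bind = true;          /* no new subdevices from here on */
	for (size_t i = 0; i < dev->nchildren; i++)
		suspend_device(dev->children[i], log);
	dev->no_io = true;            /* new I/O has to wait */
	/* waiting for in-progress DMA and masking interrupts would go here */
	dev->powered = false;
	log_dev(log, dev);
}

/* Resume the device first, then its children in reverse order. */
void resume_device(struct tree_dev *dev, struct power_log *log)
{
	dev->powered = true;
	dev->no_io = false;
	log_dev(log, dev);
	for (size_t i = dev->nchildren; i-- > 0; )
		resume_device(dev->children[i], log);
	dev->no_bind = false;
}
```

For a root device with children "a" and "b", the suspend log reads a, b, root and the resume log reads root, b, a, so a parent is never powered down while one of its children is still on.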
Nobody doubts that suspend can be made to work without the freezer. The point is that doing it this way dumps a bunch of extra responsibility on drivers.

That responsibility has been there ever since suspend-to-RAM support was added. Nobody ever claimed that writing a proper driver wasn't tricky. You have to simultaneously be able to handle hot-unplug, IO errors, interrupts, IO requests, suspend-to-RAM, and hibernation. If your driver's mutual exclusion is buggy then it probably already bites you during hotplug or other similar scenarios. Let's at least make the problems much more reproducible so we can fix the drivers properly instead of continuing to kludge around it for all eternity.


Cheers,
Kyle Moffett
