Re: Back to the future.

On Apr 27, 2007, at 21:15:28, Rafael J. Wysocki wrote:

On Saturday, 28 April 2007 03:03, Kyle Moffett wrote:
On Apr 27, 2007, at 18:07:46, Nigel Cunningham wrote:
But in doing so you make the contents of the disk inconsistent with the state you've just snapshotted, leading to filesystem corruption. Even if you modify filesystems to do checkpointing (which is what we're really talking about), you still also have the problem that your snapshot has to be stored somewhere before you write it to disk, so you also have to either [snip]
When sys_snapshot is run, the kernel does:
1) Sequentially freeze mounted filesystems using blockdev freezing. If it's an fs that doesn't support freezing then either fail or force-remount-ro that fs and downgrade all its file descriptors to RO. Doesn't need extra locking since processes which try to do IO either succeed before the freeze call returns for that blockdev or sleep on the unfreeze of that blockdev. Filesystems are synchronized and made clean.
2) Iterate over the userspace process list, freezing each process and remapping all of its pages copy-on-write. Any device-specific pages need to have state saved by that device.
Why do you want to do 2) after 1) and not vice versa?

(1) can be done without extra locking. Device-mapper already has code to freeze filesystems and that makes a natural process-stopping point. Any threads doing IO will very quickly put themselves to sleep at (1) and save us some effort during step 2.
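
To make the ordering argument concrete, here is a rough userspace toy model (not kernel code). Every type and function name in it is made up for illustration; nothing here mirrors real kernel structures or the device-mapper freeze code. It only shows that once the block devices are frozen, tasks doing IO park themselves, so the process-freezing pass has fewer active tasks left to deal with.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types mirror real kernel structures. */
enum task_state { TASK_ACTIVE, TASK_ASLEEP_ON_FREEZE, TASK_STOPPED };

struct blockdev {
    bool frozen;
};

struct task {
    enum task_state state;
    struct blockdev *io_target; /* device this task writes to, or NULL */
};

/* Step (1): freeze every block device, one after another. */
void freeze_blockdevs(struct blockdev *devs, size_t n)
{
    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        devs[i].frozen = true;
}

/* A task that issues IO against a frozen device puts itself to sleep. */
void task_try_io(struct task *t)
{
    if (t->state == TASK_ACTIVE && t->io_target && t->io_target->frozen)
        t->state = TASK_ASLEEP_ON_FREEZE;
}

/* Step (2): stop every task; count those that were still active. */
size_t freeze_tasks(struct task *tasks, size_t n)
{
    size_t still_active = 0;
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].state == TASK_ACTIVE)
            still_active++;
        tasks[i].state = TASK_STOPPED;
    }
    return still_active;
}

/* Two IO-bound tasks and one CPU-bound task, frozen in the order (1), (2). */
size_t demo_freeze_order(void)
{
    struct blockdev devs[2] = { { false }, { false } };
    struct task tasks[3] = {
        { TASK_ACTIVE, &devs[0] },
        { TASK_ACTIVE, &devs[1] },
        { TASK_ACTIVE, NULL },
    };

    freeze_blockdevs(devs, 2);
    for (size_t i = 0; i < 3; i++)
        task_try_io(&tasks[i]);
    return freeze_tasks(tasks, 3);
}
```

In this model only the task that never touches a block device is still active when step (2) runs; the two IO-bound tasks have already gone to sleep on their frozen devices.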

6) Kernel unfreezes all userspace processes and returns the snapshot FD to userspace (where it can be read from).
Okay, but how do we do the error recovery if, for example, the image cannot be saved?


If the image can't be saved then there are 2 options:
  (1)  Call sys_restore() with the image
  (2)  Pass your snapshot file-descriptor to sys_unsnapshot()

In the former case, the system will be restored to the state it was in a few seconds earlier, right as it took the snapshot. In the latter case the modified-in-memory snapshot pages will be synced back to the disk filesystems, the copy-on-write data structures torn down (think of merging an LVM snapshot back into its base device), and the memory allocated for the snapshot will be freed. Either way the system is properly in sync with disk again; the only difference is whether you want to preserve the userspace state from during the attempted snapshot (i.e. any error status). You could also save the error state in case (1) by just auto-posting a bug report on http://bugs.$VENDOR.com/ of course :-D.
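
As a rough sketch of how userspace might pick between the two options, assuming it already holds the snapshot FD: sys_restore() and sys_unsnapshot() are only proposals in this thread and exist in no kernel, so the stub_* functions below are hypothetical stand-ins, and the enum names and finish_snapshot() are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum recovery { RECOVER_RESTORE, RECOVER_UNSNAPSHOT };
enum outcome { IMAGE_SAVED, ROLLED_BACK, SNAPSHOT_DISCARDED };

/* Hypothetical stand-in for the proposed sys_restore(): pretend success. */
int stub_sys_restore(int snap_fd)
{
    (void)snap_fd;
    return 0;
}

/* Hypothetical stand-in for the proposed sys_unsnapshot(): pretend success. */
int stub_sys_unsnapshot(int snap_fd)
{
    (void)snap_fd;
    return 0;
}

/*
 * Decide what to do once the attempt to save the image has finished.
 * RECOVER_RESTORE rolls back to the moment of the snapshot, losing any
 * userspace state created since (including the error status).
 * RECOVER_UNSNAPSHOT keeps that state and merely drops the snapshot.
 */
enum outcome finish_snapshot(int snap_fd, bool image_saved, enum recovery policy)
{
    assert(snap_fd >= 0);

    if (image_saved)
        return IMAGE_SAVED;

    if (policy == RECOVER_RESTORE) {
        stub_sys_restore(snap_fd);      /* option (1) */
        return ROLLED_BACK;
    }

    stub_sys_unsnapshot(snap_fd);       /* option (2) */
    return SNAPSHOT_DISCARDED;
}
```

What happens after a successful save (power off, keep running, etc.) is outside this sketch.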


Cheers,
Kyle Moffett
