Paul Howarth wrote:
Andy Green wrote:
A story about LVM. I believe LVM is now the default in Fedora's
partitioning; I certainly never loved it enough to have selected it
myself, and yet it is on all my boxes now.
LVM can make a lot of sense for large storage, binding together
multiple devices or RAIDs into a single logical storage device; in
fact I use it for that too. However, LVM makes less sense on, say, a
laptop which has, and will only ever have, a single 2.5" HDD
permanently installed in it for storage.
Now it doesn't matter too much when everything is working, because
LVM is a fairly lightweight additional layer AFAIK. However, on a box
here the sole SATA drive went bad without warning: some dozens of
sectors were goneski after a recent period of high temperature. The
resulting symptom was that the partition contents were no longer
recognized as containing a logical volume or a volume group, and
pvscan did not find it either, although pvdisplay could see it was a
physical volume if pointed directly at the partition.
Recovery from LVM metadata corruption is not exactly overburdened
with tools to help out; in fact I couldn't find anything useful. By
using dd I probed the damaged region and found that it started 33214
512-byte blocks into the partition and ended 33336 512-byte blocks
in, so it trashed 122 blocks, something like 60Kbytes. Touching this
region spewed IO errors to the console. Whether this damage explains
the loss of LVMness, or whether some subsequent logical brain damage
elsewhere did it, I don't know.
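For anyone repeating the probing step, here is a minimal sketch of
the idea in Python. It is only an illustration; the original probing
was done by hand with dd. The names find_bad_ranges, read_sector and
device_reader are hypothetical, and the sketch assumes an unreadable
sector shows up as an OSError on read.

```python
import os

SECTOR = 512  # block size used throughout the post


def find_bad_ranges(read_sector, total_sectors):
    """Return a list of (first_bad, first_good_after) pairs, in sectors.

    read_sector(n) is a caller-supplied (hypothetical) function that
    returns the bytes of sector n, or raises OSError on an I/O error.
    """
    ranges = []
    start = None
    for n in range(total_sectors):
        try:
            read_sector(n)
            ok = True
        except OSError:
            ok = False
        if not ok and start is None:
            start = n                  # a run of bad sectors begins here
        elif ok and start is not None:
            ranges.append((start, n))  # the run ended just before n
            start = None
    if start is not None:              # damage runs to the end
        ranges.append((start, total_sectors))
    return ranges


def device_reader(path):
    """Build a read_sector function for a device node or image file."""
    fd = os.open(path, os.O_RDONLY)
    return lambda n: os.pread(fd, SECTOR, n * SECTOR)
```

Pointed at the damaged partition via device_reader, this should report
one (start, end) pair per damaged run. Reading one sector at a time
is slow, and kernel read-ahead may make the reported edges a little
fuzzy, so treat the result as a starting point for checking with dd.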
What I did was to add a new HDD and install FC5 on it and boot into
it, with the old HDD on as /dev/sdb. I then used dd to copy the
first 33214 512-byte blocks to a file on the new drive, dd'ed 122
512-byte blocks from /dev/zero and appended that on the end of the
first file, and then used dd with bs=512 skip=33336 to copy the
remainder of the damaged partition to this file also. So after this
I had a copy of the partition as a file on the new HDD with
everything in the right place and the damaged area zeroed out.
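The three dd passes described above (copy up to the damage, pad with
zeros, copy the rest) can be sketched in Python as follows. This is
an illustration under assumptions, not the commands that were
actually run: copy_skipping_bad and its parameters are hypothetical
names, and the source is opened unbuffered so that Python itself
never reads into the damaged sectors.

```python
SECTOR = 512  # block size used throughout the post


def copy_skipping_bad(src_path, dst_path, bad_start, bad_end, chunk=2048):
    """Copy src to dst, writing zeros in place of sectors
    bad_start .. bad_end-1, which are never read.

    chunk is the number of sectors per read in the good regions.
    """
    with open(src_path, "rb", buffering=0) as src, \
            open(dst_path, "wb") as dst:
        # 1. Everything before the damage.
        remaining = bad_start * SECTOR
        while remaining > 0:
            buf = src.read(min(chunk * SECTOR, remaining))
            if not buf:
                break
            dst.write(buf)
            remaining -= len(buf)
        # 2. Zeros where the damaged sectors would have been, so that
        #    everything after them stays at its original offset.
        dst.write(b"\0" * ((bad_end - bad_start) * SECTOR))
        # 3. Seek past the damage and copy the remainder.
        src.seek(bad_end * SECTOR)
        while True:
            buf = src.read(chunk * SECTOR)
            if not buf:
                break
            dst.write(buf)
```

With the numbers from this story the call would use bad_start=33214
and bad_end=33336, giving 122 zeroed sectors.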
Now naturally this file will not loop-mount as it stands: because of
the LVM layer it's not a valid ext3 image. I googled around some more
and went on the
LVM IRC channel and explained my problem. No help, in fact no
response. There don't seem to be any tools or readily findable
advice for recovering from this situation.
I created a new 10MB file with dd and used mkfs.ext3 on it, and
examined the first part of it using hexdump. With the help of Google
I found that the ext3 magic is present at offset +0x438, and I
noticed that the first 1Kbytes of the image is zeroed. I then used
hexdump and grep to search for the same pattern in the copied LVM
partition file, and found it with the magic at offset 0x30438, which
puts the start of the filesystem at 0x30000.
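The hexdump-and-grep search can be sketched in Python along these
lines. It is an illustration rather than what was run here.
find_ext_start is a hypothetical name, the magic bytes 53 EF (0xEF53
stored little-endian) are the standard ext2/ext3 value rather than
something quoted in this post, and the "first 1Kbyte is zero" test is
the heuristic from the story, which may not hold for every
filesystem.

```python
EXT_MAGIC = b"\x53\xef"  # 0xEF53, stored little-endian on disk
MAGIC_OFFSET = 0x438     # 1024 bytes of padding + 0x38 into the superblock
SECTOR = 512


def find_ext_start(path, limit=16 * 1024 * 1024):
    """Return sector-aligned offsets within the first `limit` bytes
    that look like the start of an ext2/ext3 filesystem."""
    hits = []
    with open(path, "rb") as f:
        start = 0
        while start < limit:
            f.seek(start)
            head = f.read(MAGIC_OFFSET + 2)
            if len(head) < MAGIC_OFFSET + 2:
                break  # ran off the end of the file
            zero_pad = head[:1024] == b"\0" * 1024
            has_magic = head[MAGIC_OFFSET:] == EXT_MAGIC
            if zero_pad and has_magic:
                hits.append(start)
            start += SECTOR
    return hits
```

On the image from this story the expected hit would be 0x30000, the
magic offset 0x30438 minus 0x438. Only the start of the file is
scanned, on the assumption that the LVM header is small.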
I decided to remove the first 0x30000 bytes of my copied partition
image, which took a while because the partition was 60GB; in fact the
whole process was agonizingly slow.
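Stripping the leading bytes amounts to copying the image from a given
offset into a new file. A minimal Python sketch, with strip_prefix as
a hypothetical name and a large copy buffer chosen arbitrarily to
keep a 60GB copy from crawling:

```python
import shutil


def strip_prefix(src_path, dst_path, offset, chunk=4 * 1024 * 1024):
    """Write src minus its first `offset` bytes to dst."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(offset)                     # skip the LVM header
        shutil.copyfileobj(src, dst, chunk)  # stream the rest across
```

For this story the offset would be 0x30000. Note that this needs
room for a second full copy of the image on the new drive.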
After this, I was able to mount the resulting file with -t ext3
-o loop successfully and I recovered my data. The zeroed/damaged
region trashed a small part of two directories whose contents were
noncritical. This story is offered in the hope that future Googlers
will have better luck than I did.
I wouldn't say that LVM is evil from this, but I would suggest that
you simply turn it off for partitioning actions where you know there
will be no expansion, because the only thing it will ever do for you
in that case is to stress you out when you least need it.
Had a similar issue last week actually. It's not put me off LVM but it
made me glad I do regular backups.
Paul.
I use RAID-1 devices as my LVM PVs to reduce the risk of such problems.
--
"Spend less! Do more! Go Open Source..." -- Dirigo.net
Chris Johnson, RHCE #804005699817957