Re: When LVM Goes Bad
Andy Green wrote:
A story about LVM. I believe LVM is now the default in Fedora's
partitioning; I certainly never loved it enough to select it myself,
yet it is on all my boxes now.
LVM can make a lot of sense for large storage, binding together multiple
devices or RAID arrays into a single logical storage device; in fact I
use it for that too. However, LVM makes less sense on, say, a laptop
that has, and will only ever have, a single 2.5" HDD as its permanent
storage.
Now it doesn't matter too much when everything is working, because LVM
is a fairly lightweight additional layer AFAIK. However on a box here
its sole SATA drive went bad without warning, basically some dozens of
sectors were goneski after a recent period of high temperature here. The
resulting symptom was that the partition contents were no longer
recognized as containing a logical volume or a volume group, not even
by pvscan, although pvdisplay could see it was a physical volume if
pointed directly at the partition.
Recovery from LVM metadata corruption is not something that is
overburdened by tools to help out, in fact I couldn't find anything
useful. By using dd I probed the damaged region and found that it
started 33214 512-byte blocks into the partition and ended 33336
512-byte blocks in, so it trashed 122 blocks, something like 60Kbytes.
Touching this region spewed IO errors to the console. Whether this
explained the loss of LVMness, or some subsequent logical brain damage
elsewhere did it, I don't know.
What I did was to add a new HDD, install FC5 on it and boot into it,
with the old HDD attached as /dev/sdb. I then used dd to copy the first 33214
512-byte blocks to a file on the new drive, dd'ed 122 512-byte blocks
from /dev/zero and appended that on the end of the first file, and then
used dd with bs=512 skip=33336 to copy the remainder of the damaged
partition to this file also. So after this I had a copy of the
partition as a file on the new HDD with everything in the right place
and the damaged area zeroed out.
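
The three-part copy described above can be sketched in Python instead of dd. This is only an illustration: the function name copy_with_hole and its parameters are made up for the example, the sector numbers are the ones quoted in the post, and it has only been reasoned about against ordinary files, not a failing disk (where reads close to the damaged sectors may still return IO errors).

```python
SECTOR = 512        # block size used throughout the post
BAD_START = 33214   # first damaged 512-byte block (from the post)
BAD_END = 33336     # first good block after the damage (from the post)


def copy_with_hole(src_path, dst_path, bad_start=BAD_START, bad_end=BAD_END,
                   chunk=1024 * 1024):
    """Copy src to dst, writing zeros in place of blocks [bad_start, bad_end).

    Hypothetical helper: the damaged blocks are never read from the source.
    """
    # buffering=0 so Python itself does not read ahead into the bad region.
    with open(src_path, "rb", buffering=0) as src, open(dst_path, "wb") as dst:
        # Part 1: everything before the damaged region.
        remaining = bad_start * SECTOR
        while remaining > 0:
            buf = src.read(min(chunk, remaining))
            if not buf:
                break
            dst.write(buf)
            remaining -= len(buf)
        # Part 2: zeros standing in for the unreadable blocks.
        dst.write(bytes((bad_end - bad_start) * SECTOR))
        # Part 3: skip the damage and copy the rest of the partition.
        src.seek(bad_end * SECTOR)
        while True:
            buf = src.read(chunk)
            if not buf:
                break
            dst.write(buf)


# Example call (paths are placeholders, not taken from the post):
# copy_with_hole("/dev/sdb2", "/mnt/new/partition.img")
```

With the numbers from the post, the zeroed span is 33336 - 33214 = 122 blocks, or 62464 bytes, which matches the "something like 60Kbytes" estimate.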
Now naturally this file will not loop-mount because of the LVM layer;
it's not a valid ext3 image. I googled around some more and went on the LVM IRC
channel and explained my problem. No help, in fact no response. There
don't seem to be any tools or readily findable advice for recovering
from this situation.
I created a new 10MB file with dd and used mkfs.ext3 on it, and examined
the first part of it using hexdump. With the help of Google I found
that the ext3 magic (0xEF53) is present at offset +0x438, and I noticed
that the first 1Kbyte of the image is zeroed. I then used hexdump and
grep to search for the same pattern in the copied LVM partition file, and
found the magic at offset 0x30438, which puts the start of the
filesystem at offset 0x30000.
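
The hexdump-and-grep search can also be sketched as a small Python scan. This is an assumption-laden illustration rather than the method used in the post: find_ext_candidates is a made-up name, it only looks at the first 16MB of the image (on the guess that the LVM data area begins near the start), and it relies on the same two observations as above, namely a zeroed first 1Kbyte and the magic bytes 53 EF (0xEF53 stored little-endian) at +0x438.

```python
EXT_MAGIC = b"\x53\xef"   # 0xEF53 as it appears on disk (little-endian)
MAGIC_OFFSET = 0x438      # 1024-byte boot area + 0x38 into the superblock


def find_ext_candidates(path, step=512, limit=16 * 1024 * 1024):
    """Return offsets in `path` where an ext2/ext3 filesystem might start.

    Hypothetical helper: checks every `step`-aligned offset in the first
    `limit` bytes for 1 KiB of zeros followed by the magic at +0x438.
    """
    with open(path, "rb") as f:
        data = f.read(limit + MAGIC_OFFSET + len(EXT_MAGIC))
    hits = []
    last_base = len(data) - MAGIC_OFFSET - len(EXT_MAGIC)
    for base in range(0, last_base + 1, step):
        magic_at = base + MAGIC_OFFSET
        if data[magic_at:magic_at + len(EXT_MAGIC)] != EXT_MAGIC:
            continue
        # The post observed that a fresh ext3 image starts with 1 KiB of zeros.
        if data[base:base + 1024] == bytes(1024):
            hits.append(base)
    return hits


# Example call (path is a placeholder):
# for offset in find_ext_candidates("/mnt/new/partition.img"):
#     print(hex(offset))
```

Any hit is only a candidate; other data can contain the same two bytes by chance, so the result still needs checking by trying to mount from that offset.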
I decided to remove the first 0x30000 bytes (192Kbytes) of my copied
partition image, which took a while because the partition was 60GB; in
fact the whole process was agonizingly slow.
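
Removing the leading bytes amounts to copying everything from offset 0x30000 onwards into a new file. A minimal Python sketch follows, assuming the 0x30000 offset found above; strip_prefix is a hypothetical name and the paths in the example call are placeholders.

```python
import shutil

FS_START = 0x30000  # filesystem start found in the post: 196608 bytes, 384 blocks


def strip_prefix(src_path, dst_path, offset=FS_START, chunk=1024 * 1024):
    """Write a copy of src_path to dst_path with the first `offset` bytes removed."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(offset)                      # skip the bytes ahead of the filesystem
        shutil.copyfileobj(src, dst, chunk)   # stream the rest in 1 MiB pieces


# Example call (paths are placeholders):
# strip_prefix("/mnt/new/partition.img", "/mnt/new/ext3.img")
```

Like the dd approach, this rewrites the whole 60GB image once, so it will be just as slow.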
After this, I was able to mount the resulting file with -t ext3 -o loop
successfully and I recovered my data. The zeroed/damaged region trashed
a small part of two directories whose contents were noncritical. This
story is offered in the hope that future Googlers will have better luck
than I did.
I wouldn't say that LVM is evil from this, but I would suggest that you
simply turn it off for partitioning actions where you know there will be
no expansion, because the only thing it will ever do for you in that
case is to stress you out when you least need it.
Had a similar issue last week actually. It's not put me off LVM but it
made me glad I do regular backups.