On Fri, Jul 23, 2004 at 04:27:20PM -0400, Bryan J. Smith wrote:
> I have a client with an age-old RHL 6.2 system (specs below).
> I'm considering replacing the storage array (specs below)
> and moving the system to Fedora Core 2 with LVM 2.
>
> - LVM2+Snapshots, close to what NetApp had 4 years ago?
>
> I'd really like to take advantage of snapshots, both for
> backup and accidental file deletion purposes. They are used
> to NetApp filers, with the ability to restore files by
> mounting the snapshot filesystem. But cost is everything
> now. How good is LVM2 at this in comparison to where NetApp
> was 4 years ago?

There are fundamental differences between what a NetApp filer is doing,
and what LVM2 snapshots provide.

In particular, when using LVM2 snapshots, kcopyd has to constantly move
blocks from your filesystem LV to the snapshot LV. Device Mapper is
much more sensible and efficient at this than LVM1, but it is still
non-trivial overhead, and ends up generating a lot of mixed read/write
traffic.

We are currently using NFS/Ext3/LVM2/MD on a 2.6.8-rc1 kernel as our
backup NFS server, and initial testing with snapshots under load
uncovered some performance problems that I need to track down.
[Snapshots and mirroring were only recently added to the Device Mapper
code in the Linus kernel tree.] Either grab the most recent kernel from
kernel.org, or an FC3 development kernel, and test extensively.

The NetApp WAFL filesystem encapsulates all meta-data in a tree
structure, and uses persistent copy-on-write multi-rooted trees. When
writing, it places data wherever it is convenient (i.e., in the free
space), and then adjusts block pointers up toward the root of the tree.
Every few seconds it checkpoints its state (i.e., takes a snapshot).
[The NetApp also uses NVRAM to hold state that hasn't been flushed to
disk.] When one wants to save a snapshot, the filesystem tags it and
maintains its allocation data, instead of releasing stale blocks back
into the free pool.

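To make the copy-on-write tree idea concrete, here is a toy sketch in
Python. It is not NetApp's code or on-disk layout; FANOUT, DEPTH, Node,
write and read are names invented for illustration, and the tree lives
in memory only. A write never modifies an existing node: it builds new
nodes along the path from the changed block up to a new root, so any
older root that has been kept still describes the older state.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

FANOUT = 4  # children per node (hypothetical, tiny for illustration)
DEPTH = 3   # tree levels; the tree addresses FANOUT ** DEPTH blocks


@dataclass(frozen=True)
class Node:
    # Upper levels hold child Nodes; the bottom level holds data blocks.
    children: Tuple[object, ...] = (None,) * FANOUT


def write(root: Optional[Node], block: int, data: bytes,
          level: int = DEPTH - 1) -> Node:
    """Return a new root with `block` set to `data`; `root` is not modified."""
    node = root if root is not None else Node()
    index = (block // FANOUT ** level) % FANOUT
    if level == 0:
        new_child = data
    else:
        new_child = write(node.children[index], block, data, level - 1)
    # Copy this node with one pointer changed; all other subtrees are shared.
    children = node.children[:index] + (new_child,) + node.children[index + 1:]
    return Node(children)


def read(root: Optional[Node], block: int) -> Optional[bytes]:
    """Follow pointers from `root` down to `block`; None if never written."""
    node = root
    level = DEPTH - 1
    while node is not None and level >= 0:
        index = (block // FANOUT ** level) % FANOUT
        node = node.children[index]  # at level 0 this is the data itself
        level -= 1
    return node


live = write(None, 5, b"old")
snapshot = live                # "tagging" a snapshot: keep the old root
live = write(live, 5, b"new")  # the live tree moves on
```

After this runs, read(snapshot, 5) still returns the old data while
read(live, 5) returns the new data, and subtrees that were not touched
are shared between the two roots rather than copied. Keeping a snapshot
amounts to not freeing the blocks reachable from its root.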
For more info on the NetApp filer filesystem, see the original
whitepaper:

  http://www.netapp.com/tech_library/3002.html

Based on what I've read of Reiser4, the design should allow a similar
level of functionality to be incorporated at some point. Unfortunately,
it is not done yet.

To summarize: LVM2 will do what you want (modulo some tuning and
perhaps bug fixes), but it is not a NetApp.

> With that said, which is better for LVM2, Ext3 or XFS?
> I've always been a closet fan of XFS on Linux with all its
> inherent capabilities, but if Ext3 is better for LVM2 in
> FC2, then I want to stick with Ext3.

IIRC, XFS does not do data journaling. So while it may be much faster
than Ext3, you need to consider data integrity.

> I'm also not against using something other than LVM2 if it
> is better for XFS, as long as it is GPL (I wasn't aware
> anything was other than LVM2, so let me know if I'm
> mistaken).

I haven't been following EVMS development, but you might want to look
into the current state of affairs to find out if there is any
functionality there that you need (e.g., badblock handling).

> [ Yes, I know, I'll need to build the 3w-9xxx driver as
> it wasn't included until later FC2 kernel releases. I'll
> use a "helper ATA disk" to install FC2 and then install
> a newer kernel with the 3w-9xxx driver. I figured I might
> need to do this for LVM2 anyway (unless the FC2 installer
> has LVM2 all integrated? I didn't think it did?) ]

LVM2 installs work fine.

Some things you might want to do:

  1. Script some infrastructure to monitor snapshot space usage.

  2. Cron a job to snapshot and fsck the filesystem, so any filesystem
     problems are revealed early.

  3. If using Ext3 with data journaling, specify a large journal when
     creating the filesystem (e.g., mke2fs -j -J size=400 ...).

  4. Tune the filesystem and VM variables: flush time, readahead, etc.

  5. Test whether an external journal in the form of an NVRAM card or
     additional disks would improve performance.
     (You can try with a ramdisk for test purposes.)

Regards,

  Bill Rugolsky
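
A minimal sketch of suggestion 1 above (monitoring snapshot space
usage), in Python. Assumptions, none of which come from the message
itself: the `lvs` reporting fields `vg_name`, `lv_name`, `origin` and
`snap_percent` are available in your LVM2 version, `|` is a safe field
separator, and the 80% threshold and all function names are
hypothetical. Check the field names against `lvs -o help` or the man
page before relying on this.

```python
import subprocess

WARN_PERCENT = 80.0  # hypothetical alert threshold

LVS_COMMAND = [
    "lvs", "--noheadings", "--separator", "|",
    "-o", "vg_name,lv_name,origin,snap_percent",
]


def query_lvs():
    """Run lvs and return its raw text output (usually needs root)."""
    result = subprocess.run(LVS_COMMAND, check=True,
                            capture_output=True, text=True)
    return result.stdout


def parse_lvs(output):
    """Return (vg/lv, origin, percent_used) for each snapshot LV listed."""
    snapshots = []
    for line in output.splitlines():
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 4:
            continue  # blank or unexpected line
        vg, lv, origin, percent = fields
        if not origin or not percent:
            continue  # ordinary LV, not a snapshot
        snapshots.append((vg + "/" + lv, origin, float(percent)))
    return snapshots


def nearly_full(snapshots, threshold=WARN_PERCENT):
    """Select snapshots whose usage has reached the threshold."""
    return [snap for snap in snapshots if snap[2] >= threshold]


def report(output):
    """Print one warning line per snapshot that is close to filling up."""
    for name, origin, percent in nearly_full(parse_lvs(output)):
        print("WARNING: snapshot %s of %s is %.1f%% full"
              % (name, origin, percent))
```

Running report(query_lvs()) from cron and mailing any output is one way
to wire this up. A snapshot that fills completely becomes unusable, so
the threshold should leave enough time to extend or drop it.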