LVM: good, LVM snapshots: bad
Well, today I was looking into using LVM snapshots to let a client, OCN, use Linux as a NetApp replacement… Boy, was I in for a disappointment.
LVM in itself works great, but the moment you turn on snapshots, performance (in this case write performance) goes to hell. Using LVM is easy enough. The system I was on has 32 GB RAM and 2 disk arrays with hardware RAID.
Setting up:
pvcreate /dev/sdb1
pvcreate /dev/sdc1
vgcreate volume1 /dev/sdb1 /dev/sdc1
vgchange -a y volume1
lvcreate -n lv-vol1 -L1.00T volume1
mke2fs -j /dev/volume1/lv-vol1
mount /dev/volume1/lv-vol1 /vol
and you are ready. Creating a snapshot can be done with
lvcreate -s -n snap-lv-vol1 -L50G /dev/volume1/lv-vol1
Sadly, such a snapshot isn’t enlarged automatically when it turns out to be too small (NetApp has this feature). This is easily remedied with lvresize.
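Since the snapshot does not grow by itself, one could check its fill level now and then and extend it before it overflows. Below is a minimal sketch of that idea, not a tested production script: `grow_cmd` is a made-up helper, and the 80% threshold and 10G step in the example are arbitrary assumptions. It only prints the `lvextend` command and never runs it.

```shell
#!/bin/sh
# Hypothetical helper: print the lvextend command that would grow a
# snapshot once its usage reaches a threshold. Nothing is executed here.
#
# Example (as root), assuming lvs reports the fill level via snap_percent:
#   pct=$(lvs --noheadings -o snap_percent volume1/snap-lv-vol1)
#   grow_cmd $pct 80 10G volume1/snap-lv-vol1
grow_cmd() {
    usage_pct=$1   # snapshot usage in percent, e.g. 85.20
    threshold=$2   # integer percentage at which to grow
    step=$3        # amount to add, e.g. 10G
    snap=$4        # snapshot LV, e.g. volume1/snap-lv-vol1

    # Drop the fractional part so the shell can compare integers.
    pct_int=${usage_pct%%.*}
    [ -n "$pct_int" ] || return 1

    if [ "$pct_int" -ge "$threshold" ]; then
        echo "lvextend -L+$step $snap"
    fi
}
```

Run from cron, the printed command could be mailed to an admin or piped to `sh` once you trust it.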
Making a snapshot with the normal chunk size of 8.0 KB kills performance. Once in a while you will see a kcopyd process copying blocks from your live file system to the snapshot(s). I’ve witnessed a hanging file system that took minutes to spring back to life, which is totally unacceptable. Even when LVM itself works okay, turning on snapshots kills performance.
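To put a number on the slowdown, a crude timing test can be run once without a snapshot and once with one active. The sketch below makes some assumptions: `write_test` is a hypothetical helper, it relies on GNU dd (`bs=1M`, `conv=fsync`), and whole-second timing is coarse. The same test could be repeated with a snapshot created with a larger chunk size (lvcreate's `-c` option), but whether that helps is an open question here, not a known fix.

```shell
#!/bin/sh
# Hypothetical helper: write <mib> MiB of zeros into <dir>, force the data
# to disk, and print the elapsed time in whole seconds.
#
# Example: write_test /vol 1024
write_test() {
    dir=$1   # directory on the file system under test, e.g. /vol
    mib=$2   # amount of data to write, in MiB

    f="$dir/lvm-write-test.$$"
    start=$(date +%s)
    # conv=fsync makes dd flush the file before exiting, so the timing
    # includes the actual disk writes and not just the page cache.
    dd if=/dev/zero of="$f" bs=1M count="$mib" conv=fsync 2>/dev/null
    end=$(date +%s)
    rm -f "$f"
    echo $((end - start))
}
```

Use a size well above what the RAID controller cache can absorb, otherwise both runs may look equally fast.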
If you search for “LVM linux snapshot performance” you will get a lot of complaints, but no solutions. In short: LVM is often billed as the solution for everything, and those very handy snapshots are great to have. Too bad it turns out these snapshots are not production ready. I’m totally not comfortable deploying LVM with snapshots in any production environment.
Instead, we are going to deploy rsnapshot, which uses rsync to fake snapshots.
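The idea behind this kind of fake snapshot is roughly: keep the previous copy around as a tree of hard links, then let rsync update the newest copy. The sketch below is not rsnapshot's actual code; `rotate_snapshot`, `sync_snapshot` and the `snap.0`/`snap.1` directory names are made up for illustration, and it assumes GNU cp (`cp -al`) and only two generations.

```shell
#!/bin/sh
# Hypothetical two-generation snapshot scheme using hard links and rsync.
#
# Example: rotate_snapshot /backup && sync_snapshot /vol /backup

# Turn the current copy (snap.0) into the previous one (snap.1).
# cp -al creates hard links instead of copying file data, so unchanged
# files take no extra space.
rotate_snapshot() {
    root=$1
    rm -rf "$root/snap.1"
    if [ -d "$root/snap.0" ]; then
        cp -al "$root/snap.0" "$root/snap.1"
    fi
}

# Bring snap.0 up to date with the live data. By default rsync writes a
# changed file to a new temporary file and renames it into place, so the
# hard-linked old version in snap.1 stays untouched.
sync_snapshot() {
    src=$1
    root=$2
    mkdir -p "$root/snap.0"
    rsync -a --delete "$src/" "$root/snap.0/"
}
```

Unlike an LVM snapshot, this is not an atomic point-in-time copy: files that change while rsync runs can end up inconsistent with each other.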
We were also really tempted to deploy OpenSolaris with ZFS; the only thing holding us back is that this would introduce another platform that must be administered. Other than that:
I want something like ZFS on Linux!
(Yes, I know of btrfs)