[SATLUG] Adventures in LVM/RAID....

John Pappas j at jvpappas.net
Sat Jul 26 00:44:08 CDT 2008

Hey all,

This post is basically written to pass on my experiences manipulating and
doing (potentially stupid) things with my data.  This is a really long post,
so if you are not interested, I understand.  It is also late and I am not at
the tip of my writing game.

Executive abstract: using LVM and mdadm I executed a no-downtime data
migration and RAID creation.  I am not trying to brag (look how big my RAID5
is) or otherwise impress, but rather demonstrate how impressive the Linux
disk management subsystems have gotten, and some of the data aerobics that
are possible with both LVM and kernel RAID.  This entire operation was done
online in runlevel 5 with VMWare, NFS, and FTP services still running.

First the back story:

I recently acquired a couple of 500GB WD "Green" disks, a SATA controller,
and an Antec 900 case (awesome case, incidentally) to house the existing 2x
500GB disks and 2x 250GB SATA.  The smaller disks were mirrored and the
500's were just JBOD.  The end result was to be a RAID1:/dev/md1 (2x250)
and a RAID5:/dev/md2 (4x500).

As I have expressed my love of LVM, I had set up the system with 2 VGs
(sys/data).  I had data on 2 (/dev/sd[ij]) of the 500's and the new pair was
empty, so I:

`mdadm --create /dev/md2 --level=5 --raid-devices=3 /dev/sd[kl] missing` =
Create a "degraded" RAID5 array consisting of 2 real disks and one
"missing" (ghost) member

I then executed:
`pvcreate /dev/md2` = Make the new RAID device an LVM physical volume
`vgextend data /dev/md2` = Add md2 to the data VG
`pvmove /dev/sdi /dev/md2` = Move the allocated extents on sdi over to md2

This little operation took about 16 hours, since I was both moving extents
and building a degraded RAID5 array. (No downtime, minus the actual recasing
and HDD additions)
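For a sense of scale, those numbers give a back-of-envelope transfer rate.  This assumes roughly 500GB of allocated extents and the full 16 hours spent moving (both figures as stated above, so treat it as an estimate only):

```shell
# Rough average pvmove throughput: ~500GB of extents over ~16 hours
size_gb=500
hours=16
echo "$(( size_gb * 1024 / (hours * 3600) )) MiB/s"
# prints 8 MiB/s
```

Well below what a single SATA drive can stream, which is about what you would expect with the same spindles doing double duty.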

Once the rebuild and pvmove were done:
`vgreduce data /dev/sdi` = removes sdi from the VG
`pvremove /dev/sdi` = removes LVM from disk
`mdadm /dev/md2 -a /dev/sdi` = Adds the now-vacated disk to the RAID5 array,
and initiates a rebuild (this is only an Athlon64 x1 at 2GHz, so it used 25%
CPU to do the XOR during the rebuild)
`pvmove /dev/sdj /dev/md2` = Moves extents from sdj to md2

This took a long time, as again I was pressing my HDDs with both a 1TB RAID5
rebuild and a 500GB pvmove.
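The kernel does let you trade rebuild speed against foreground I/O; a minimal sketch of the relevant knobs (the values shown are illustrative only, and writing them needs root):

```
# md resync/rebuild bandwidth caps, per device, in KiB/s
# (defaults are typically 1000 min / 200000 max):
sysctl dev.raid.speed_limit_min    # floor md tries to maintain under load
sysctl dev.raid.speed_limit_max    # ceiling even when the system is idle
# Raising the floor speeds a rebuild at the cost of foreground latency:
#   sysctl -w dev.raid.speed_limit_min=50000
```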

Once rebuild and pvmove round 2 were complete (again no downtime):
`vgreduce data /dev/sdj` = removes sdj from the VG
`pvremove /dev/sdj` = removes LVM from disk
`mdadm --add /dev/md2 /dev/sdj` = Adds new disk as "spare"
`mdadm --grow /dev/md2 --raid-devices=4` = Restripes the array over the
existing 3 disks plus the new spare.  md2 is now larger, but the PV remained
the same size
`pvresize /dev/md2` = Adds the new capacity to the PV so that the new
extents can be allocated.
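The capacity step here is simple RAID5 arithmetic: usable space is (members - 1) x member size, and that growth is what pvresize then hands over to LVM.  A quick before/after sketch, assuming nominal 500GB members and ignoring metadata overhead:

```shell
# RAID5 usable capacity = (members - 1) * member size
before=$(( (3 - 1) * 500 ))   # 3-device array
after=$(( (4 - 1) * 500 ))    # after --grow to 4 devices
echo "${before}GB -> ${after}GB"
# prints 1000GB -> 1500GB
```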

This restripe process took about 5067 minutes (yep, a couple of days), but
the CPU was not taxed since the restripe ran quite slowly, given the SATA2
drives and controllers involved; it did not hinder accessibility too much.
I did not run any I/O tests to determine throughput, as I was just testing
to see if this would work in the first place.
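Those figures do imply an average reshape rate, though.  Using the array's data area as reported in /proc/mdstat (1465151808 1KiB blocks) over the 5067 minutes:

```shell
# Average reshape rate: array data area over elapsed wall-clock time
blocks_kib=1465151808   # from /proc/mdstat
minutes=5067
echo "$(( blocks_kib / (minutes * 60) )) KiB/s"
# prints 4819 KiB/s
```

Call it roughly 4.7 MiB/s of progress; the underlying disk traffic was a multiple of that, since a reshape reads and rewrites every stripe.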

So now this is what I have (the drive letters changed after a recent
unrelated reboot):

# cat /proc/mdstat
Personalities : [raid1] [raid0] [raid6] [raid5] [raid4] [linear]
md2 : active raid5 sdl1[3] sda1[0] sdb1[2] sdd1[1]
      1465151808 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU]

md0 : active raid1 sdc1[0] sdk1[1]
      136448 blocks [2/2] [UU]

md1 : active raid1 sdk3[1] sdc3[0]
      290904896 blocks [2/2] [UU]

unused devices: <none>

# pvs
  PV         VG   Fmt  Attr PSize   PFree
  /dev/md1   sys  lvm2 a-   277.43G 136.00M
  /dev/md2   data lvm2 a-     1.36T 481.57G

 # lvs
  LV     VG   Attr   LSize   Origin Snap%  Move Log Copy%  Convert
  video  data -wi-ao 465.70G
  video1 data -wi-ao 450.00G
  export sys  -wi-ao  95.29G
  ftp    sys  -wi-ao  25.00G
  home   sys  -wi-ao  10.00G
  root   sys  -wi-ao   7.00G
  vm     sys  -wi-ao  80.00G
  vmdata sys  -wi-ao  60.00G

So that's it.  Neat huh?
