Updated: 7/10/2005
Disclaimer: These notes are a project in development, written primarily
for myself as I try to understand the process of implementing a Coraid
EtherDrive.
If you happen across these notes, treat any information contained herein with
the appropriate distrust. Neither I nor my organization is responsible for
errors in this document or their consequences.

Implementing a Coraid EtherDrive Storage Blade unit

We purchased an EtherDrive unit with 10 Parallel-ATA drive bays to provide disk storage for disk-to-disk backups and supplemental storage for our web server.

For disk-to-disk backups, we usually follow a daily, weekly, and monthly backup scheme and do not really need RAID storage at this time. To connect the EtherDrive unit, I chose to create a separate LAN independent of our office LAN. I chose the 3COM 4226T switch because it meets CORAID's suggestion that the switch support 802.3ad (port aggregation, or LACP) and 802.3x (flow control). The price was reasonable, and the switch has enough ports (24 in total) to dedicate one port to each blade and one to each server, even if we end up with a 1:1 correspondence between servers and blades. We're starting with Fedora Core 5 Linux, so I downloaded 'aoe-2.6-31.tar.gz' from the CORAID support site.

The AOE drivers for Linux

To install AOE for our Fedora 2.6 kernel (SMP):

The AOEtools

The AOEtools are included in the aoe-2.6-31.tar.gz tar file: tools:

The Hardware

I populated the EtherDrive chassis with ten 200GB drives and placed it in our rack alongside the 3COM switch. I used the bottom row of switch connections for the drive shelves and the top row to connect to other machines.

I also used the switch for mailserver3 to send a heartbeat signal to mailserver4 and vice versa, so this was a good test that the switch worked. The EtherDrive unit itself needs a shelf address, which is set with DIP switches on the back of the unit. Since this is our only unit, I set the shelf address to 0. Shelf addresses allow multiple EtherDrive units to be connected to the same switch and LAN, and the shelf address is also part of the /dev/etherd/nnn block device name. For example, /dev/etherd/e0.1 refers to the blade in slot 1 of shelf 0.

Setting up a partition

I installed the EtherDrive software on mailserver4 (running Fedora Core 3) and rebooted. The devices should show up in /dev/etherd. If they don't, make sure the following lines are in /etc/modprobe.conf:

alias block-major-152 aoe
alias char-major-152 aoe

Udev also needs rules for the aoe device nodes:

# These rules tell udev what device nodes to create for aoe support.
# They may be installed along the following lines (adjusted to what
# you see on your system).
#
#   ecashin@makki ~$ su
#   Password:
#   bash# find /etc -type f -name udev.conf
#   /etc/udev/udev.conf
#   bash# grep udev_rules= /etc/udev/udev.conf
#   udev_rules="/etc/udev/rules.d/"
#   bash# ls /etc/udev/rules.d/
#   10-wacom.rules 50-udev.rules
#   bash# cp /path/to/linux-2.6.xx/Documentation/aoe/udev.txt \
#           /etc/udev/rules.d/60-aoe.rules
#

# aoe char devices
SUBSYSTEM="aoe", KERNEL="discover", NAME="etherd/%k", GROUP="disk", MODE="0220"
SUBSYSTEM="aoe", KERNEL="err", NAME="etherd/%k", GROUP="disk", MODE="0440"
SUBSYSTEM="aoe", KERNEL="interfaces", NAME="etherd/%k", GROUP="disk", MODE="0220"

# aoe block devices
KERNEL="etherd*", NAME="%k", GROUP="disk"

With the device driver in place and the /dev/etherd device nodes present, I was able to run fdisk /dev/etherd/e0.1 and create two partitions:

Disk /dev/etherd/e0.1: 251.0 GB, 251000193024 bytes
255 heads, 63 sectors/track, 30515 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

            Device Boot  Start    End     Blocks  Id  System
/dev/etherd/e0.1p1           1     31     248976  83  Linux
/dev/etherd/e0.1p2          32  30515  244862730  83  Linux

In our disk-to-disk backup scheme, I wanted to use the autofs facility to mount our backup devices in /daily, /weekly, and /monthly. I wanted the "daily" backup device to be the 1st EtherDrive blade, so on mailserver3 I added an entry to /etc/auto.master for the /daily directory like this:

# $Id: auto.master,v 1.2 1997/10/06 21:52:03 hpa Exp $
# Sample auto.master file
# Format of this file:
# mountpoint map options
# For details of the format look at autofs(8).
/daily        /etc/auto.daily
/misc         /etc/auto.misc
/r            /etc/auto.direct
/workstation  /etc/auto.workstation

mailserver3.acd.ucar.edu:/home/fredrick/aoetools-4> cat /etc/auto.daily
#
# $Id: auto.misc,v 1.2 2003/09/29 08:22:35 raven Exp $
#
# This is an automounter map and it has the following format
# key [ -mount-options-separated-by-comma ] location
# Details may be found in the autofs(5) manpage
var   -fstype=ext3,rw  :/dev/etherd/e0.0p1
home  -fstype=ext3,rw  :/dev/etherd/e0.0p2
etc   -fstype=ext3,rw  :/dev/etherd/e0.0p3
usr   -fstype=ext3,rw  :/dev/etherd/e0.0p5
free  -fstype=ext3,rw  :/dev/etherd/e0.0p6

An rsync command verified that I could populate the partition with files, so it was a simple matter to create a daily backup script to rsync to the above filesystems.

Unmounting early in shutdown

It is important in the shutdown and reboot scripts to unmount a drive early, before the network is lost. Otherwise, unwritten data (open inodes and so on) may remain, with no network left over which to write it to disk. Using autofs seemed like the best solution to me, since autofs is shut down before the network in most cases.

Note that when creating RAID devices, these devices should be initialized after the network is initialized. This cannot be done in /etc/raidtab, so CORAID recommends setting up a RAID configuration separately (such as in /etc/rt) and mounting from rc.local (e.g., raidstart -c /etc/rt /dev/md1 ; mount /dev/md1 /mnt/raid). See http://www.coraid.com/support/linux/EtherDrive-2.6-HOWTO.html

Performance

Communicating with an AOE drive is of course quite a bit slower than communicating with a directly attached drive. Performance can be improved somewhat by striping multiple drives together in a RAID configuration, but the bottleneck is more or less the 100 Mbps network interface. EtherDrive partitions should not be used where performance is critical.

Also note that fsck runs could take a very long time because of the slower performance of the drives.
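
To put the network bottleneck described above in rough numbers, here is a minimal Python sketch that estimates the best-case time to read one whole blade over the link. It assumes the interface figure means 100 megabits per second, that the link is the only limit, and that AoE and Ethernet framing overhead can be ignored, so real times will be longer. The function name and its default are illustrative assumptions, not measurements; the disk size is the byte count that fdisk reported for /dev/etherd/e0.1.

```python
def best_case_transfer_seconds(num_bytes, link_megabits_per_sec=100):
    """Return the ideal time in seconds to move num_bytes over the link.

    Assumes the link is the only bottleneck and ignores protocol overhead.
    """
    if link_megabits_per_sec <= 0:
        raise ValueError("link speed must be positive")
    # Convert megabits per second to bytes per second.
    bytes_per_sec = link_megabits_per_sec * 1_000_000 / 8
    return num_bytes / bytes_per_sec


if __name__ == "__main__":
    disk_bytes = 251000193024  # size of /dev/etherd/e0.1 as reported by fdisk
    hours = best_case_transfer_seconds(disk_bytes) / 3600
    print(f"Best-case full read of one blade: {hours:.1f} hours")
```

Under these assumptions a single full pass over one blade takes a little over five and a half hours, which gives a feel for why anything that touches most of the disk is slow.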
To make the periodic checks less frequent, you can set these tune2fs options (-i sets the interval between checks in days, -c the maximum mount count):

mailserver4:/etc/rc.d> sudo tune2fs -i 300 /dev/etherd/e0.1p2
tune2fs 1.36 (05-Feb-2005)
Setting interval between check 25920000 seconds
mailserver4:/etc/rc.d> sudo tune2fs -c 400 /dev/etherd/e0.1p2
tune2fs 1.36 (05-Feb-2005)
Setting maximal mount count to 400

Moving drives between servers

An EtherDrive blade (or several) may be moved between servers simply by unmounting the drive on one server and mounting it on another. It might also be possible for one server to have read/write access to a drive while another has read-only access, but I haven't yet explored that option.

Troubleshooting

The aoe-stat program may be used to see the status of the various EtherDrive blades (as seen from the perspective of the Fedora Linux aoe device driver module). For example:
mailserver4:/etc/udev/rules.d> sudo aoe-stat
e0.0 eth1 up
e0.1 eth1 up
e0.2 eth1 up
e0.3 eth1 up
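
For scripted monitoring, output in the form shown above can be parsed with a few lines of Python. This is only a sketch: it assumes each line has exactly the three columns seen here (device, interface, state), which may differ between aoetools versions, and the name parse_aoe_stat is made up for illustration. It also splits each device name into its shelf and slot numbers.

```python
import re

# Matches names such as "e0.1": shelf 0, slot 1.
DEVICE_RE = re.compile(r"^e(\d+)\.(\d+)$")


def parse_aoe_stat(text):
    """Parse aoe-stat style lines into a list of dicts.

    Assumes three whitespace-separated columns: device, interface, state.
    Lines that do not fit this shape are skipped.
    """
    blades = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue
        match = DEVICE_RE.match(fields[0])
        if match is None:
            continue
        blades.append({
            "device": fields[0],
            "shelf": int(match.group(1)),
            "slot": int(match.group(2)),
            "interface": fields[1],
            "state": fields[2],
        })
    return blades


if __name__ == "__main__":
    sample = """\
e0.0 eth1 up
e0.1 eth1 up
e0.2 eth1 up
e0.3 eth1 up
"""
    blades = parse_aoe_stat(sample)
    # Report any blade whose state is not "up".
    for blade in blades:
        if blade["state"] != "up":
            print(f"{blade['device']} is {blade['state']}")
    print(f"{len(blades)} blades seen")
```

A script like this could be fed the real command output and run from cron to flag blades that drop off the network.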
CORAID's FAQ also contains a lot of useful troubleshooting information. Another common problem is a mismatch between the aoe module and the running kernel. The kernel development files for single-CPU kernels are installed with the command yum install kernel-devel, and those for SMP kernels with the command yum install kernel-smp-devel. Taking a look at the output of /sbin/modinfo /lib/modules/2.6.17-1.2157_FC5smp/kernel/drivers/block/aoe/aoe.ko may help. In particular, look at the "vermagic" string; it should begin with the kernel version that uname -r reports.