Wednesday, July 06, 2005

Configuring NeoPath for multi-tiered storage

The NeoPath Solution

We have embarked on a path whereby data is no longer incrementally backed up to tape, but rather backed up to a second tier of disk space. The reasoning is that tape technology can no longer keep up with disk technology and pricing. As our primary storage grows year over year, a long-term strategy for backing it up is required. Tape is still used, but it is now relegated to archival purposes only, and at long intervals at that.

We use a variety of 1st tier NAS solutions, from NetApp to commodity storage, with and without their own snapshot technologies. In most cases, snapshots provide fine-grained file-level backups covering hourly changes, or at least multiple images of the filesystem per day. However, each solution tends to have its own snapshot directories, and not all of them are relative to each directory in a volume; in some cases, snapshots are only available at the root of a volume.

Our 2nd tier solutions are even more commodity, consisting of standard journaling filesystems on large volumes representing at least double the capacity of their matching first tier. We provide backups as hard-link style snapshots built with rsnapshot. These pull an initial copy from the first tier, either with rsync or via an NFS mount, and periodically (daily) copy over deltas into new dated directories, preserving untouched files with hard links. 2nd tier storage is coarser, representing daily snapshots over multiple months of incremental diffs. At regular intervals (6 months on average), a 2nd tier snapshot is used as the source of a tape archive. Again, these snapshots tend to differ from the first tier in layout and directory structure. More importantly, the 2nd tier systems are not directly accessible by the end users.
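
To illustrate, a fragment of an rsnapshot configuration along these lines might look roughly like the following. The paths are hypothetical stand-ins, the 180 daily snapshots are chosen to roughly match our 6 month archival interval, and rsnapshot itself insists on tab separators between fields (shown here as spaces).

config_version  1.2
snapshot_root   /tier2/vol7/
cmd_rsync       /usr/bin/rsync
interval        daily   180
backup          /mnt/tier1/vol7/users/      users/

A nightly cron entry running "rsnapshot daily" then rotates the dated directories, hard-linking unchanged files between them.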

Finally, the multi-tier solution spreads storage across multiple distinct servers. How do end users know where to get their data, and how can they perform self-serve restorations? The solution we have found is NeoPath. This product acts as an NFS or CIFS aggregator, allowing new logical paths to be defined that consolidate storage into a single logical tree if necessary. It also provides for live data migration between servers, so it protects one's continuing investment in 1st and 2nd tier storage solutions, allowing for the acquisition of new 1st tier storage and the migration of older storage to the back end or out of service entirely. The migration capability can be critical when failing hardware needs to be replaced and it's impossible to disentangle a system from centralized storage services. Other features include defining virtual servers, synthetic directory trees, and synthetic links and unions formed from back end mounted file servers.

Minor Quibbles

For all of its advantages, the NeoPath product still has its faults. Primarily, design decisions made to do the right thing can at first get in the way of a basic implementation. The first problem we noticed is that NeoPath, on its face, requires each back end share to be read-write, since it needs to store metadata about that share on each back end file server (within a hidden directory). In the case of directly exported snapshot filesystems, you are only given a read-only filesystem by design of the NAS. Second, even in the case of our 2nd tier hard-link file snapshots, we do not wish to expose that filesystem to the outside as read-write. When you re-export multiple tiers into a single tree, clients are generally given read-write permissions to their primary directories, and it's only possible to enforce read-only permissions on parts of that tree if the NeoPath itself is reading them read-only.

The other problem is that although clients can mount at any permissible point in a share, the NeoPath product itself only permits inclusion of back end directories into its virtual trees via the explicitly defined exports of the back end file servers. You cannot directly reference a deeper directory when forming unions or synthetic links. It will also only honor the first mount point it encounters when traversing a back end file server. Thus, you cannot get around the issue by defining multiple levels of export points to mount from.

All Problems Have Solutions

We have successfully resolved these issues with a little guidance from NeoPath. Over time, I hope to refine and further explain these solutions.

First, while the web-based GUI does not let you explicitly define certain options for accessing back end servers, the command line environment does. You need to consult the command line reference guide, but the gist is that to allow for read-only volumes, one can define alternative dstorage locations (NeoPath speak). What we did was define a small 64MB read-write share from a NAS. That was the smallest available volume offered by our NAS, so you can likely go smaller. This volume will serve as metadata storage for all file servers if you instruct the NeoPath product to do so. Now we can pull in any type of share regardless of write permissions.
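
For those without a spare NAS volume, the same kind of tiny read-write share can be carved out of any commodity NFS server. The sketch below assumes a Linux box and a hypothetical hostname for the NeoPath director; the actual dstorage assignment is still performed from the NeoPath command line as described in its reference guide.

# create a small directory to hold NeoPath's per-share metadata
mkdir -p /export/neopath-meta
# export it read-write to the NeoPath director only
echo '/export/neopath-meta  neopath-director(rw,no_root_squash,sync)' >> /etc/exports
exportfs -ra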

The second problem has a similar but more involved solution. One can get past the issue of deep linking on back end servers by creating another minimal volume where symbolic links will be created. The idea is that when creating a synthetic directory, one should mount all back end file servers under a uniform path. The primary paths in the synthetic directory that users will see should be created in the small volume, using relative symbolic links to those uniform paths, including any necessary deep references. An example would be useful here. Take this directory structure:

/myorg/users : (/myorg/users/jlittle -> /myorg/tier/1/vol7/users/jlittle)
/myorg/backup/tier1 : (/myorg/backup/tier1/users/jlittle -> /myorg/tier/1/vol7/.snap)
/myorg/backup/tier2 : (/myorg/backup/tier2/users/jlittle -> /myorg/tier/2/vol7)
/myorg/tier/1 : (contains vol1 through vol7 mounts of the tier 1 system)
/myorg/tier/2 : (contains vol1 through vol7 mounts via a union of the tier 2 systems)

In the above example, the users and backup trees are synthetic links to two small volumes defined on a NAS for the purpose of generating symbolic link trees. The last two lines are direct synthetic links to back end 1st and 2nd tier storage. The various backup points are actually links to the head of each snapshot volume, as the user first needs to traverse into a date-labeled directory before proceeding into /users/username or the equivalent. To generate the symbolic link tree, I took output from file listings that show actual relative paths per user (e.g., ../vol7/users/jlittle) and built a little script to be run from a system mounting the base tree of /myorg.

#!/bin/bash
#
# Build the symbolic link trees on the small link volumes. Run from a
# system mounting the base /myorg tree under $MNTDIR.

MNTDIR=/mnt
SRCFILE=/root/users-lists    # one relative path per line, e.g. ../vol7/users/jlittle

while read LINE
do
    [ -n "$LINE" ] || continue                       # skip blank lines
    VOL=`echo $LINE | awk -F'/' '{ print $2 }'`      # e.g. vol7
    USER=`echo $LINE | awk -F'/' '{ print $4 }'`     # e.g. jlittle
    echo "$VOL $USER"

    # home directory link in the users tree
    cd "$MNTDIR/users"
    ln -sf "../tier/1/$VOL/users/$USER" "$USER"

    # 1st tier link to the head of the volume's snapshot directory
    cd ../backup/tier1/users/
    ln -sf "../../../tier/1/$VOL/.snap" "$USER"

    # 2nd tier link to the head of the volume's dated rsnapshot tree
    cd ../../tier2/users/
    ln -sf "../../../tier/2/$VOL" "$USER"
done < "$SRCFILE"


The only issue left is easy maintenance of this link list as users move around. It's an exercise left to administrators to tie this into their account creation and migration scripts and processes; a starting point for regenerating the list is sketched below.
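
The sketch assumes the same /mnt base used by the script above and the vol1 through vol7 naming from the example, and emits lines of the form ../vol7/users/jlittle.

#!/bin/bash
# Rebuild /root/users-lists from the tier 1 volumes mounted under /mnt/tier/1
cd /mnt/tier/1 || exit 1
for VOL in vol[1-7]
do
    for DIR in "$VOL"/users/*
    do
        # only emit entries for real user directories
        [ -d "$DIR" ] && echo "../$DIR"
    done
done > /root/users-lists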
