Wednesday, June 04, 2008

Recommended Disk Controllers for ZFS

Since I've been using OpenSolaris and ZFS (via NexentaStor, plug plug) extensively, I get a lot of emails asking what hardware works best. There have been various postings on the opensolaris and zfs lists to the same effect. A lot of people reference the OpenSolaris HCL, which leaves the average user scratching their head with more questions than answers. More to the point, the HCL doesn't tend to answer the more direct question: what hardware should I get to build a ZFS box, NAS, etc.? It's important to note that in the case of ZFS, all that extra checksum, fault management, and performance goodness can be negated by selecting a "supported" hardware RAID card. Worse yet, many RAID cards are not fully interchangeable on the spot. So what do you want for ZFS?

First, pick any 64-bit, dual-core or better motherboard and processor. If you can get ICH6+, nvidia, or Si3124-based onboard SATA, then you are in good shape for a basic ZFS box, using the onboard SATA for your system disks alone. System disks can be low-end 5400RPM 2.5-inch SATA-I drives. Many people then want a large-memory, battery-backed RAID card, but my tests with the high-end LSI SAS cards show that memory on the RAID card doesn't do you as much good as a recipe of lots of system RAM, a sufficient number of cores, many disk drives for the spindles, and sufficient use of the PCI-X/PCIe bus with JBOD-only disk controllers. I'll cover the controllers next, but at this point I'd recommend 4GB of RAM minimum, a dual core at greater than 2GHz, and, for any good load, at least two PCI-X or multi-lane PCIe cards.
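To make that baseline easy to check against a parts list, here is a minimal Python sketch. The `Host` class, its field names, and `baseline_shortfalls` are hypothetical names made up for illustration. The thresholds are only the rough figures suggested above (64-bit, dual core above 2GHz, 4GB of RAM, two PCI-X or multi-lane PCIe controller cards), not hard requirements of ZFS.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """A candidate ZFS box, described by hand (hypothetical structure)."""
    is_64bit: bool
    cores: int
    clock_ghz: float
    ram_gb: float
    hba_cards: int  # PCI-X or multi-lane PCIe disk controller cards


def baseline_shortfalls(host):
    """Return the ways a host misses the baseline suggested in this post."""
    problems = []
    if not host.is_64bit:
        problems.append("CPU/motherboard is not 64-bit")
    if host.cores < 2:
        problems.append("fewer than 2 cores")
    if host.clock_ghz <= 2.0:
        problems.append("clock is not above 2GHz")
    if host.ram_gb < 4:
        problems.append("less than 4GB of RAM")
    if host.hba_cards < 2:
        problems.append("fewer than two PCI-X or multi-lane PCIe cards")
    return problems


if __name__ == "__main__":
    candidate = Host(is_64bit=True, cores=2, clock_ghz=2.4, ram_gb=2, hba_cards=1)
    for problem in baseline_shortfalls(candidate):
        print("shortfall:", problem)
```

An empty list means the box meets the suggested baseline. The card count only matters if you expect real load.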

Disk controllers are where the real questions are asked. Over multiple iterations, heavy use, and some anecdotal evidence, we are down to some sweet spots. For PCI-X, there is one game in town: the Marvell-based AOC-SATA2-MV8, used in the X4500. At $100 for 8 JBOD SATA-II ports, it just works and is fault managed. Stick just SATA-II disks on these, and keep any SATA-I disks on the motherboard SATA ports for system disks. I'll add that various Si3124-based cards exist here, but not with sufficient port density.

SuperMicro AOC-SATA2-MV8 link

When it comes to PCIe, there aren't any good high-port-count options for SATA. If you need just 2 ports, or eSATA, there are various solutions based on the Si3124 chipset, and SIIG makes many of them for $50 each. However, in the PCIe world, the real answer is SAS HBAs that connect to internal or external mixed SAS/SATA disk chassis. Again, most SAS HBAs are either full-fledged RAID without JBOD support, or simply don't work in the OpenSolaris ecosystem. 3ware is a lost cause here. The true winner for both cost and performance, while providing the JBOD you want, is the LSI SAS3442E-R.

CDW catalog link for LSI 3442ER
LSI 3442ER product page

It's $250, but I've seen it as low as $130. It has 8 channels, with 2 internal ports (generally 8 drives are connected to a single SAS port) as well as the external port. You can use this with an external SAS-backed array of SATA drives from Promise, for instance, to easily populate 16 or 32 drives internally, with an additional 48 drives externally, just from the one card. Would I suggest that many on that single card? No, but you can. Loading up your system with 2 or 4 of these cards, which are based on the LSI 1068 chipset that is well supported by Sun, is the best way forward for scale-out performance. I was given some numbers of 200MB/sec writes and 400MB/sec reads on an example 12-drive system using RAIDZ. Good numbers, as I got 600MB/sec reads on a 48-drive X4500 Thumper.
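To put those throughput figures side by side, here is a rough Python sketch that divides each quoted aggregate number by its drive count. It is plain averaging over the figures quoted above. It ignores RAIDZ parity, pool layout, and workload, so treat the per-drive values as a back-of-the-envelope comparison rather than a benchmark. The names `QUOTED` and `per_drive_mb_s` are made up for illustration.

```python
# Aggregate throughput figures quoted in the post: (MB/sec, number of drives).
QUOTED = {
    "12-drive RAIDZ writes": (200, 12),
    "12-drive RAIDZ reads": (400, 12),
    "48-drive X4500 reads": (600, 48),
}


def per_drive_mb_s(aggregate_mb_s, drives):
    """Average aggregate throughput per drive, in MB/sec (naive division)."""
    if drives <= 0:
        raise ValueError("drives must be positive")
    return aggregate_mb_s / drives


# Per-drive averages keyed by the same labels as QUOTED.
PER_DRIVE = {label: per_drive_mb_s(mb, n) for label, (mb, n) in QUOTED.items()}

if __name__ == "__main__":
    for label, value in PER_DRIVE.items():
        print(f"{label}: {value:.1f} MB/sec per drive")
```

On these numbers the 12-drive system averages more read throughput per drive than the 48-drive X4500, which is why the smaller system's figures look good.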

If you have PCI-X, go Marvell. PCIe? Go LSI, but stick to the JBOD-capable, not-so-RAID HBAs. Don't just trust me; throw $100 or two at these and try it yourself. You'll see a better investment than $800 spent on the larger RAID cards. I went the latter route and have paid dearly (Adaptec, LSI, you name it). What worked from the beginning and is working today are the Marvell cards here, and I've been playing with new systems that use the LSI 3442ER.

15 comments:

NetSyphon said...

Bad hyperlink.. but a card that also works well with the Sun kernel is the LSI3801E.

It's 8-port SFF8087 (hmm, similar to Sun's $550 SAS HBA).

It's good for external JBODs like the ones AIC (pricey but top-notch backplane) and Supermicro make.

Also watch out for the Sun J4400 :D

Dave said...

Don't know how relevant this is, but there are a couple of Silicon Image-based PCI-Express SATA-II cards on Newegg. They use the Silicon Image SiI 3132 chipset, which I'm led to believe is the PCI-Express version of the SI 3124 chipset. Granted, they have a limited number of channels, but it is an option for a small NAS. I'm using one on a 4-disk, 3 TB ZFS NAS now and it works quite well. The latest OpenSolaris builds picked it right up.

Kris said...

I've just ordered 2 x SAS3081E-R + 1 x SAS3442E-R + 1 x LSISAS3801E + 22 x Seagate Barracuda ES.2 to go with the 12 x ES1's I currently have. I'm going to run a test with 2 x 12 disk Intel SRC212MC2R's + 1 x Supermicro 4U 16 disk storage arrays, all striped together with RAIDZ1. I'll let you know how I go.

Mysidia said...

"..the real answer is SAS HBAs that connect to internal or external mixed SAS/SATA disk chassis."

Any recommendations for external chassis?

Who makes good cheap boxes that I can plug external controllers such as the LSI SAS3442E into? Some part numbers to search for would be very helpful.

jmlittle@gmail.com (Joe Little) said...

Well, a previous commenter mentioned the Sun J4400, and then there are DotHill and other vendors providing the same.

http://www.dothill.com/products/direct-attached-storage/2530.htm

http://www.sun.com/storage/disk_systems/expansion/4400/

http://www.enhance-tech.com/products/ultrastor/RS16_JS.html

That's just a bit of googling.

jmlittle@gmail.com (Joe Little) said...

The one I had in mind at the time was Promise's J300S, which can be had for $2K or so.

http://www.atacom.com/program/print_html_new.cgi?Item_code=ERCA_PROM_01_07

Chris Du said...

For PCI-X, LSI also has 1068-based cards. I'm running out of PCIe slots on the motherboard, so I'm using an LSI 3800 HBA. It has 2 external connectors and is based on the LSI 1068 SAS chip. I connect it to a Supermicro 936E1 chassis, and it's running very well.

jms said...

Does anyone know if any of Dell, HP use LSI SAS?

jms said...

Hi, does anyone know if Dell, HP use LSI SAS? Or any other brand? Looking for OpenSolaris support. Thanks in advance. I have seen that the HP P410 uses PMC Sierra, but I'm not sure about JBOD/OpenSolaris support. Please comment.

jmlittle@gmail.com (Joe Little) said...

The Dell line of PERC RAID controllers (or whatever they call them these days) are re-branded LSI MegaRAID controllers. Sometimes the firmware differs. Check the OpenSolaris or Solaris 10 HCL for more details. HP uses SmartArray controllers, and I'm unsure if those work. In almost all cases, you want JBOD SAS controllers, not hardware RAID.

jms said...

Dell PERC 5i, 6i have been replaced by H200, H700, etc. Not sure what chipsets they use.

jms said...

I have tested Nexenta 3.0.3 on the HP P410i controller. It does not support JBOD? "could not find any drives".....
Why should HP, Dell ignore Solaris compatibility at a time when storage is booming?

Keith Waldron said...

AFAIK - the HP P410i cards can work as JBOD by setting up each single drive as a RAID 0. This is from HP phone support, I have yet to test with Nexenta.

jmlittle@gmail.com (Joe Little) said...

Anything that requires hardware RAID cards to be set to RAID0 per drive is less than useless, as you generally need to shut down and go to the BIOS to replace a drive (marking the new drive as also a RAID0 array). I'd hate to take a NAS down for drive maintenance.
