Recommended Disk Controllers for ZFS
Since I've been using OpenSolaris and ZFS (via NexentaStor, plug plug) extensively, I get a lot of emails asking what hardware works best. There have been various postings on the opensolaris and zfs lists to the same effect. A lot of people reference the OpenSolaris HCL lists, which leave the average user scratching their head with more questions than answers. More to the point, the HCL doesn't tend to answer the more direct question: what hardware should I get to build a ZFS box, NAS, etc.? It's important to note that in the case of ZFS, all that extra checksum, fault management, and performance goodness can be negated by selecting a "supported" hardware RAID card. Worse yet, many RAID cards are not fully interchangeable on the spot. So what do you want for ZFS?
First, pick any 64-bit dual core or better motherboard and processor. If you can get ICH6+, nvidia, or Si3124-based on-board SATA, then you are in good shape for your basic ZFS box, with the on-board SATA reserved for your system disks alone. System disks can be low-end 5400RPM 2.5 inch SATA-I drives. Many people then want a large-memory, battery-backed RAID card, but my tests with the high end LSI SAS cards show that memory on the RAID card doesn't do you as much good as a recipe of lots of system RAM, a sufficient number of cores, many disk drives for the spindles, and good use of the PCI-X/PCIe bus with JBOD-only disk controllers. I'll cover the controllers next, but I'd recommend at this point 4GB of RAM minimum, a dual core at greater than 2GHz, and for any good load, at least two PCI-X or multi-lane PCIe cards.
Disk controllers are where the real questions are asked. Over multiple iterations, heavy use, and some anecdotal evidence, we are down to some sweet spots. For PCI-X, there is one game in town: the Marvell-based AOC-SATA2-MV8, used in the X4500. At $100 for 8 JBOD SATA-II ports, it just works and is fault managed. Put only SATA-II disks on these, and keep any SATA-I disks on the motherboard SATA ports for system disks. I'll add that various Si3124-based cards exist here, but not with sufficient port density.
SuperMicro AOC-SATA2-MV8 link
When it comes to PCIe, there aren't any good high port count options for SATA. If you need just 2 ports, or eSATA, there are various solutions based on the Si3124 chipset, and SIIG makes many of them for $50 each. However, in the PCIe world, the real answer is SAS HBAs that connect to internal or external mixed SAS/SATA disk chassis. Again, most SAS HBAs are either full-fledged RAID without JBOD support, or simply don't work in the OpenSolaris ecosystem. 3ware is a lost cause here. The true winner for both cost and performance, while providing the JBOD you want, is the LSI SAS3442E-R.
CDW catalog link for LSI 3442ER
LSI 3442ER product page
It's $250, but I've seen it as low as $130. It has 8 channels, with 2 internal ports (generally 8 drives are connected to a single SAS port) as well as the external port. You can use this with an external SAS-backed array of SATA drives from Promise, for instance, to easily populate 16 or 32 drives internally, with an additional 48 drives externally, just from the one card. Would I suggest that many on that single card? No, but you can. Loading up your system with 2 or 4 of these cards, which are based on the LSI 1068 chipset that is well supported by Sun, is the best way forward for scale out performance. I was given some numbers of 200MB/sec writes and 400MB/sec reads on an example 12-drive system using RAIDZ. Those are good numbers, considering I got 600MB/sec reads on a 48-drive X4500 Thumper.
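To put those throughput figures on the same footing, here is a rough back-of-the-envelope sketch in Python that divides each quoted aggregate number by the raw drive count. The function name and labels are made up for illustration, and the calculation ignores parity drives, pool layout, and any other differences between the two systems, so treat the results as a loose comparison only.

```python
# Aggregate throughput figures quoted in the post: (drive count, MB/sec).
quoted = {
    "12-drive RAIDZ writes": (12, 200),
    "12-drive RAIDZ reads": (12, 400),
    "48-drive X4500 reads": (48, 600),
}


def per_drive(drives, aggregate_mb_s):
    """Naive per-drive throughput: aggregate MB/sec divided by raw drive count."""
    return aggregate_mb_s / drives


for label, (drives, mb_s) in quoted.items():
    print(f"{label}: {per_drive(drives, mb_s):.1f} MB/sec per drive")
```

By this crude measure the 12-drive system moves more data per spindle on reads than the 48-drive box did, which is why the smaller system's numbers look good.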
If you have PCI-X, go Marvell. PCIe? Go LSI, but stick to the JBOD-capable not-so-RAID HBAs. Don't just trust me: throw $100 or two at these and try it yourself. You'll find it a better investment than $800 on the larger RAID cards. I went the latter route and have paid dearly (Adaptec, LSI, you name it). What worked from the beginning and is working today are the Marvell cards here, and I've been playing with new systems that use the LSI 3442ER.
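For a quick sense of the economics, the prices quoted above can be turned into a cost per port. This is only an illustrative Python sketch: the function name is invented, and it assumes each of the LSI card's 8 channels counts as one directly attached drive, which understates what the card can address once external arrays are involved.

```python
# (card, price in USD, ports) using the prices quoted in the post.
cards = [
    ("AOC-SATA2-MV8 (PCI-X)", 100, 8),
    ("LSI SAS3442E-R at $250 (PCIe)", 250, 8),
    ("LSI SAS3442E-R at $130 (PCIe)", 130, 8),
]


def cost_per_port(price_usd, ports):
    """Price divided by port count; assumes one drive per port/channel."""
    return price_usd / ports


for name, price, ports in cards:
    print(f"{name}: ${cost_per_port(price, ports):.2f} per port")
```

The $800 RAID cards are left out because the post gives no port count for them.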
Labels: disk controller, nexenta, nexentastor, opensolaris, sas, SATA, zfs

4 Comments:
This post has been removed by the author.
Bad hyperlink. But a card that also works well with the Sun kernel is the LSI3801E.
It's 8-port SFF-8087 (hmm, similar to Sun's $550 SAS HBA).
It's good for external JBODs like the ones AIC (pricey, but top notch backplane) and Supermicro make.
Also watch out for the Sun J4400 :D
Don't know how relevant this is, but there are a couple of Silicon Image based PCI-Express SATA-II cards on Newegg. They use the Silicon Image SiI 3132 chipset, which I'm led to believe is the PCI-Express version of the Si3124 chipset. Granted, they have a limited number of channels, but they are an option for a small NAS. I'm using one on a 4 disk, 3 TB ZFS NAS now and it works quite well. The latest OpenSolaris builds picked it right up.
I've just ordered 2 x SAS3081E-R + 1 x SAS3442E-R + 1 x LSISAS3801E + 22 x Seagate Barracuda ES.2 to go with the 12 x ES1's I currently have. I'm going to run a test with 2 x 12 disk Intel SRC212MC2R's + 1 x Supermicro 4U 16 disk storage arrays, all striped together with RAIDZ1. I'll let you know how I go.