Wednesday, March 17, 2010

WD Caviar Green drives and ZFS (UPDATED)

We are in the process of outfitting a new primary storage system, and I was of a mind to buy more WD Caviar Green drives, specifically more of the 1.5TB WD15EADS drives, as we already had 4 new ones that had been tested behind a slower RAID card. Before buying more, I searched the usual suspects for pricing and found that the 1TB to 2TB versions of this drive are all priced very well, even for 5400RPM drives, but various sites and/or comments now note that they should not be used in RAID configurations. Hmm.

I did a little more research and found a blog post describing how one should avoid directly integrating these drives with ZFS. Since I had a couple on hand, I decided to put them in my server, with its LSI-3442E SAS HBA and SAS/SATA backplane, and test them. First, I tested my 500GB drives in a mirror set: running "ptime dd if=/dev/zero of=test1G bs=4k count=250000" against the ZFS volume built on those drives transferred 1GB in 3.63 seconds, or 282MB/sec. I then immediately ran the same test on my mirror set of the WD drives, which should have benefited from caching of the first write. After 50+ minutes of waiting, I killed the write and found I had transferred only 426MB, at a rate of 136KB/sec.
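To double-check the arithmetic behind those rates (the block size and count come straight from the dd invocation above):

```shell
# 250,000 blocks x 4,096 bytes = 1,024,000,000 bytes written by dd
# Fast case: ~1.02GB in 3.63 sec on the 500GB mirror
awk 'BEGIN { printf "%.0f MB/sec\n", 250000 * 4096 / 3.63 / 1000000 }'
# -> 282 MB/sec

# Slow case: 426MB on the Caviar Green mirror, assuming ~52 minutes
# elapsed (the "50+ minutes" above is approximate, so the rate is too)
awk 'BEGIN { printf "%.0f KB/sec\n", 426 * 1000000 / (52 * 60) / 1000 }'
```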

Yes, I can confirm that these drives are less than useless in a ZFS system (see update below), even as a simple two-disk mirror set. Some basic iostat output showed far too much "asvc_t" service time on the disks, running from 3.5 to 10 seconds per write, whereas the service times for the working 500GB drives were around 0.7msec. I also saw various mpt_handle_event_sync errors in my kernel logs, so perhaps there is some specific pathology between the SAS HBA, the SAS/SATA backplane, and these disks. However, we've proven this box works well with various other drives. I'm going to try yet another 1.5TB drive, likely the previously maligned Seagate drives, since I've yet to have trouble with the latest firmware on those. My 4 WD drives will be placed in enclosures for external Time Machine backups in the near future. WD Caviar Green != Enterprise RAID drives.
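The slow disks stand out immediately in the asvc_t column of `iostat -xn`. A small filter like this flags them; the sample lines in the here-doc are made up for illustration, and in a live run you would pipe `iostat -xn 1 2` into the awk instead:

```shell
#!/bin/sh
# Flag any disk whose average service time (asvc_t, field 8 of
# iostat -xn output) exceeds 100ms. The sample data is illustrative.
# Columns: r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
awk '$8 + 0 > 100 { printf "SLOW: %s asvc_t=%sms\n", $11, $8 }' <<'EOF'
    0.0  351.2    0.0 2809.6  0.0  0.3    0.0    0.7   0  24 c1t2d0
    0.0    0.4    0.0    1.6  0.0  3.9    0.0 9800.0   0 100 c1t4d0
EOF
# -> SLOW: c1t4d0 asvc_t=9800.0ms
```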

UPDATE:

I'm leaving the above as is, but I think I have discovered a bad drive in the set: when I employ 4 drives of this type in a straight RAID 0, I see odd I/O patterns but OK performance. However, I regularly have at least one drive with higher average service times, its writes trailing as it catches up to the other drives. With these 4 drives in a striped pool (RAID 0), I got 193MB/sec writes and 242MB/sec reads. Rearranging them into a RAID10 (two mirrored pairs), I got 78MB/sec writes and 278MB/sec reads.
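For anyone wanting to repeat the comparison, the two layouts are just different vdev arrangements at pool creation time. These are configuration commands only; the pool name and cXtYdZ device names below are placeholders for your own:

```shell
# Striped pool ("RAID 0"): all four drives as top-level vdevs
zpool create greentest c2t0d0 c2t1d0 c2t2d0 c2t3d0

# Rebuild as two mirrored pairs ("RAID10"): ZFS stripes writes
# across the two mirror vdevs and can read from all four spindles
zpool destroy greentest
zpool create greentest mirror c2t0d0 c2t1d0 mirror c2t2d0 c2t3d0
```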

Splitting them off into two separate RAID1 data pools, I ran my tests and still saw high service times on the drives (only 65msec or so, much better than the above, but still slow). Per-mirror-set performance was dismal: I regularly get 150MB/sec+ from a mirror of Caviar Blacks, but these drives just hit 31-34MB/sec (i.e., half of the RAID10 figure above). I guess with enough drives I'd get to better numbers in RAID10. In a RAIDZ1 (RAID5) grouping, it was 60MB/sec on writes and 172MB/sec on reads.
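The split-mirror numbers are at least internally consistent with the RAID10 result, since ZFS stripes writes across the two mirror vdevs:

```shell
# 78MB/sec RAID10 write rate, split across two mirror vdevs:
awk 'BEGIN { printf "%.0f MB/sec per mirror\n", 78 / 2 }'
# -> 39 MB/sec per mirror
# ...in the same ballpark as the 31-34MB/sec each standalone
# mirror pool achieved on its own.
```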

So what accounts for the dismal performance I originally saw? I think it happens when multiple pools are active and they are not all of this drive type. My original test had a Hitachi drive set as well as the WD Caviar Green drive set. Although my tests ran one at a time, I'm guessing there were some bad timing/driver issues and/or hardware issues when dealing with the mixed drive media.

A second, updated conclusion is that you can use these drives in an array, but only with drives of this type throughout. RAID10 will get you sufficient performance, but otherwise you'll want to leave these to secondary storage. Future drive replacement scenarios are a real cause for concern.

9 comments:

微笑每一天 said...

thx u very much, i learn a lot

Paul said...

This makes no sense.

The drives in a mirrored set are supposed to be independent and (particularly if they both start off empty) should be able to stream mirrored data independently at line rate. Any ideas why they act like this?

jmlittle@gmail.com (Joe Little) said...

I only have suspicions, and it's more to do with how the ZFS write transaction works (write/read/write for checksum). The various "green" drive technologies play tricks here and there, and I can only guess that it's the combination of this hardware and ZFS that gives us this problem. Again, these are SATA drives on a SAS/SATA backplane, SAS HBA, etc. Perhaps these problems will go away with updated drivers/kernel (I'm at a B134 build, or so says Nexenta 3.0 Beta1).

However, I later tested putting in different drives, including some WD SATA-I 400GB RE2 drives along with the 500GB Hitachi and 1TB WD Caviar Black. All worked fine separately and together. Add a Caviar Green, and boom, major suckage.

Drew said...

Very odd. I tested an array of WD Caviar Green 2TB drives in a Promise M610i connected to a snv_111b host via iSCSI on aggregated dual gigE as a raid10 pool. The drives are exported individually so that ZFS can do the mirroring.

I'm getting about 180MB/sec.

So it must be something about the direct interaction between ZFS and the WD Caviar Greens.

I thought this was worth mentioning to point out that there are configurations where these drives work fine.

jmlittle@gmail.com (Joe Little) said...

I agree completely that there are ways it works fine, and I suspect it's the direct interaction in my case. I just don't like that the drive will require special handling, since in an emergency you might not have the ability to, say, upgrade drive firmware first. I had these 4 1.5TB Caviar Green drives in an iSCSI array behind an older LSI RAID card and they were fine, using ZFS via the iSCSI client.

I'm planning a retest with no other drives present, but again, it's all a cautionary tale that you may not want to build out big, long-term storage with these. WD even says they're not for RAID. I'm targeting this system for around 50TB of primary storage for thousands of clients. Any whiff of problems and I need to look elsewhere.

Matt Connolly said...

I added a WD Green to my ZFS mirror (snv_134) and performance is terrible. I'm testing with netatalk and a Mac client over gigabit ethernet. If the WD Green drive is a single ZFS drive, its write performance is 10x better than if it is in a mirror with another SATA drive (a Samsung in this case). Read performance in the mirror is fine. It could be a ZFS issue, since the write performance is so drastically different in and out of the mirror. Or, as many people say: these drives are duds, and I want my money back.

Eric Snellmsn said...

You are probably using 4k-sector drives on an older OpenSolaris version. OpenIndiana build 147 might support these drives.

The issue is that the drive emulates 512-byte blocks, and every 512-byte block write requires a read of the 4k sector to get the other 3,584 bytes, and then the write.
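The penalty Eric describes is straightforward to quantify: a 512-byte emulated write that lands inside a 4KB physical sector forces the drive to read back the rest of the sector before it can rewrite it. As arithmetic:

```shell
# 4,096-byte physical sector minus the 512-byte logical write:
# bytes the drive must read back before rewriting the sector
awk 'BEGIN { printf "%d bytes of read-modify-write overhead\n", 4096 - 512 }'
# -> 3584 bytes of read-modify-write overhead
```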
