Skip Navigation

How to fix my ZFS pool mistakes

Back when I was even less experienced in self-hosting I setup my media/backup server using a RAIDZ1 array and 3 x 8TB disks. It's been running well for a while and I haven't had any problems and no disk errors.

But today I read a post about 'pool design rules' stating that RAIDZ1 configurations should not have drives over 1TB because the chances of errors occurring during re-silvering are high. I wish I had known this sooner.

What can I do about this? I send ZFS snapshots to 2 single large (18TB) hardrives for cold backups, so I have the capacity to do a migration to a new pool layout. But which layout? The same article I referenced above says to not use RAIDZ2 or RAIDZ3 with any less than 6 drives...I don't want to buy 3 more drives. Do I buy an additional 8TB drive (for a total of 4 x 8TB) and stripe across two sets of mirrors? Does that make any sense?

Thank you!

11
11 comments
  • Honestly, if you’re doing regular backups and your ZFS system isn’t being used for business you’re probably fine. Yes, you are at increased risk of a second disk failure during resilver but even if that happens you’re just forced to use your backups, not complete destruction of the data.

    You can also mitigate the risk of disk failure during resilver somewhat by ensuring that your disks are of different ages. The increased risk comes somewhat from the fact that if you have all the same brand of disks that are all the same age and/or from the same batch/factory they’re likely to die from age around the same time, so when one disk fails others might be soon to follow, especially during the relatively intense process of resilvering.

    Otherwise, with the number of disks you have you’re likely better off just going with mirrors rather than RAIDZ at all. You’ll see increased performance, especially on write, and you’re not losing any space with a 3-way mirror versus a 3-disk RAIDZ2 array anyway.

    The ZFS pool design guidelines are very conservative, which is a good thing because data loss can be catastrophic, but those guidelines were developed with pools that are much larger than yours and for data in mind that is fundamentally irreplaceable, such as user generated data for a business versus a personal media server.

    Also, in general backups are more important than redundancy, so it’s good you’re doing that already. RAID is about maintaining uptime, data security is all about backups. Personally, I’d focus first on a solid 3-2-1 backup plan rather than worrying too much about trying to mitigate your current array suffering catastrophic failure.

  • I think it's worth pointing out that this article is 11 years old, so that 1TB rule-of-thumb probably probably needs to be adjusted for modern disks.

    If you have 2 full backups (18TB drives being more than sufficient) of the array, especially if one of those is offsite, then I'd say you're really not at a high enough risk of losing data during a rebuild to justify proactively rebuilding the array until you have at least 2 or more disks to add.

  • Those failure rates are nonsense based on theoretical limits. In practice: a few weeks ago I resilvered a 4 disk z1 array with 8tb disks at 85% capacity in less than 24 hours. I scrub the pool every month and it didn't seem any more taxing than a scrub.

  • It’s fine. RAID is not a backup. I’ve been running simple mirrors for many years and never lost data because I have multiple backups. Focus on offsite and resilient backups, not how many drives can fail in your primary storage device.

  • That article is from 2013, so I'm a bit skeptical about the claims about under 1 TB drives. It was probably reasonable advice back then when 1 TB capacities were sorta cutting edge. Now we have 20+ TB hard drives, nobody's gonna be making arrays of 750 GB drives.

    I have two 4TB drives in a simple mirror configuration and have resilvered it a few times due to oopsies and it's been fine, even with my shitty SATA ports.

    The main concern is bigger drives take longer to resilver because well, it's got much more data to shuffle around. So logically, if you have 3 drives that are the same age and have gotten the same amount of activity and usage, when one gives up it would be likely for the other 2 to be getting close as well. If you only have 1 drive of redundancy, then this can be bad because temporarily, you have no redundancy so one more drive failure and the zpool is gone. If you're concerned about them all failing at the same time, the best defense is either different drive brands, or different drive ages.

    But you do have backups, so, if that pool dies, it's not the end of the world. You can pull it back from your 18TB mirror array. And it's different drives, so those are unlikely to fail at the same time as your 3x4TB drives, let alone 2 more of them. You need 4 drives to give up in total in your particular case before your data is truly gone. That's not that bad.

    It's a risk management question. How much risk do you tolerate? How's your uptime requirements? For my use case, I deemed a simple 2 drive mirror to be sufficient for my needs, and I have a good offsite backup on a single USB external drive, and an encrypted cloud copy of things that are really critical and I can't possibly lose like my Keepass database.

    • Bit error rates have barely improved since then. So the probability of an error whenr reading a substantial fraction of a disk is now higher than it was in 2013.

      But as others have pointed out. RAID is not, and never was, a substitute for a backup. Its purpose is to increase availability. And if that is critical to your enterprise, these things need to be taken into account, and it may turn out that raidz1 with 8 TB disks is fine for your application, or it may not. For private use, I wouldn't fret. but make frequent backups.

      This article was not about total disk failure, but about the much more insidious undetected bit error.

  • Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I've seen in this thread:

    Fewer Letters More Letters
    RAID Redundant Array of Independent Disks for mass storage
    SATA Serial AT Attachment interface for mass storage
    ZFS Solaris/Linux filesystem focusing on data integrity

    3 acronyms in this thread; the most compressed thread commented on today has 30 acronyms.

    [Thread #630 for this sub, first seen 26th Mar 2024, 18:55] [FAQ] [Full list] [Contact] [Source code]

11 comments