Redundancy in Data Storage: Part 1: RAID Levels

Joe Onisick (@JoeOnisick), December 13, 2010 (updated February 21, 2011)

I recently read Joe Onisick’s piece, “Have We Taken Data Redundancy Too Far?”  I think Joe raises a good point, and this is a natural topic to dissect in detail after my previous article about cloud disaster recovery and business continuity.  I, too, am concerned by the variety of data redundancy architectures used in enterprise deployments and the duplication of redundancy on top of redundancy that often results.  In a series of articles beginning here, I will focus on architectural specifics of how data is stored, the performance implications of different storage techniques, and likely consequences to data availability and risk of data loss.

The first technology that comes to mind for most people when thinking of data redundancy is RAID, which stands for Redundant Array of Independent Disks.  There are a number of different RAID technologies, but here I will discuss just a few.  The first is mirroring, or RAID-1, which is generally employed with pairs of drives.  Each drive in a RAID-1 set contains exactly the same information.  Mirroring generally provides double the random-access read performance of a single disk, while providing approximately the same sequential read performance and write performance.  The usable capacity is that of a single drive; in other words, half of the raw capacity is sacrificed.

RAID-1, or Mirroring (image courtesy Colin M. L. Burnett)

A useful figure of merit for data redundancy architectures is MTTDL, or Mean Time To Data Loss, which can be calculated for a given storage technology from the underlying MTBF (Mean Time Between Failures) and MTTR (Mean Time To Repair/restore redundancy).  All “mean time” metrics really specify an average rate over an operating lifetime; in other words, if the MTTDL of an architecture is 20 years, there is roughly a 1-in-20 (5%) chance of suffering data loss in any given year.  Similarly, MTBF specifies the rate of underlying failures.  MTTDL includes only failures in the storage architecture itself, and not the risk of a user or application corrupting data.

For a two-drive mirror set, the classical calculation is:
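In rough form, with MTBF the mean time between individual drive failures and MTTR the time needed to rebuild the mirror onto a replacement drive (constant factors of order one, such as which of the two drives fails first, vary between treatments):

$$\mathrm{MTTDL}_{\text{mirror}} \sim \frac{\mathrm{MTBF}^{2}}{\mathrm{MTTR}}$$

Intuitively, data is lost only if, after one drive fails, the surviving drive also fails within the MTTR window before redundancy is restored, an event with probability on the order of MTTR/MTBF.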

This is a common reason to have hot-spares in drive arrays; allowing an automatic rebuild significantly reduces MTTR, which would appear to also significantly increase MTTDL.  However…

While hard drive manufacturers claim very large MTBFs, large-scale field studies have consistently found numbers closer to 100,000 hours.  If recovering/rebuilding the array takes 12 hours, the MTTDL would be very large, implying an annual risk of data loss of less than 1 in 95,000.  Things don’t work this well in the real world, for two primary reasons:

  • The calculation optimistically assumes that the failure risks of the two drives in the array are uncorrelated.  Because the disks in an array were likely sourced at the same time and have experienced similar loading, vibration, and temperature over their working lives, they are more likely to fail at around the same time.  Some failure modes also risk eliminating both disks simultaneously, such as a facility fire or a hardware failure in the enclosure or disk controller operating them.
  • It is also assumed that the repair will successfully restore redundancy as long as a further drive failure doesn’t occur.  Unfortunately, a mistake may happen if personnel are involved in the rebuild.  The still-functioning drive is also under heavy load during recovery and may experience an increased risk of failure.  But perhaps the most important factor is that, as capacities have increased, the Unrecoverable Read Error rate, or URE, has become significant.  Even without a failure of the drive mechanism, drives will permanently lose blocks of data at this specified (very low) rate, which generally ranges from 1 error per 10^14 bits read for low-end SATA drives to 1 per 10^16 for enterprise drives.  Assuming that the drives in the mirror are 2 TB low-end SATA drives, and that there is no risk of a rebuild failure other than from unrecoverable read errors, the rebuild failure rate is roughly 17% (a quick numerical check follows below).
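As a quick numerical check on that figure, here is a minimal back-of-the-envelope sketch (assuming a 2 TB drive, a 10^-14-per-bit URE rate, a 100,000-hour MTBF, and independent bit errors; the exact percentage shifts slightly depending on decimal versus binary terabytes and on whether one uses the expected error count or the probability of at least one error, which is why it lands near, rather than exactly on, the 17% and 587,000-hour figures used in the text):

```python
import math

# Back-of-the-envelope for the mirror rebuild-failure figure, under the stated
# assumptions: 2 TB (decimal) drive, low-end SATA URE rate of 1e-14 per bit,
# 100,000-hour field-observed MTBF, independent per-bit errors.
DRIVE_BYTES = 2e12       # 2 TB drive
URE_PER_BIT = 1e-14      # unrecoverable read errors per bit read (low-end SATA)
MTBF_HOURS = 100_000     # field-observed drive MTBF

bits_read = DRIVE_BYTES * 8                           # a rebuild rereads the whole survivor
expected_ures = bits_read * URE_PER_BIT               # ~0.16 expected unrecoverable errors
rebuild_failure_rate = 1 - math.exp(-expected_ures)   # P(at least one URE) ~ 15%

# URE-dominated mirror MTTDL, ignoring constant factors of order one; this lands
# in the same ballpark as the article's 17% and 587,000-hour figures.
mttdl_hours = MTBF_HOURS / rebuild_failure_rate
print(f"expected UREs per rebuild : {expected_ures:.2f}")
print(f"rebuild failure rate      : {rebuild_failure_rate:.1%}")
print(f"approximate MTTDL         : {mttdl_hours:,.0f} hours "
      f"(~{mttdl_hours / 8766:.0f} years)")
```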
RAID 1+0: Mirroring and Striping (image courtesy MovGP)

With the latter in mind, the MTTDL becomes:
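With RFR denoting the probability that a given rebuild fails for any reason other than a second whole-drive failure, and again ignoring constant factors of order one, this is roughly:

$$\mathrm{MTTDL}_{\text{mirror}} \approx \frac{\mathrm{MTBF}}{\dfrac{\mathrm{MTTR}}{\mathrm{MTBF}} + \mathrm{RFR}}$$

The denominator is the probability that an attempt to restore redundancy fails, either because the surviving drive dies during the rebuild window (roughly MTTR/MTBF) or because the rebuild itself fails (RFR).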

When the rebuild failure rate is large compared to MTTR/MTBF (the chance of a second complete drive failure during the rebuild window), this simplifies to:
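$$\mathrm{MTTDL}_{\text{mirror}} \approx \frac{\mathrm{MTBF}}{\mathrm{RFR}}$$

With MTBF = 100,000 hours and RFR near 17%, this works out to roughly 590,000 hours, consistent with the figure quoted below.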

In this case, MTTDL is approximately 587,000 hours, or a 1 in 67 risk of losing data per year.

RAID-1 can be extended to many drives with RAID-1+0, where data is striped across many mirrors.  In this case, capacity, and often performance, scales linearly with the number of stripes.  Unfortunately, so does the failure rate.  When one moves to RAID-1+0, the MTTDL can be determined by dividing the above by the number of stripes.  A ten-drive (five stripes of two-disk mirrors) RAID-1+0 set of the above drives would have a 15% chance of losing data in a year (again without considering correlation in failures).  This is worse than the failure rate of a single drive.
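That is, for S mirrored pairs striped together:

$$\mathrm{MTTDL}_{\text{RAID-1+0}} \approx \frac{\mathrm{MTTDL}_{\text{mirror}}}{S}$$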

RAID-5 and RAID-6 (images courtesy Colin M. L. Burnett)

Because of the amount of storage required for redundancy in RAID-1, it is typically used only for small arrays or for applications where data availability and performance are critical.  RAID levels using parity are widely used to trade some performance for additional usable capacity.

RAID-5 stripes blocks across a number of disks in the array (a minimum of 3, but generally 4 or more), storing parity blocks that allow one drive to be lost without losing data.  RAID-6 works similarly (with more complicated parity math and more storage dedicated to redundancy) but allows up to two drives to be lost.  Generally, when a drive fails in a RAID-5 or RAID-6 environment, the entire array must be reread to restore redundancy (and during this time, application performance usually suffers).

While SAN vendors have attempted to improve performance for parity RAID environments, significant penalties remain.  Sequential writes can be very fast, but random writes generally entail reading neighboring information to recalculate parity.  This burden can be partially eased by remapping the storage/parity locations of data using indirection.
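To make the random-write penalty concrete, here is a minimal sketch of the classic read-modify-write parity update (read_block and write_block are hypothetical helpers standing in for the array's block I/O layer, not any vendor's API):

```python
# Minimal sketch (not any vendor's implementation) of the RAID-5 "small write"
# penalty: updating a single data block costs four disk I/Os, because parity
# must be read and rewritten along with the data.

def raid5_small_write(read_block, write_block, data_disk, parity_disk, lba, new_data):
    """Update one block on data_disk and patch the stripe's parity in place."""
    old_data = read_block(data_disk, lba)      # I/O 1: read the old data block
    old_parity = read_block(parity_disk, lba)  # I/O 2: read the old parity block
    # Parity is a bytewise XOR across the stripe, so it can be patched without
    # reading the other data disks:
    new_parity = bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))
    write_block(data_disk, lba, new_data)      # I/O 3: write the new data
    write_block(parity_disk, lba, new_parity)  # I/O 4: write the new parity
```

A full-stripe sequential write can compute parity from data it already has in hand, skipping the two reads, which is why sequential writes can remain fast while small random writes pay a four-I/O penalty.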

For RAID-5, the MTTDL is as follows:
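For an N-drive array, with RFR now the probability that rereading the surviving drives during a rebuild hits an unrecoverable error, and again ignoring constant factors of order one, a standard form is roughly:

$$\mathrm{MTTDL}_{\text{RAID-5}} \approx \frac{\mathrm{MTBF}}{N\left(\dfrac{(N-1)\,\mathrm{MTTR}}{\mathrm{MTBF}} + \mathrm{RFR}\right)}$$

The leading N reflects that any of the N drives can be the first to fail, and the (N-1)·MTTR/MTBF term is the chance that a second whole drive fails during the rebuild.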

Again, when the RFR is large compared to (N-1)·MTTR/MTBF, the rate of double complete drive failure can be ignored:
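$$\mathrm{MTTDL}_{\text{RAID-5}} \approx \frac{\mathrm{MTBF}}{N \cdot \mathrm{RFR}}$$

Note that the loss rate now scales with both the number of drives and an RFR that grows with the amount of data that must be reread.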

However, here RFR is much larger, because it is calculated over the full contents of the surviving drives, all of which must be read to rebuild.  For example, achieving capacity equivalent to the above ten-drive RAID-1+0 set would require 6 drives with RAID-5.  The RFR here would be over 80%, yielding little benefit from the redundancy, and the array would have a 63% chance of failing in a year.

Properly calculating the RAID-6 MTTDL requires either Markov chains or very long series expansions, and there are significant differences in rebuild logic between vendors.  However, it can be estimated, when RFR is relatively large and an unrecoverable read error causes the array to entirely abandon using that disk for the rebuild, as follows:
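Qualitatively, the estimate follows the RAID-5 form above, except that losing data now requires the rebuild to hit unrecoverable errors on two different surviving disks (or suffer further whole-drive failures) before redundancy is restored, so the rebuild-failure term is effectively squared; very roughly:

$$\mathrm{MTTDL}_{\text{RAID-6}} \sim \frac{\mathrm{MTBF}}{N \cdot \mathrm{RFR}_{\mathrm{disk}}^{\,2}}$$

where the per-disk RFR is the chance that reading one surviving disk in full hits an unrecoverable error.  This is only the rough shape of the estimate; combinatorial factors for which pair of disks fails are omitted.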

Evaluating an equivalent, 7-drive RAID-6 array yields an MTTDL of approximately 100,000 hours, or a 1 in 11 chance of array loss per year.

The key things I note about RAID are:

  • RAID reduces the odds of data loss, but the resulting risk is still far from negligible, even under favorable assumptions.
  • Achieving high MTTDL with RAID requires the use of enterprise drives (which have a lower unrecoverable error rate).
  • RAID only protects against independent failures.  Additional redundancy is needed to protect against correlated failures (a natural disaster, a cabinet or backplane failure, or significant covariance in disk failure rates).
  • RAID only protects the data as written to disk.  If the application, users, or administrators corrupt data, RAID mechanisms will happily preserve that corrupted data.  Therefore, additional redundancy mechanisms are required to protect against these scenarios.

Because of these factors, additional redundancy is required in conventional application deployments, which I will cover in subsequent articles in this series.

Images in this article created by MovGP (RAID-1+0, public domain) and Colin M. L. Burnett (all others, CC-SA) from Wikipedia.

This series is continued in Redundancy in Data Storage: Part 2: Geographical Replication.

About the Author

Michael Lyle (@MPLyle) is CTO and co-founder of Translattice, and is responsible for the company’s strategic technical direction.  He is a recognized leader in developing new technologies and has extensive experience in datacenter operations and distributed systems.


Category: Technical Deep Dive | Tags: business continuity, disaster recovery, RAID, redundancy


Comments
Michael Zandstra says (November 8, 2012 at 7:18 am):

Hi Michael,

I was wondering if you could give a little more depth to this article concerning your formulas. I’ve worked some of them through, and I don’t get the same numbers as you do for RAID-5 and RAID-6. Since there is no explanation of precisely what you did, I can’t figure out who made the error.

Regards,

Michael Zandstra


Creative Commons License
This work by Joe Onisick and Define the Cloud, LLC is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License

Disclaimer

All brand and company names are used for identification purposes only. These pages are not sponsored or sanctioned by any of the companies mentioned; they are the sole work and property of the authors. While the author(s) may have professional connections to some of the companies mentioned, all opinions are that of the individuals and may differ from official positions of those companies. This is a personal blog of the author, and does not necessarily represent the opinions and positions of his employer or their partners.