Thursday, September 3, 2015

RAID Explained



RAID - Wikipedia, the free encyclopedia
RAID (redundant array of independent disks) is a data storage virtualization technology that combines multiple physical disk drive components into a single logical unit for the purposes of data redundancy, performance improvement, or both.[1]
Data is distributed across the drives in one of several ways, referred to as RAID levels, depending on the required level of redundancy and performance. The different schemas, or data distribution layouts, are named by the word RAID followed by a number, for example RAID 0 or RAID 1. Each schema, or a RAID level, provides a different balance among the key goals: reliability, availability, performance, and capacity. RAID levels greater than RAID 0 provide protection against unrecoverable sector read errors, as well as against failures of whole physical drives.

RAID Level 0 (Stripe)
RAID 0 is used to boost a server's performance. It's also known as "disk striping." With RAID 0, data is written across multiple disks, so reads and writes are handled by several drives in parallel rather than by just one, which improves disk I/O.
A minimum of two disks is required. Both software and hardware RAID support RAID 0, as do most controllers.
The downside is that there is no fault tolerance. If one disk fails, the entire array is affected and the data on it is likely to be lost or corrupted.
http://www.thegeekstuff.com/2010/08/raid-levels-tutorial/
  • Minimum 2 disks.
  • Excellent performance ( as blocks are striped ).
  • No redundancy ( no mirror, no parity ).
  • Don’t use this for any critical system.
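The striping described above can be pictured as a mapping from a logical block number to a disk and a position on that disk. The sketch below assumes plain round-robin placement with one block per stripe unit; the function name `raid0_locate` is made up for illustration, and real controllers use larger, configurable stripe sizes.

```python
def raid0_locate(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Map a logical block to (disk index, block offset on that disk).

    Assumes simple round-robin striping, one block per stripe unit.
    """
    if num_disks < 2:
        raise ValueError("RAID 0 needs at least two disks")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    return logical_block % num_disks, logical_block // num_disks


# Consecutive logical blocks land on different disks, so they can be
# read or written in parallel.
layout = [raid0_locate(block, 3) for block in range(6)]
print(layout)
```

With three disks, blocks 0, 1 and 2 go to disks 0, 1 and 2, and block 3 wraps around to disk 0 again. Losing any one disk leaves a hole in every stripe, which is why there is no redundancy.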
RAID Level 1 (Mirror)
RAID 1 is a fault-tolerance configuration known as "disk mirroring." With RAID 1, data is copied seamlessly and simultaneously from one disk to another, creating a replica, or mirror. If one disk gets fried, the other can keep working. It's the simplest way to implement fault tolerance and it's relatively low cost.
The downside is that RAID 1 causes a slight drag on write performance, since every write has to go to both disks. RAID 1 can be implemented through either software or hardware. A minimum of two disks is required for RAID 1 hardware implementations. With software RAID 1, instead of two physical disks, data can be mirrored between volumes on a single disk, although that offers no protection if the disk itself fails.
One additional point to remember is that RAID 1 cuts total disk capacity in half: If a server with two 1TB drives is configured with RAID 1, then total storage capacity will be 1TB not 2TB.
  • Minimum 2 disks.
  • Good performance ( no striping. no parity ).
  • Excellent redundancy ( as blocks are mirrored ).
Software RAID 1 solutions do not always allow a hot swap of a failed drive. That means the failed drive can only be replaced after powering down the computer it is attached to. For servers that are used simultaneously by many people, this may not be acceptable. Such systems typically use hardware controllers that do support hot swapping.
RAID Level 5 (Stripe with parity)
RAID 5 is by far the most common RAID configuration for business servers and enterprise NAS devices.
This RAID level provides better performance than mirroring as well as fault tolerance. With RAID 5, data and parity (which is additional data used for recovery) are striped across three or more disks.

If a disk gets an error or starts to fail, data is recreated from this distributed data and parity block, seamlessly and automatically.

Essentially, the system is still operational even when one disk kicks the bucket and until you can replace the failed drive. Another benefit of RAID 5 is that it allows many NAS and server drives to be "hot-swappable," meaning that if a drive in the array fails, it can be swapped with a new drive without shutting down the server or NAS and without interrupting users who may be accessing it. It's a great solution for fault tolerance because as drives fail (and they eventually will), the data can be rebuilt to new disks as failing disks are replaced. The downside to RAID 5 is the performance hit to servers that perform a lot of write operations.
  • Minimum 3 disks.
  • Good performance ( as blocks are striped ).
  • Good redundancy ( distributed parity ).
  • Most cost-effective option providing both performance and redundancy. Use this for databases that are heavily read-oriented. Write operations will be slow.
RAID5 fits as large, reliable, relatively cheap storage.
RAID5 writes data blocks evenly to all the disks, in a pattern similar to RAID0. However, one additional "parity" block is written in each row. This additional parity, derived from all the data blocks in the row, provides redundancy. If one of the drives fails and thus one block in the row is unreadable, the contents of this block can be reconstructed using parity data together with all the remaining data blocks.
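The reconstruction step can be shown in a few lines. The sketch below assumes the parity block is the byte-wise XOR of the data blocks in the row, which is the usual choice for RAID 5; the helper name `xor_blocks` and the sample data are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Return the byte-wise XOR of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One row of a four-drive array: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# The drive holding data[1] fails. XOR-ing the surviving data blocks with
# the parity block gives back the missing block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Because XOR is its own inverse, the same function serves both to compute parity and to rebuild a missing block.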
If all drives are OK, read requests are distributed evenly across drives, providing read speed similar to that of RAID0. For N disks in the array, RAID0 provides N times faster reads and RAID5 provides (N-1) times faster reads. If one of the drives has failed, the read speed degrades to that of a single drive, because all blocks in a row are required to serve the request.
Write speed of a RAID5 is limited by the parity updates. For each written block, its corresponding parity block has to be read, updated, and then written back. Thus, there is no significant write speed improvement on RAID5, if any at all.
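The parity update can be sketched as follows, again assuming XOR parity. In this common "read-modify-write" scheme the controller reads the old data block and the old parity block, then writes the new data block and the new parity block, so one logical write costs two reads and two writes. The function names are hypothetical.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """New parity after one data block changes (assumes XOR parity)."""
    # Remove the old block's contribution, then add the new block's.
    return xor_bytes(xor_bytes(old_parity, old_data), new_data)


row = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = xor_bytes(xor_bytes(row[0], row[1]), row[2])

new_block = b"\xaa\xbb"
new_parity = updated_parity(parity, row[1], new_block)
row[1] = new_block

# The shortcut gives the same result as recomputing parity over the whole row.
full_recompute = xor_bytes(xor_bytes(row[0], row[1]), row[2])
print(new_parity == full_recompute)
```

The shortcut avoids reading every other drive in the row, but the extra read and write per update is still what holds RAID 5 write speed back.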
The capacity of one member drive is used to maintain fault tolerance. E.g. if you have 10 drives 1TB each, the resulting RAID5 capacity would be 9TB.
If the RAID5 controller fails, you can still recover data from the array with RAID 5 recovery software. Unlike RAID0, RAID5 is redundant and can survive one member disk failure.
http://www.prepressure.com/library/technology/raid
  • If a drive fails, you still have access to all data, even while the failed drive is being replaced and the storage controller rebuilds the data on the new drive.
  • This is complex technology. If one of the disks in an array using 4TB disks fails and is replaced, restoring the data (the rebuild time) may take a day or longer, depending on the load on the array and the speed of the controller. If another disk goes bad during that time, data are lost forever.
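A rough feel for the rebuild time comes from dividing the drive size by the sustained rebuild rate. The sketch below is only back-of-the-envelope arithmetic: the 50 MB/s figure is an assumed rate for an array that is still serving users, not a measured value, and `rebuild_hours` is a hypothetical helper.

```python
def rebuild_hours(drive_tb: float, rebuild_mb_per_s: float) -> float:
    """Estimate hours needed to rewrite one whole drive at a steady rate."""
    if drive_tb <= 0 or rebuild_mb_per_s <= 0:
        raise ValueError("drive size and rebuild rate must be positive")
    drive_bytes = drive_tb * 10**12            # decimal terabytes
    bytes_per_second = rebuild_mb_per_s * 10**6
    return drive_bytes / bytes_per_second / 3600


# Assumed rate: 50 MB/s left over for the rebuild on a busy array.
hours = rebuild_hours(4, 50)
print(f"{hours:.1f} hours")
```

At that assumed rate a 4TB drive takes about 22 hours, in line with the "day or longer" quoted above. A lighter load or a faster controller shortens it proportionally.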
RAID Level 6 (Stripe with dual parity)
RAID6 is large, highly reliable, relatively expensive storage.
RAID6 uses a block pattern similar to RAID5, but utilizes two different parity functions to derive two different parity blocks per row. If one of the drives fails, its contents are reconstructed using one set of parity data. If another drive fails before the array is recovered, the contents of the two missing drives are reconstructed by combining the remaining data and two sets of parity.
Read speed of the N-disk RAID6 is (N-2) times faster than the speed of a single drive, similar to RAID levels 0 and 5. If one or two drives fail in RAID6, the read speed degrades significantly because a reconstruction of missing blocks requires an entire row to be read.
There is no significant write speed improvement in RAID6 layout. RAID6 parity updates require even more processing than that in RAID5.
The capacity of two member drives is used to maintain fault tolerance. For an array of 10 drives 1TB each, the resulting RAID6 capacity would be 8TB.
The recovery of the RAID6 from a controller failure is fairly complicated.

If a drive in a RAID 5 system dies and is replaced by a new drive, it takes hours to rebuild the swapped drive. If another drive dies during that time, you lose all of your data. With RAID 6, the array will even survive that second failure.

RAID 5 is a good all-round system that combines efficient storage with excellent security and decent performance. It is ideal for file and application servers that have a limited number of data drives.

RAID 6 is a good all-round system that combines efficient storage with excellent security and decent performance. It is preferable over RAID 5 in file and application servers that use many large drives for data storage.
Facebook uses RAID 6 for its photo system.

RAID Level 10 (Stripe of mirrors) combining RAID 1 & RAID 0
RAID10 is large, fast, reliable, but expensive storage.
RAID10 stripes data across mirrored pairs of drives, so the array as a whole holds two identical copies of the content.
Read speed of the N-drive RAID10 array is N times faster than that of a single drive. Each drive can read its block of data independently, same as in RAID0 of N disks.

Writes are two times slower than reads, because both copies have to be updated. As far as writes are concerned, RAID10 of N disks is the same as RAID0 of N/2 disks.
Half the array capacity is used to maintain fault tolerance. In RAID10, the overhead grows with the number of disks (it is always half of them), contrary to RAID levels 5 and 6, where the overhead stays at one or two drives' worth of capacity for any number of disks. This makes RAID10 the most expensive RAID type when scaled to large capacity.
If there is a controller failure in a RAID10, any subset of the drives forming a complete RAID0 can be recovered in the same way the RAID0 is recovered.
Similarly to RAID 5, several variations of the layout are possible in implementation.

RAID 10 combines the mirroring of RAID 1 with the striping of RAID 0. It's the RAID level that gives the best performance, but it is also costly: every drive needs a mirror partner, so it takes twice as many disks as a plain stripe of the same capacity, with a minimum of four. This is the RAID level ideal for highly utilized database servers or any server that's performing many write operations.
  • If something goes wrong with one of the disks in a RAID 10 configuration, the rebuild time is very fast since all that is needed is copying all the data from the surviving mirror to a new drive. This can take as little as 30 minutes for drives of 1 TB.
  • Half of the storage capacity goes to mirroring, so compared to large RAID 5 or RAID 6 arrays, this is an expensive way to have redundancy.
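The capacity figures quoted for each level can be cross-checked with a small calculation. This is a sketch assuming equal-sized drives; the function name `usable_capacity_tb` is hypothetical, and the four-disk minimum for RAID 6 is the commonly cited figure rather than something stated above.

```python
def usable_capacity_tb(level: str, num_disks: int, disk_tb: float) -> float:
    """Usable capacity of an array of equal-sized drives, in TB."""
    # Minimum disk counts; the RAID 6 value is an assumption (see lead-in).
    minimum = {"0": 2, "1": 2, "5": 3, "6": 4, "10": 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_disks < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} disks")
    if level == "10" and num_disks % 2:
        raise ValueError("RAID 10 needs an even number of disks")

    if level == "0":
        return num_disks * disk_tb        # no redundancy at all
    if level == "1":
        return disk_tb                    # every disk holds the same data
    if level == "5":
        return (num_disks - 1) * disk_tb  # one drive's worth of parity
    if level == "6":
        return (num_disks - 2) * disk_tb  # two drives' worth of parity
    return num_disks / 2 * disk_tb        # RAID 10: half goes to mirrors


for level, disks in [("1", 2), ("5", 10), ("6", 10), ("10", 10)]:
    print(f"RAID {level}, {disks} x 1TB: {usable_capacity_tb(level, disks, 1.0)} TB")
```

For ten 1TB drives this gives 9TB for RAID 5 and 8TB for RAID 6, matching the examples earlier, against 5TB for RAID 10.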

For most home users, RAID 5 may be overkill, but RAID 1 mirroring provides decent fault tolerance.

Other RAID Levels
There are other RAID levels: 2, 3, 4, 7, 0+1... but they are really variants of the main RAID configurations already mentioned, and they're used for specific cases. Here are some short descriptions of each:
  • RAID 2 is similar to RAID 5, but instead of disk striping using parity, striping occurs at the bit level. RAID 2 is seldom deployed because costs to implement are usually prohibitive (a typical setup requires 10 disks) and it gives poor performance with some disk I/O operations.
  • RAID 3 is also similar to RAID 5, except this solution stripes at the byte level and requires a dedicated parity drive. RAID 3 is seldom used except in the most specialized database or processing environments, which can benefit from it.
  • RAID 4 is a configuration in which disk striping happens at the block level, rather than at the byte level as in RAID 3. Like RAID 3, it uses a dedicated parity drive.
  • RAID 7 is a proprietary level of RAID owned by the now-defunct Storage Computer Corporation.
  • RAID 0+1 is often interchanged for RAID 10 (which is RAID 1+0), but the two are not the same. RAID 0+1 is a mirrored array with segments that are RAID 0 arrays. It's implemented in specific infrastructures requiring high performance but not a high level of scalability.

RAID is no substitute for back-up!

Read full article from RAID - Wikipedia, the free encyclopedia
