comp.org.uk

RAID - Redundant Array of Independent Disks

Redundant array of independent disks (RAID) is a storage technology that combines multiple hard disk drives into a single logical unit so that data can be distributed across the drives for improved performance, increased fault tolerance, or both, depending on the RAID level. How the data is distributed depends on the redundancy and performance requirements of the application, service, or dataset being delivered. The basic idea behind RAID is to combine multiple inexpensive disk drives into an array that appears to the server as one large logical storage unit.

There are two different options available when implementing RAID:

Software RAID is implemented on a server by using software that groups multiple logical disks into a single virtual disk. Most modern operating systems have built-in software that allows for the configuration of a software-based RAID array.

Hardware RAID controllers are physical cards that are added to a server to off-load most of the overhead of RAID from the host CPU; they allow an administrator to boot straight into the controller's setup utility to configure the RAID levels. Hardware RAID is common in servers due to its tighter integration with the device and better error handling.

There are several RAID levels, and each has its own particular use. Choosing the correct RAID level for the application's workload is critical to its performance. These are the most common RAID levels in use today:

RAID 1

When drives are configured using RAID 1, they are said to be configured in a mirrored set. It is called “mirrored” because the data is exactly the same on both disks: the RAID controller or software creates a mirror image of disk 1 on disk 2 in the set. As you might expect, RAID 1 requires a minimum of two disks. Read requests sent to the volume can be serviced by either disk 1 or disk 2, and write requests always update both disks. Each disk in a RAID 1 configuration contains a complete, identical copy of the data and can be accessed independently. RAID 1 provides its data protection without parity: nothing is calculated from the data; it is simply duplicated on each disk in the set.

RAID 1 is a particularly useful configuration when read performance and reliability are more important than storage capacity. A RAID 1 array can only be as big as its smallest disk. While RAID 1 can protect against the failure of a single hard drive, it does not protect against data corruption or file deletions, since any change is instantly mirrored to every drive in the array. Nor does it protect against a disk controller failure. An organization should therefore still implement a proper backup strategy to complement the data protection provided by the RAID 1 array.
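
The behaviour described above can be illustrated with a toy model. This is only a sketch, not how any real controller works: the `Mirror` class, its method names, and the block-count "sizes" are hypothetical, chosen to show that capacity follows the smallest disk, that reads survive a single disk failure, and that a deletion is mirrored just like any other change.

```python
class Mirror:
    """Toy RAID 1 set (hypothetical model): every write goes to all disks."""

    def __init__(self, disk_sizes):
        # Usable capacity is limited by the smallest member disk.
        self.capacity = min(disk_sizes)
        self.disks = [dict() for _ in disk_sizes]

    def write(self, block_no, data):
        if not 0 <= block_no < self.capacity:
            raise IndexError("block outside mirror capacity")
        for disk in self.disks:
            disk[block_no] = data

    def delete(self, block_no):
        # A deletion is mirrored too, which is why RAID 1 is not a backup.
        for disk in self.disks:
            disk.pop(block_no, None)

    def read(self, block_no, failed=()):
        # Any surviving member can service the read.
        for index, disk in enumerate(self.disks):
            if index not in failed:
                return disk.get(block_no)
        raise IOError("all mirror members have failed")


mirror = Mirror([500, 1000])          # sizes in blocks, assumed for the example
mirror.write(0, b"payroll")
print(mirror.capacity)                # 500
print(mirror.read(0, failed={0}))     # b'payroll'
mirror.delete(0)
print(mirror.read(0, failed={0}))     # None
```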

RAID 0

RAID 0 is a configuration that provides increased performance but has no redundancy built into it. This configuration requires a minimum of two disks. It splits the data into blocks and “stripes” those blocks across all the drives in the array, which increases performance by using multiple physical spindles instead of just one. If any of the drives fails, however, the entire array is irreparably damaged. RAID 0 offers a low cost of implementation and is typically used for noncritical data that is regularly backed up and requires high write speed.
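
As a rough illustration of striping, the sketch below splits a byte string into fixed-size blocks and deals them out to the disks in round-robin order. The function names `stripe` and `unstripe` and the tiny block size are assumptions made for the example; real implementations use much larger stripe units and work at the device level.

```python
def stripe(data: bytes, n_disks: int, block_size: int):
    """Split data into blocks and distribute them round-robin over the disks."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    disks = [[] for _ in range(n_disks)]
    for index, block in enumerate(blocks):
        disks[index % n_disks].append(block)
    return disks


def unstripe(disks):
    """Reassemble the data; a missing disk (None) makes this impossible."""
    if any(disk is None for disk in disks):
        raise IOError("RAID 0 cannot be read with a missing disk")
    out = []
    for row in range(max(len(disk) for disk in disks)):
        for disk in disks:
            if row < len(disk):
                out.append(disk[row])
    return b"".join(out)


disks = stripe(b"ABCDEFGHIJ", n_disks=2, block_size=2)
print(disks)            # [[b'AB', b'EF', b'IJ'], [b'CD', b'GH']]
print(unstripe(disks))  # b'ABCDEFGHIJ'
```

Because each disk holds only every second block, losing either one leaves gaps throughout the data, which is why a single failure destroys the whole array.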

RAID 1+0

RAID 1+0 consists of a top-level RAID 0 array that is in turn composed of two or more RAID 1 arrays. It incorporates both the performance advantages of RAID 0 and the data protection advantages of RAID 1. Although its official designation is RAID 1+0, it is often referred to as RAID 10. If a single drive fails in a RAID 10 array, the affected lower-level mirror enters a degraded mode while the top-level stripe continues to perform as normal, because every mirror it stripes across is still available.

The drawback to RAID 10 is that it cuts your usable storage in half since everything is mirrored. It is also a very expensive configuration to implement. RAID 10 could be used if an application requires both high performance and reliability and the organization is willing to sacrifice capacity to get it. Some examples where this configuration might make sense are enterprise servers, database servers, and high-end application servers.
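
The failure behaviour and the capacity cost of RAID 10 can be sketched in a few lines. The disk names, the `raid10_survives` and `raid10_usable` helpers, and the assumption of two-way mirrors built from equal-sized disks are all hypothetical, chosen only to mirror the description above.

```python
def raid10_survives(mirrors, failed):
    """The stripe needs every mirror, and a mirror needs one live disk."""
    return all(any(disk not in failed for disk in mirror) for mirror in mirrors)


def raid10_usable(mirrors, disk_size):
    """Each mirror contributes one disk's worth of capacity to the stripe."""
    return len(mirrors) * disk_size


# Four disks arranged as two mirrored pairs (names are made up).
mirrors = [("a1", "a2"), ("b1", "b2")]

print(raid10_survives(mirrors, {"a1"}))        # True: mirror a is degraded
print(raid10_survives(mirrors, {"a1", "b2"}))  # True: one loss per mirror
print(raid10_survives(mirrors, {"a1", "a2"}))  # False: mirror a is gone
print(raid10_usable(mirrors, 1000))            # 2000 of 4000 raw
```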

RAID 0+1

RAID 0+1 arrays are made up of a top-level RAID 1 mirror containing two or more RAID 0 stripe sets. This configuration is similar to RAID 10, as it provides both the advantages of RAID 0 and RAID 1. A single drive failure in RAID 0+1 results in one of the lower-level stripes completely failing since RAID 0 is not a fault-tolerant configuration. However, the top-level mirror continues to operate as normal, so there is no interruption to data access. In the case of this type of failure, the drive must be replaced and the stripe set has to be rebuilt as an empty stripe set, after which the mirror is rebuilt on the empty stripe set; therefore, it has a longer recovery period than RAID 10.

As with RAID 10, the RAID 0+1 configuration is recommended for applications that require both high performance and reliability and where capacity can be sacrificed.
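
To make the failure behaviour of RAID 0+1 concrete, the sketch below enumerates every possible two-disk failure in a four-disk array. The disk names and the `raid01_survives` helper are hypothetical; the model simply assumes that a stripe set is lost as soon as any of its disks fails, and that the mirror needs at least one intact stripe set.

```python
from itertools import combinations


def raid01_survives(stripe_sets, failed):
    """The top-level mirror needs at least one stripe set with no failed disk."""
    return any(all(disk not in failed for disk in stripe_set)
               for stripe_set in stripe_sets)


# Four disks arranged as two stripe sets that mirror each other.
stripe_sets = [("a1", "a2"), ("b1", "b2")]
all_disks = [disk for stripe_set in stripe_sets for disk in stripe_set]

pairs = list(combinations(all_disks, 2))
survivable = [pair for pair in pairs if raid01_survives(stripe_sets, set(pair))]

print(raid01_survives(stripe_sets, {"a1"}))  # True: stripe set b still works
print(len(pairs), len(survivable))           # 6 2
```

In this model only two of the six double failures are survivable (both disks in the same stripe set). The same four disks arranged as RAID 10 would survive four of the six, which is one reason RAID 10 is usually preferred.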

RAID 5

RAID 5 is one of the most commonly used RAID implementations, as it provides a good balance of data protection, performance, and cost-effectiveness. A RAID 5 array uses block-level striping for a performance enhancement with distributed parity for data protection. A RAID 5 array distributes parity and the data across all drives and requires that all drives but one be present in order to operate. This means that a RAID 5 array is not destroyed by a single drive failure, regardless of which drive is lost. When a drive fails, the RAID 5 array is still accessible to read and write data. After the failed drive has been replaced, the array enters into data recovery mode, which means that the parity data in the array is used to rebuild the missing data from the failed drive back onto the new hard drive.
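
The rebuild step relies on XOR parity: the parity block of a stripe is the XOR of its data blocks, so any one missing block equals the XOR of everything that survives. The sketch below shows this for a single stripe. The `xor_blocks` helper and the sample data are made up for illustration, and the sketch ignores how real arrays rotate the parity block from disk to disk between stripes.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical four-disk array: three data blocks plus parity.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Simulate losing the disk that held the second data block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data_blocks[1])  # True
```

The same calculation serves degraded reads: while the failed drive is still missing, its blocks can be computed on the fly from the remaining drives, at the cost of reading all of them.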

RAID 5 uses the equivalent of one hard disk to store the parity, which means you “lose” the storage space equivalent to one of the drives in the array. A RAID 5 array requires a minimum of three disks and offers a good combination of performance, fault tolerance, capacity, and storage efficiency at a low cost. Because every write also has to update parity, RAID 5 is best suited to read-heavy workloads, for example a database application that is mostly read. It is also well suited to storing large files where data is read sequentially.

RAID 6

RAID 6 can be viewed essentially as an extension of RAID 5, as it uses the same striping and parity block distribution across all the drives in the array. The difference is that RAID 6 adds a second parity block to each stripe, so it uses block-level striping with two parity blocks distributed across all the disks. This second parity block allows the array to tolerate the loss of two hard disks instead of the single failure that RAID 5 can tolerate. RAID 6 causes no performance hit on read operations but has lower write performance due to the overhead of the additional parity calculations.

RAID 6 is ideal for supporting applications where additional fault tolerance that is not achievable with RAID 5 is required. The additional fault tolerance supplied by the second parity block in RAID 6 makes it a good candidate for deployment in environments where IT support is not readily available or spare parts may take a significant amount of time to be delivered on-site.

Table summarising the different RAID levels

| Level | Description | Min. no. of disks | Fault tolerance | Storage efficiency |
|---|---|---|---|---|
| RAID 1 | Blocks are mirrored. No striping or parity | 2 | 1 drive | 50% with two disks |
| RAID 0 | Blocks are striped. No mirror or parity | 2 | None | 100% |
| RAID 1+0 | Blocks are mirrored and striped | 4 | 1 drive per mirror (up to 2 with 4 disks) | 50% |
| RAID 0+1 | Blocks are striped and the stripe set is mirrored | 4 | 1 drive; further failures only within the same stripe set | 50% |
| RAID 5 | Blocks are striped. Distributed parity | 3 | 1 drive | (n - 1)/n for n drives |
| RAID 6 | Blocks are striped with double distributed parity | 4 | 2 drives | (n - 2)/n for n drives |
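
The storage-efficiency figures above can be reproduced with a small calculator. This is a sketch under simplifying assumptions: all disks are the same size, RAID 1 keeps a full copy on every disk, and the nested levels use two-way mirrors (hence an even disk count). The function name `usable_disks` and the level labels are hypothetical.

```python
def usable_disks(level, n):
    """Return how many disks' worth of capacity is usable with n equal disks."""
    minimum = {"0": 2, "1": 2, "1+0": 4, "0+1": 4, "5": 3, "6": 4}
    if level not in minimum:
        raise ValueError(f"unknown RAID level: {level}")
    if n < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} disks")
    if level == "0":
        return n            # no redundancy, all capacity usable
    if level == "1":
        return 1            # every disk holds the same copy
    if level in ("1+0", "0+1"):
        if n % 2:
            raise ValueError("two-way mirroring needs an even number of disks")
        return n // 2       # everything is stored twice
    if level == "5":
        return n - 1        # one disk's worth of parity
    return n - 2            # RAID 6: two disks' worth of parity


for level, n in [("1", 2), ("0", 2), ("1+0", 4), ("0+1", 4), ("5", 3), ("6", 4)]:
    usable = usable_disks(level, n)
    print(f"RAID {level}: {usable} of {n} disks usable ({usable / n:.0%})")
```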

Published on Sun 14 October 2012 by Manny Larson in Computer Science with tag(s): raid