General Knowledge of RAID

1. What is RAID?
RAID is an acronym for Redundant Array of Inexpensive Disks (as opposed to SLED – Single Large Expensive Disk). Today, most drives are relatively inexpensive, and the ‘I’ is increasingly taken to mean ‘Independent’. The purpose of RAID is to use 2 or more drives together in order to obtain increased performance and/or data security.

2. What types of RAID exist, and how do they differ?
The different types of RAID are typically referred to as ‘levels’. This FAQ will focus on levels 0, 1 and 0+1, since these are the levels most often supported by embedded RAID controllers.

Level 0: Striping
Level 0 provides increased performance by writing alternating blocks of data to 2 or more drives simultaneously (the size of these blocks is referred to as the stripe size). Read performance is also improved, since data is read from all drives at the same time. No redundant information is stored, and the failure of a SINGLE drive will cause all data to be lost. The number of drives in a level 0 array is sometimes also referred to as the stripe width.
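The block mapping described above can be sketched in a few lines of Python. The stripe width and block numbers are hypothetical example values, not tied to any particular controller:

```python
# Sketch: how a level 0 (striping) array maps logical blocks to drives.
STRIPE_WIDTH = 2  # hypothetical number of drives in the array

def block_location(logical_block: int) -> tuple:
    """Return (drive_index, block_on_drive) for a logical block number."""
    drive = logical_block % STRIPE_WIDTH    # blocks alternate across drives
    offset = logical_block // STRIPE_WIDTH  # position within that drive
    return drive, offset

# Logical blocks 0..5 on a 2-drive array alternate between the drives:
layout = [block_location(b) for b in range(6)]
print(layout)  # [(0, 0), (1, 0), (0, 1), (1, 1), (0, 2), (1, 2)]
```

Because consecutive blocks land on different drives, a large read or write keeps all drives busy at once, which is where the performance gain comes from.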

Disadvantages: Not a “true” RAID because it is NOT fault-tolerant; The failure of just one drive will result in all data in the array being lost; Should never be used in mission-critical environments.

Level 1: Mirroring
Level 1 provides redundancy by writing all data to 2 or more drives. Level 1 provides no increase in write performance (it may even be a bit slower). Read performance tends to be faster than a single drive, but not as fast as level 0. Level 1 provides excellent data security, since ALL drives have to fail before any data is lost.

Disadvantages: Highest disk overhead of all RAID types (100%) – inefficient; Typically the RAID function is done by system software, loading the CPU/server and possibly degrading throughput at high activity levels – hardware implementation is strongly recommended; May not support hot swap of a failed disk when implemented in “software”

Level 2: Bit interleaving with Hamming
Data is striped across multiple disks at the BIT level. Dedicated drives are used for Hamming error correction. Hamming error correction is a forward error correction code capable of correcting any single bit error or detecting any double bit error within the code word.
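The single-bit correction property mentioned above can be demonstrated with the classic Hamming(7,4) code, which encodes 4 data bits into 7 bits. This is a minimal sketch of the coding idea, not of an actual RAID 2 controller:

```python
# Sketch: Hamming(7,4) single-bit error correction, the kind of code RAID 2
# relies on. Positions 1,2,4 hold parity; positions 3,5,6,7 hold data.
from functools import reduce

def encode(d):
    """Encode 4 data bits into a 7-bit Hamming codeword."""
    bits = {3: d[0], 5: d[1], 6: d[2], 7: d[3]}
    for p in (1, 2, 4):  # each parity bit covers positions containing bit p
        bits[p] = reduce(lambda a, b: a ^ b,
                         (v for pos, v in bits.items() if pos & p), 0)
    return [bits[i] for i in range(1, 8)]

def correct(word):
    """Fix at most one flipped bit; the syndrome names the bad position."""
    syndrome = reduce(lambda a, b: a ^ b,
                      (i + 1 for i, v in enumerate(word) if v), 0)
    if syndrome:
        word = word.copy()
        word[syndrome - 1] ^= 1
    return word

cw = encode([1, 0, 1, 1])
bad = cw.copy()
bad[4] ^= 1                  # flip one bit (as if one drive returned bad data)
assert correct(bad) == cw    # the single-bit error is corrected
```

With 4 data bits needing 3 parity bits, the overhead ratio is clearly poor at small word sizes, which is exactly the inefficiency listed below.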

Disadvantages: Very high ratio of ECC disks to data disks with smaller word sizes – inefficient; Entry-level cost very high – requires a very high transfer rate to justify; Transaction rate is equal to that of a single disk at best (with spindle synchronization); No commercial implementations exist / not commercially viable.

Level 3: Striping with parity
Data is striped across 2 or more disks and parity is written to a dedicated drive. Level 3 is typically implemented at the BYTE level.

Disadvantages: Transaction rate equal to that of a single disk drive at best (if spindles are synchronized); Controller design is fairly complex; Very difficult and resource intensive to do as a “software” RAID.

Level 4: Striping with parity
Data is striped across 2 or more disks and parity is written to a dedicated drive. Level 4 is typically implemented at the block (stripe size) level.

Disadvantages: Quite complex controller design; Worst write transaction rate and write aggregate transfer rate; Difficult and inefficient data rebuild in the event of disk failure; Block read transfer rate equal to that of a single disk.

Level 5: Striping with distributed parity
Data and parity are striped across 3 or more drives, with the parity distributed across all drives. Level 5 is the most widely used RAID level for servers and other high-performance storage solutions. Any single drive can fail without data loss, i.e. at least two drives must fail before any data is lost.

Disadvantages: Disk failure has a medium impact on throughput; Most complex controller design; Difficult to rebuild in the event of a disk failure (as compared to RAID level 1); Individual block data transfer rate same as single disk.
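The parity scheme underlying levels 3, 4 and 5 is plain XOR, which is what makes rebuilds possible. The drive contents below are hypothetical byte strings; real controllers operate on whole blocks or stripes:

```python
# Sketch: XOR parity as used by RAID levels 3-5.
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks byte by byte."""
    return bytes(x ^ y for x, y in zip(a, b))

drive0 = b"\x0f\xa0"                  # hypothetical data on drive 0
drive1 = b"\x33\x55"                  # hypothetical data on drive 1
parity = xor_blocks(drive0, drive1)   # stored on the parity drive/blocks

# If drive1 fails, its contents are rebuilt from the survivor and the parity:
rebuilt = xor_blocks(drive0, parity)
assert rebuilt == drive1
```

Rebuilding a failed drive therefore means reading every surviving drive for every stripe, which is why the rebuild is slow and why throughput drops while a drive is missing.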

Level 6: Striping with dual distributed parity.
Essentially the same as level 5, but two sets of parity are calculated in order to improve data security.

Disadvantages: More complex controller design; Controller overhead to compute parity addresses is extremely high; Write performance can only be brought on par with RAID level 5 by using a custom ASIC for computing Reed-Solomon parity; Requires N+2 drives to implement because of the dual parity scheme.

Level X+Y
It is possible to combine various RAID levels to optimise data security and/or performance, e.g. levels 0+1 and 1+0 as explained below.

Level 0+1: Striping and Mirroring
Level 0+1 combines level 0 and level 1 by mirroring a striped volume. Level 0+1 provides read and write performance very close (or equal) to level 0. Level 0+1 should not be confused with level 1+0: with a single mirror set, a single drive failure disables one entire stripe set and causes the whole array to become, in essence, a level 0 array. Level 0+1 requires an even number of drives, with a minimum of 4.

Disadvantages: RAID 0+1 is NOT to be confused with RAID 10 – a single drive failure will cause the whole array to become, in essence, a RAID level 0 array; Very expensive / high overhead; All drives must move in parallel to the proper track, lowering sustained performance; Very limited scalability at very high inherent cost.

Level 1+0: Mirroring and Striping
Level 1+0 (sometimes referred to as level 10) combines level 0 and level 1 by striping a mirrored volume. Level 1+0 has better data security than level 0+1, because the level 1+0 controller can take advantage of a partial mirror set, whereas the level 0+1 controller cannot take advantage of a partial stripe set.

Disadvantages: Very expensive / high overhead; All drives must move in parallel to the proper track, lowering sustained performance; Very limited scalability at very high inherent cost
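The difference in fault tolerance between 1+0 and 0+1 can be made concrete by enumerating all two-drive failures in a hypothetical 4-drive array (drive numbering and grouping below are illustrative assumptions):

```python
# Sketch: count which two-drive failures lose data in a 4-drive array.
from itertools import combinations

drives = [0, 1, 2, 3]

def raid10_loses(failed):
    """RAID 1+0: mirrors (0,1) and (2,3), striped together.
    Data is lost only when BOTH halves of one mirror fail."""
    return {0, 1} <= set(failed) or {2, 3} <= set(failed)

def raid01_loses(failed):
    """RAID 0+1: stripes (0,1) and (2,3), mirrored.
    Data survives only if at least one whole stripe set is intact."""
    return not ({0, 1}.isdisjoint(failed) or {2, 3}.isdisjoint(failed))

pairs = list(combinations(drives, 2))
print(sum(raid10_loses(p) for p in pairs), "of", len(pairs))  # 2 of 6
print(sum(raid01_loses(p) for p in pairs), "of", len(pairs))  # 4 of 6
```

Of the 6 possible two-drive failures, RAID 1+0 loses data in only 2 cases, while RAID 0+1 loses data in 4 – the partial-mirror-set advantage described above.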

JBOD: Just a Bunch Of Drives
Not actually RAID, but some RAID controllers support it. In JBOD, 2 or more drives, which can be of any size, are put together so they appear as a single drive whose capacity is the sum of the individual drives. Since JBOD provides no performance increase and reduces data security, it is seldom used.

3. Can I use different sized/typed disks for my array?
Yes, but for all levels (except JBOD) you will lose some capacity on the larger drives.
For level 0, total capacity is equal to the stripe width times the smallest drive.
For level 1, total capacity is equal to the smallest drive.
For level 0+1, total capacity is equal to the stripe width times the smallest drive.
For level 5, total capacity is equal to the number of drives “minus 1” times the smallest drive.
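The capacity rules above can be written out directly. The drive sizes are hypothetical example values in GB:

```python
# Capacity rules from the answer above, for hypothetical mixed-size drives.
drives = [120, 160, 200]        # sizes in GB
smallest, n = min(drives), len(drives)

raid0 = n * smallest            # stripe width x smallest drive
raid1 = smallest                # smallest drive only
raid5 = (n - 1) * smallest      # (number of drives - 1) x smallest drive
jbod = sum(drives)              # no capacity lost, drives are concatenated

print(raid0, raid1, raid5, jbod)  # 360 120 240 480
```

Note that the 160 GB and 200 GB drives contribute only 120 GB each to every level except JBOD, which is the lost capacity the answer refers to.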

4. Can I change my array after I have put data on it?
The stripe size or stripe width of a level 0 or level 0+1 array cannot be changed without rebuilding the array, which will cause all data to be lost. For level 1 and level 0+1, additional mirror drives can be added to provide additional data security; this will not cause any data to be lost.

5. Software RAID vs. hardware RAID. Which is better?
For the most part, definitely hardware RAID. Software RAID does have a few advantages, but it is beyond the scope of this FAQ to discuss them further.

6. I want to setup a level 0 RAID. Which stripe and cluster size should I use?
It depends on what the array is going to be used for. In general, if the array is used for very large files (video streaming etc.) a larger stripe size is better. For mainstream usage (office, gaming etc.) a stripe and cluster size in the 8-32 kB range is a common choice. To some extent, the optimum stripe and cluster size combination also depends on the RAID controller and drives.

7. How do I setup/partition a level 0 RAID array, and install my OS on it?
The easy way:
1) Attach the drives to the RAID controller. Each drive should be master on its own channel (separate cable) for maximum performance.
2) Enter the RAID controller BIOS (usually by pressing CTRL+H after powering on the PC). Set up the RAID 0 array with your preferred stripe size. The exact way of doing this depends on the controller. Note: Some controllers (e.g. the Promise-lite) do not allow you to change the stripe size.
3) Make sure you have a floppy with the RAID drivers. Boot from the OS installation CD, and when prompted press ‘F6’ to install third party RAID or SCSI drivers. Insert the floppy.
4) Using the installation program partition and format the drive.
5) Proceed with installing the OS on the boot partition.

The problem with the above method is that you cannot specify the wanted cluster size when formatting (for NTFS the default cluster size is 4 kB). If you choose to use NTFS, it is not possible to change the cluster size without reformatting the drive. For FAT32, the cluster size can be changed at a later time with programs like Partition Magic.

If you want to use NTFS, or do some benchmarks with different stripe and cluster size combinations the recommended method requires a third temporary drive:

1) Attach the drives to the RAID controller. Each drive should be master on its own channel (separate cable) for maximum performance.
2) Attach the temporary drive to the normal IDE controller.
3) Enter the RAID controller BIOS. Set up the RAID 0 array with your preferred stripe size.
4) Install the OS on the temporary drive.
5) Boot on the temporary drive. When the OS is up and running, install the RAID drivers.
6) Partition and format the RAID array with the preferred cluster size. In Windows XP, Disk Management provides the means to partition drives and format them with a custom cluster size.
7) Optionally perform benchmarks on the array. Reformat the drive with a different cluster size or rebuild the array with a different stripe size. When the array is partitioned and formatted, the temporary drive can be removed.
8) Make sure you have a floppy with the RAID drivers. Boot from the OS installation CD, and when prompted press ‘F6’ to install third party RAID or SCSI drivers. Insert the floppy.
9) Install the OS on the boot partition of the RAID array. Make sure you do not format the array during installation, since this will reset the cluster size to the default value.

8. For more details about RAID, please refer to the “RAID Array & Server Glossary” below: