General Knowledge of RAID

1. What is RAID?
RAID is an acronym for Redundant Array of Inexpensive Disks (as opposed to SLED – Single Large Expensive Disk). Today, most drives are relatively inexpensive and the meaning of the ‘i’ is changing into ‘independent’. The purpose of RAID is to use 2 or more drives together in order to obtain increased performance and/or data security.

2. What types of RAID exist, and how do they differ?
The different types of RAID are typically referred to as ‘levels’. This FAQ focuses on levels 0, 1 and 0+1, since these are the ones most often supported by embedded RAID controllers.

Level 0: Striping
Level 0 provides increased performance by writing alternating blocks of data (the size of each block is referred to as the stripe size) to 2 or more drives simultaneously. Read performance is also improved since data is read from all drives at the same time. No redundant information is stored, and the failure of a SINGLE drive will cause all data to be lost. The number of drives in a level 0 array is sometimes also referred to as the stripe width.

Disadvantages: Not a “True” RAID because it is NOT fault-tolerant; The failure of just one drive will result in all data in an array being lost; Should never be used in mission critical environments.
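To illustrate the striping described for level 0, here is a minimal Python sketch of how a logical block could be mapped to a drive. It assumes the simplest round-robin layout with one block per stripe unit; the function name locate_block is hypothetical, and real controllers may lay data out differently.

```python
def locate_block(logical_block, stripe_width):
    """Return (drive_index, block_on_drive) for a logical block.

    Assumes plain round-robin striping: block 0 goes to drive 0,
    block 1 to drive 1, and so on, wrapping around after the last drive.
    """
    if stripe_width < 2:
        raise ValueError("a level 0 array needs at least 2 drives")
    drive = logical_block % stripe_width
    block_on_drive = logical_block // stripe_width
    return drive, block_on_drive


# Show where the first six logical blocks land on a 2-drive array.
for block in range(6):
    print(block, locate_block(block, stripe_width=2))
```

Because consecutive blocks land on different drives, a large read or write keeps all drives busy at once, which is where the performance gain comes from.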

Level 1: Mirroring
Level 1 provides redundancy by writing all data to 2 or more drives. Level 1 provides no increase in write performance (it may even be a bit slower). Read performance tends to be faster than a single drive, but not as fast as level 0. Level 1 provides excellent data security since ALL drives have to fail before any data is lost.

Disadvantages: Highest disk overhead of all RAID types (100%) – inefficient; Typically the RAID function is done by system software, loading the CPU/Server and possibly degrading throughput at high activity levels. Hardware implementation is strongly recommended; May not support hot swap of a failed disk when implemented in “software”.

Level 2: Bit interleaving with Hamming
Data is striped across multiple disks at the BIT level. Dedicated drives are used for Hamming error correction. Hamming error correction is a forward error correction code capable of correcting any single bit error or detecting any double bit error within the code word.

Disadvantages: Very high ratio of ECC disks to data disks with smaller word sizes – inefficient; Entry level cost very high – requires a very high transfer rate requirement to justify; Transaction rate is equal to that of a single disk at best (with spindle synchronization); No commercial implementations exist / not commercially viable.
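As a rough illustration of the Hamming error correction mentioned for level 2, the Python sketch below uses the small Hamming(7,4) code: 4 data bits protected by 3 parity bits. The word size and the function names are assumptions chosen for illustration; an actual level 2 array could use a different code word length, with each bit of the word stored on its own drive.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit code word.

    Bit positions are numbered 1..7; parity bits sit at positions 1, 2, 4.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_correct(code_word):
    """Fix at most one flipped bit; return (data_bits, error_position).

    error_position is 0 when no error was detected, else 1..7.
    """
    c = list(code_word)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s4
    if error_position:
        c[error_position - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], error_position


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1  # simulate a single-bit error at position 5
print(hamming74_correct(word))
```

The three parity checks together spell out the position of the faulty bit, so a single-bit error can be located and flipped back without rereading the data.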

Level 3: Striping with parity
Data is striped across 2 or more disks and parity is written to a dedicated drive. Level 3 is typically implemented at the BYTE level.

Disadvantages: Transaction rate equal to that of a single disk drive at best (if spindles are synchronized); Controller design is fairly complex; Very difficult and resource intensive to do as a “software” RAID.

Level 4: Striping with parity
Data is striped across 2 or more disks and parity is written to a dedicated drive. Level 4 is typically implemented at the block (stripe size) level.

Disadvantages: Quite complex controller design; Worst Write transaction rate and Write aggregate transfer rate; Difficult and inefficient data rebuild in the event of disk failure; Block Read transfer rate equal to that of a single disk
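The parity used by levels 3 and 4 is normally a simple XOR across the data drives. The Python sketch below is a minimal illustration of that idea, using short byte strings as stand-ins for the blocks on each drive (the helper name xor_blocks and the sample data are made up for this example).

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR several equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives plus one dedicated parity drive.
data = [b"RAID", b"disk", b"demo"]
parity = xor_blocks(data)

# Simulate losing the second data drive, then rebuild it from
# the surviving data drives and the parity drive.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

XOR-ing the surviving blocks with the parity block gives back the missing block, which is why any single drive in the set can be lost without losing data.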

Level 5: Striping with distributed parity
Data and parity are striped across 3 or more drives. Parity is distributed to each drive. Level 5 is the most widely used RAID for servers and other high-performance storage solutions. Any single drive can fail without data loss, i.e. at least two drives must fail before any data is lost.

Disadvantages: Disk failure has a medium impact on throughput; Most complex controller design; Difficult to rebuild in the event of a disk failure (as compared to RAID level 1); Individual block data transfer rate same as single disk.
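To show what “distributed parity” means for level 5, the Python sketch below prints one possible layout in which the parity block moves to a different drive on every stripe. The rotation order used here is an assumption for illustration only; controllers differ in how they rotate parity, and the function names are hypothetical.

```python
def parity_drive(stripe, num_drives):
    """Drive holding the parity block for a given stripe.

    Assumed rotation: parity starts on the last drive and moves
    one drive to the left with each following stripe.
    """
    if num_drives < 3:
        raise ValueError("a level 5 array needs at least 3 drives")
    return (num_drives - 1 - stripe) % num_drives


def stripe_layout(stripe, num_drives):
    """Return labels such as ['D0', 'D1', 'P'] for one stripe."""
    p = parity_drive(stripe, num_drives)
    labels = []
    data_index = 0
    for drive in range(num_drives):
        if drive == p:
            labels.append("P")
        else:
            labels.append("D%d" % data_index)
            data_index += 1
    return labels


for stripe in range(4):
    print(stripe, stripe_layout(stripe, num_drives=3))
```

Since no single drive holds all the parity, parity updates are spread over the whole array instead of being concentrated on one dedicated drive as in levels 3 and 4.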

Level 6: Striping with dual distributed parity.
Essentially the same as level 5, but two sets of parity are calculated in order to improve data security.

Disadvantages: More complex controller design; Controller overhead to compute parity addresses is extremely high; Write performance can be brought on par with RAID Level 5 by using a custom ASIC for computing Reed-Solomon parity; Requires N+2 drives to implement because of the dual parity scheme.

Level X+Y
It is possible to combine various RAID levels to optimise data security and/or performance. E.g. Level 0+1 and 1+0 as explained below.

Level 0+1: Striping and Mirroring
Level 0+1 combines level 0 and level 1 by mirroring a striped volume. Level 0+1 provides read and write performance very close (or equal) to level 0. Level 0+1 should not be confused with level 1+0. If there is 1 mirror set, a single drive failure will cause the whole array to become, in essence, a level 0 array. Level 0+1 requires an even number of drives and a minimum of 4.

Disadvantages: RAID 0+1 is NOT to be confused with RAID 10. A single drive failure will cause the whole array to become, in essence, a RAID level 0 array; Very expensive / High overhead; All drives must move in parallel to the proper track, lowering sustained performance; Very limited scalability at very high inherent cost.

Level 1+0: Striping and Mirroring
Level 1+0 (sometimes referred to as level 10) combines level 0 and level 1 by striping a mirrored volume. Level 1+0 has better data security than level 0+1. The reason for this is that the level 1+0 controller can take advantage of a partial mirror set, but the level 0+1 controller cannot take advantage of a partial stripe set.

Disadvantages: Very expensive / High overhead; All drives must move in parallel to the proper track, lowering sustained performance; Very limited scalability at very high inherent cost.
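The difference in data security between level 0+1 and level 1+0 can be checked by counting which drive failures each layout survives. The Python sketch below does this for a 4-drive array, assuming drives 0 and 1 form one group and drives 2 and 3 the other (mirror pairs for 1+0, stripe sets for 0+1). It also assumes, as stated above, that a 0+1 controller cannot use a partial stripe set. All names are hypothetical.

```python
from itertools import combinations

# Drives 0-1 form the first group, drives 2-3 the second.
GROUPS = [(0, 1), (2, 3)]


def survives_1_plus_0(failed):
    """1+0: every mirror pair must keep at least one working drive."""
    return all(any(d not in failed for d in pair) for pair in GROUPS)


def survives_0_plus_1(failed):
    """0+1: at least one stripe set must be completely intact."""
    return any(all(d not in failed for d in stripe_set) for stripe_set in GROUPS)


def count_survivable(survives, failures, drives=4):
    """Count the failure combinations the layout survives."""
    return sum(survives(set(c)) for c in combinations(range(drives), failures))


for name, check in (("1+0", survives_1_plus_0), ("0+1", survives_0_plus_1)):
    print(name, "survives", count_survivable(check, failures=2), "of 6 double failures")
```

Under these assumptions, both layouts survive any single drive failure, but level 1+0 survives 4 of the 6 possible double failures while level 0+1 survives only 2.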

JBOD: Just a Bunch Of Drives
Not actually RAID, but some RAID controllers support this. In JBOD, 2 or more drives, which can be of any size, are put together so that they appear as a single drive whose capacity is the sum of the individual drives. Since JBOD provides no performance increase and reduced data security, it is seldom used.

3. Can I use different sized/typed disks for my array?
Yes, but for all levels (except JBOD) you will lose some capacity on the largest drives.
For level 0, total capacity is equal to the stripe width times the smallest drive.
For level 1, total capacity is equal to the smallest drive.
For level 0+1, total capacity is equal to the stripe width times the smallest drive.
For level 5, total capacity is equal to the number of drives “minus 1” times the smallest drive.
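The capacity rules above can be written down as a small Python helper. This is only a sketch: the function name usable_capacity is made up, sizes are in whatever unit you pass in, and level 0+1 is assumed to have a single mirror, so its stripe width is half the number of drives.

```python
def usable_capacity(level, drive_sizes):
    """Total usable capacity of an array built from the given drives."""
    count = len(drive_sizes)
    smallest = min(drive_sizes)
    if level == "JBOD":
        return sum(drive_sizes)          # no capacity is lost
    if level == "0":
        return count * smallest          # stripe width x smallest drive
    if level == "1":
        return smallest
    if level == "0+1":
        if count < 4 or count % 2:
            raise ValueError("level 0+1 needs an even number of drives, minimum 4")
        return (count // 2) * smallest   # assumes one mirror of the stripe set
    if level == "5":
        if count < 3:
            raise ValueError("level 5 needs at least 3 drives")
        return (count - 1) * smallest
    raise ValueError("unsupported level: %r" % (level,))


# Mixing an 80 GB and a 120 GB drive:
print(usable_capacity("0", [80, 120]))     # 40 GB of the larger drive is unused
print(usable_capacity("JBOD", [80, 120]))
```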

4. Can I change my array after I have put data on it?
The stripe size or stripe width of a level 0 or level 0+1 array cannot be changed without rebuilding the array, which will cause all data to be lost. For level 1 and level 0+1, additional mirror drives can be added to provide additional data security. This will not cause any data to be lost.

5. Software RAID vs. hardware RAID. Which is better?
For the most part, definitely hardware RAID. Software RAID does have a few advantages, but it is beyond the scope of this FAQ to discuss them further.

6. I want to set up a level 0 RAID. Which stripe and cluster size should I use?
It depends on what the array is going to be used for. In general, if the array is used for very large files (video streaming etc.), a larger stripe size is better. For mainstream usage (office, gaming etc.), a stripe and cluster size in the 8-32 kB range is a common choice. To some extent, the optimum stripe and cluster size combination also depends on the RAID controller and drives.

7. How do I set up/partition a level 0 RAID array, and install my OS on it?
The easy way:
1) Attach the drives to the RAID controller. Each drive should be master on its own channel (separate cable) for maximum performance.
2) Enter the RAID controller BIOS (usually you press CTRL+H after powering on the PC). Set up the RAID 0 array with your preferred stripe size. The exact way of doing this depends on the controller. Note: Some controllers (e.g. the Promise-lite) do not allow you to change the stripe size.
3) Make sure you have a floppy with the RAID drivers. Boot from the OS installation CD, and when prompted press ‘F6’ to install third party RAID or SCSI drivers. Insert the floppy.
4) Using the installation program partition and format the drive.
5) Proceed with installing the OS on the boot partition.

The problem with the above method is that you cannot specify the desired cluster size when formatting (for NTFS the default cluster size is 4 kB). If you choose to use NTFS, it is not possible to change the cluster size without reformatting the drive. For FAT32, the cluster size can be changed at a later time with programs like Partition Magic.

If you want to use NTFS, or do some benchmarks with different stripe and cluster size combinations, the recommended method requires a third, temporary drive:

1) Attach the drives to the RAID controller. Each drive should be master on its own channel (separate cable) for maximum performance.
2) Attach the temporary drive to the normal IDE controller.
3) Enter the RAID controller BIOS. Set up the RAID 0 array with your preferred stripe size.
4) Install the OS on the temporary drive.
5) Boot from the temporary drive. When the OS is up and running, install the RAID drivers.
6) Partition and format the RAID array with the preferred cluster size. In Windows XP, Disk Management provides the means to partition drives and format them with a custom cluster size.
7) Optionally perform benchmarks on the array. Reformat the drive with a different cluster size or rebuild the array with a different stripe size. When the array is partitioned and formatted, the temporary drive can be removed.
8) Make sure you have a floppy with the RAID drivers. Boot from the OS installation CD, and when prompted press ‘F6’ to install third party RAID or SCSI drivers. Insert the floppy.
9) Install the OS on the boot partition of the RAID array. Make sure you do not format the array during installation, since this will reset the cluster size to the default value.

8. For more details about RAID, please refer to the “RAID Array & Server Glossary”.

Recommended Western Digital External Hard Drives

The main reasons to use an external hard drive:

  • Expand your computer’s storage capacity;
  • Backup your data and share data between computers;
  • Easy to use. Most of the time, you just need to plug it into the computer and use it like an internal hard drive.

Desktop external hard drives are based on 3.5-inch internal hard drives, while laptop (or portable) external hard drives are based on 2.5-inch internal hard drives. Generally, external hard drives connect to a computer through one of these interfaces: USB 3.0, USB 2.0, FireWire 400, FireWire 800, or eSATA.

Here’s the list of our current favorites.

Western Digital WD Elements 320 GB USB 2.0 Portable External Hard Drive

Original Price: $99.99
Price: $59.99 on Amazon.com
Model: WDBAAR3200ABK

Western Digital WD Elements 500 GB USB 2.0 Portable External Hard Drive

Original Price: $129.99
Price: $74.99 on Amazon.com
Model: WDBAAR5000ABK

Western Digital WD Elements 1 TB USB 2.0 Desktop External Hard Drive

Original Price: $129.99
Price: $74.99 on Amazon.com
Free Standard Shipping
Model: WDBAAU0010HBK

Western Digital WD Elements 1.5 TB USB 2.0 Desktop External Hard Drive

Original Price: $169.99
Price: $106.49 on Amazon.com
Model: WDBAAU0015HBK

Western Digital WD Elements 2 TB USB 2.0 Desktop External Hard Drive

Original Price: $229.99
Price: $129.99 on Amazon.com
Model: WDBAAU0020HBK

Western Digital My Passport Essential SE 1 TB USB 2.0 Portable External Hard Drive

Original Price: $199.99
Price: $139.99 on Amazon.com
Model: WDBABM0010BBK

Western Digital My Book AV DVR Expander 1 TB USB 2.0/eSATA Desktop External Hard Drive

Original Price: $127.60
Price: $119.00 on Amazon.com
Model: WDBABT0010HBK

Western Digital My DVR Expander 1 TB eSATA Desktop External Hard Drive

Original Price: $159.99
Price: $119.00 on Amazon.com
Model: WDG1S10000VN

Western Digital Elements SE 1 TB USB 2.0 Portable External Hard Drive

Original Price: $149.99
Price: $119.99 on Amazon.com
Model: WDBABV0010BBK

Western Digital My Book for Mac 1 TB USB 2.0 Desktop External Hard Drive

Original Price: $129.99
Price: $99.00 on Amazon.com
Model: WDBAAG0010HCH

See my other post: USB 2.0, USB 3.0, or FireWire – What is the recommended solution for data storage.

Slow transfer rates from my USB 2.0 drive in Windows

Problem:
A USB 2.0 drive appears to be performing slow data transfers.

Cause:
The drive may be running at USB 1.1 speeds.

Resolution:
Make sure that your system is configured to control the drive under the USB 2.0 specification.

  • Verify that you have a USB 2.0 host adapter or motherboard.
  • If the motherboard has embedded USB 2.0 ports, be sure the drive is connected to the USB 2.0 ports. Some motherboards have 1.1 ports in one area and 2.0 ports in another (they look the same).
  • Contact the motherboard or host adapter manufacturer to verify that the proper USB 2.0 controller drivers are loaded correctly.

Note: If you’ve checked with your motherboard or computer manufacturer, and have found that you only have USB 1.1 capability, you will need to install a USB 2.0 compliant PCI card (desktops), or a USB 2.0 PCMCIA card (laptops) in order to get USB 2.0 transfer rates.


Hitachi Hard Disk Drive Business is now Western Digital

March 7, 2011 – Hitachi transfers hard disk drive business to Western Digital.

Western Digital will acquire all shares of Hitachi Global Storage Technologies’ holding company, Viviti Technologies Ltd. The proposed combination will result in a customer-centric storage company, with significant operating scale, strong global talent and the industry’s broadest product lineup backed by a rich technology portfolio.

Under terms of the agreement, WD will acquire Hitachi GST for $3.5 billion in cash and 25 million WD common shares valued at $750 million, based on WD closing stock price of $30.01 as of March 4, 2011. Hitachi will own approximately ten percent of WD shares and hold two seats on the WD board of directors. Steve Milligan, president and chief executive officer of Hitachi GST, will join WD’s existing senior management team as president.

“The acquisition of Hitachi GST is a unique opportunity for WD to create further value for our customers, shareholders, employees, suppliers and the communities in which we operate. We believe this step will result in several key benefits: enhanced R&D capabilities, innovation and expansion of a rich product portfolio, comprehensive market coverage and scale that will enhance our cost structure and ability to compete in a dynamic marketplace. The skills and contributions of both workforces were key considerations in assessing this compelling opportunity. We will be relying on the proven integration capabilities of both companies to assure the ongoing satisfaction of our customers and to bring this combination to successful fruition.” – said John Coyne, president and chief executive officer of WD.

“This combination will bring together two industry leaders with consistent track records of strong execution and industry outperformance, together we can provide customers worldwide with the industry’s most compelling and diverse set of products and services, from innovative personal storage to Solid State Drives for the Enterprise.” – said Steve Milligan, president and chief executive officer, Hitachi GST.


Why Does Data Use More Space On Larger Drives Than Smaller Drives?

The reason the data takes up more space has to do with the cluster sizes used to store data. Microsoft operating systems using the FAT32 file system use varying cluster sizes depending on the size of the partition.

The following chart gives a breakdown of the partition/cluster size relationship using FAT32:

Partition Size               Cluster Size
512 MB – 8192 MB (8 GB)      4 KB
8193 MB – 16384 MB           8 KB
16385 MB – 32769 MB          16 KB
Greater than 32769 MB        32 KB

A cluster is the smallest unit used by the operating system to store data. Each piece of data, regardless of how small, uses at least one full cluster. For example, if you have a 6 GB partition in FAT32, it will have 4K clusters. If a file stored on that partition is 3K, the entire 4K cluster will be used. On the other hand, with an 80 GB partition using 32K clusters, that same 3K file still uses one full cluster (32K). You can see that with larger cluster sizes there is the potential for more wasted space. In most cases, this is not a problem as most files will not be that small. If a file needs multiple clusters, the system will use as many clusters as necessary for the file, leaving wasted space in the last cluster used.
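The 3K file example can be worked through with a few lines of Python. This is just a sketch of the arithmetic: the cluster sizes are taken from the chart above, and the function names are invented for the example.

```python
import math


def fat32_cluster_kb(partition_mb):
    """Cluster size in KB for a FAT32 partition, per the chart above."""
    if partition_mb < 512:
        raise ValueError("the chart starts at 512 MB")
    if partition_mb <= 8192:
        return 4
    if partition_mb <= 16384:
        return 8
    if partition_mb <= 32769:
        return 16
    return 32


def allocated_kb(file_kb, cluster_kb):
    """Disk space a non-empty file occupies: whole clusters only."""
    if file_kb <= 0:
        raise ValueError("file size must be positive")
    return math.ceil(file_kb / cluster_kb) * cluster_kb


for partition_gb in (6, 80):
    cluster = fat32_cluster_kb(partition_gb * 1024)
    used = allocated_kb(3, cluster)
    print(partition_gb, "GB partition:", used, "KB used,", used - 3, "KB wasted")
```

For the 3K file, this gives 1K of wasted space on the 6 GB partition and 29K on the 80 GB partition.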

Clusters are sized in this way to balance speed and efficiency. If the larger partitions still used the smaller 4K clusters, utilities like ScanDisk, Defrag, etc. would take hours to complete.
