RAID 1 Data Recovery

1. RAID 1 Data Recovery FAQ

Q: What is the definition of a “RAID 1” volume?
A: “RAID” stands for “Redundant Array of Inexpensive (or Independent) Disks”, and “RAID 1” refers to such an array configured at Level 1, that is, as a mirrored drive set. A RAID 1 volume is a set of disk drives configured so that the same data is written to two drives simultaneously. This configuration provides complete data redundancy in the event of a single drive failure.

Q: What is meant by the term “mirroring”?
A: Within a mirrored (RAID 1) volume, the exact same information that is written to one disk is also written to a second disk, creating a “mirror image”, or clone, of the original hard drive.

Q: How many drives are needed for a RAID 1 volume?
A: A minimum of two (2) hard drives is required to create and maintain a RAID 1 volume. Most RAID 1 volumes are built as a mirrored pair, but mirroring itself does not require an even number of drives: implementations that support it can keep three or more copies of the same data.

Q: What are the differences between “hardware” and “software” RAID 1 configurations?
A: With a software-based RAID 1 volume, the hard disk drives use a standard drive controller, and a software utility manages the drives in the volume. A hardware-based RAID 1 volume has a physical RAID controller (either an expansion card or part of the motherboard) that mirrors the data across the hard drives in the volume.

Q: What are the positive reasons for configuring drives as a RAID 1?
A: A RAID 1 (mirroring) set provides redundancy, or protection against one of the drives failing during use. With a RAID 1 disk volume, information is written to the first drive and to a second (or “mirror”) drive at the same time. If one of the hard drives in the mirror volume fails, the remaining hard drive can be placed in service as a single drive with no loss of information. Like a RAID 0 (striped) volume, a RAID 1 volume requires a minimum of two (2) drives.
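
The behavior described above can be illustrated with a toy model. This is only a sketch: the `MirrorVolume` class and its methods are hypothetical names invented for this example, drives are simulated as in-memory lists, and no real controller or RAID utility works exactly this way.

```python
class MirrorVolume:
    """Toy RAID 1 model: every write goes to all healthy members."""

    def __init__(self, members=2, blocks=8):
        # Each member is a list of blocks; None marks a failed member.
        self.disks = [[b""] * blocks for _ in range(members)]

    def write(self, block_no, data):
        # Mirror the write to every member that is still healthy.
        for disk in self.disks:
            if disk is not None:
                disk[block_no] = data

    def fail(self, member):
        # Simulate a drive failure.
        self.disks[member] = None

    def read(self, block_no):
        # Serve the read from the first healthy member.
        for disk in self.disks:
            if disk is not None:
                return disk[block_no]
        raise IOError("all mirror members have failed")


vol = MirrorVolume()
vol.write(3, b"payroll")
vol.fail(0)              # the first drive dies
print(vol.read(3))       # the surviving mirror still holds the block
```

The read succeeds after the first drive fails because the second member received the same write.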

Q: What are the arguments against RAID 1 configurations?
A: RAID 1 (mirroring) results in the loss of half of the physical storage capacity of the drives comprising a two-drive volume. For example, if two (2) 500GB hard drives are configured as a RAID 1 volume, only 500GB is available for data storage. Using the same drives in a RAID 0 (striped) configuration, total data storage would equal 1000GB (or approximately 1 terabyte). Also, if damaged or corrupted data is written to one drive, it is also written to the second drive; the same is true of accidental deletions and overwrites. Many people mistakenly assume that they are totally protected against data loss with a RAID 1 volume. A RAID 1 volume provides a measure of protection against data loss caused by a drive failure, but it does not eliminate the need for regular backup of critical data.
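
The capacity arithmetic above can be checked with a few lines of Python. The function name `usable_capacity_gb` is made up for this illustration, and the sketch assumes the usual simplification that every member contributes only as much space as the smallest drive in the set.

```python
def usable_capacity_gb(drive_sizes_gb, level):
    """Return usable capacity for a simple RAID 0 or RAID 1 set."""
    smallest = min(drive_sizes_gb)
    if level == "raid1":
        # Every member holds the same data, so only one drive's worth is usable.
        return smallest
    if level == "raid0":
        # Data is striped across all members, so their space adds up.
        return smallest * len(drive_sizes_gb)
    raise ValueError(f"unsupported level: {level}")


drives = [500, 500]
print("RAID 1:", usable_capacity_gb(drives, "raid1"), "GB")
print("RAID 0:", usable_capacity_gb(drives, "raid0"), "GB")
```

For two 500GB drives this gives 500GB for RAID 1 and 1000GB for RAID 0, matching the figures in the answer.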

Q: Can RAID 1 be combined with another type of RAID, such as RAID 0?
A: The combination of mirroring and striping is referred to as RAID 1+0 (also called RAID 10). In this configuration, drives are first grouped into mirrored (RAID 1) sets, and data is then striped (RAID 0) across two or more of those mirrored sets. This combination provides data redundancy and some speed advantages, but it does so at the expense of usable storage space. A RAID 1+0 volume provides somewhat more data protection than a RAID 0+1 volume (a mirror of striped sets), and needs a minimum of four (4) hard drives.
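
To make the RAID 1+0 layout more concrete, here is a deliberately simplified mapping from a logical block number to the drives that hold it. It assumes a stripe unit of one block and drives numbered so that drives 0 and 1 form the first mirrored pair, 2 and 3 the second, and so on. Real controllers use larger stripe units and their own layouts, so treat `raid10_locate` as a hypothetical helper, not as any vendor's algorithm.

```python
def raid10_locate(logical_block, mirror_pairs=2):
    """Map a logical block to (drives holding it, block offset on those drives)."""
    pair = logical_block % mirror_pairs      # striping: rotate across the pairs
    offset = logical_block // mirror_pairs   # position within each drive of the pair
    drives = (2 * pair, 2 * pair + 1)        # mirroring: both drives of the pair
    return drives, offset


for block in range(4):
    drives, offset = raid10_locate(block)
    print(f"logical block {block} -> drives {drives}, offset {offset}")
```

With four drives, consecutive logical blocks alternate between the two mirrored pairs, and each block exists on both drives of its pair.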

Q: Can data be recovered from a re-formatted RAID 1 volume?
A: In many cases the information is still recoverable, depending on how the drives were re-formatted. A high-level re-format (using Windows, for example) will create what appears to be a new “clean” volume, but the original data will still be on the disk in the “free and available” space. A low-level format routine (as performed using the controller software) will overwrite every sector and, in the process, destroy the original data.

Q: Could data recovery software utilities be used to recover my RAID 1?
A: Perhaps, but it wouldn’t be the safest approach. Most data recovery software requires the read/write heads to travel repeatedly over areas of the original disk, and if there is any physical damage, this could render the surfaces useless and beyond recovery. The safest method of recovering data from a failed or corrupted RAID 1 volume (or from any storage device) is to create a block-level copy of every sector on each hard drive. The copied images are then used to reconstruct the original volume and rescue the required files and directories. This approach, while more time-consuming, preserves the physical integrity of the drive media and limits the number of times that the original drive needs to be accessed.
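
The block-level copy described above can be sketched in Python as follows. This is a minimal illustration, not a recovery tool: `image_device` is a hypothetical helper, the 64 KiB block size is an arbitrary choice, and the demonstration images an ordinary temporary file rather than a real drive. Professional imaging also involves read retries, write blockers and hardware-level controls that are not shown here.

```python
import os
import tempfile


def image_device(source_path, image_path, block_size=64 * 1024):
    """Copy source to image block by block; return offsets that could not be read."""
    bad_offsets = []
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        size = src.seek(0, os.SEEK_END)   # total number of bytes to copy
        offset = 0
        while offset < size:
            want = min(block_size, size - offset)
            try:
                src.seek(offset)
                data = src.read(want)
            except OSError:
                data = b""                # treat the whole block as unreadable
            if len(data) < want:
                # Pad with zeros so later blocks keep their original offsets.
                bad_offsets.append(offset)
                data += b"\x00" * (want - len(data))
            dst.write(data)
            offset += want
    return bad_offsets


# Demonstration on a temporary file standing in for a mirror member.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "member0.bin")
    image = os.path.join(tmp, "member0.img")
    with open(source, "wb") as f:
        f.write(os.urandom(200_000))
    bad = image_device(source, image)
    print("unreadable block offsets:", bad)
```

Recording the unreadable offsets, rather than stopping at the first error, is what allows the same areas to be filled in later from the other mirror member.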

Q: With RAID 1, if both mirrored drives fail, can data still be recovered?
A: In many situations, data will be recoverable. The quality and integrity of the data recovered will depend on the extent of the damage to each failed storage device. If the mirrored volume was operating properly up to the point of failure, then there should be identical copies of the data on at least two (2) drives, which provides two chances to recover the same data.
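
The idea of having two chances at the same data can be sketched as a merge of two drive images. The example assumes that each drive has already been imaged, that unreadable blocks were zero-filled and their offsets recorded, and that the two images are the same length. `merge_mirror_images` and the tiny 4-byte block size are hypothetical choices made for readability.

```python
def merge_mirror_images(image_a, bad_a, image_b, bad_b, block_size=4):
    """Build one image, taking each block from whichever member read it cleanly."""
    merged = bytearray()
    unrecovered = []
    for offset in range(0, len(image_a), block_size):
        if offset not in bad_a:
            merged += image_a[offset:offset + block_size]
        elif offset not in bad_b:
            merged += image_b[offset:offset + block_size]
        else:
            # Both members failed at this offset; keep a zero-filled placeholder.
            unrecovered.append(offset)
            merged += b"\x00" * len(image_a[offset:offset + block_size])
    return bytes(merged), unrecovered


original = b"AAAABBBBCCCC"
image_a = b"AAAA\x00\x00\x00\x00CCCC"   # member A could not read the block at offset 4
image_b = b"\x00\x00\x00\x00BBBBCCCC"   # member B could not read the block at offset 0

merged, unrecovered = merge_mirror_images(image_a, {4}, image_b, {0})
print(merged == original, unrecovered)
```

Data is lost only where both members are damaged at the same offset, which is why a failed mirror with two damaged drives can often still be recovered.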

2. How RAID 1 Works and Why It Fails

RAID 1 creates an exact copy (or mirror) of a set of data on two or more disks. This is useful when read performance or reliability is more important than maximizing usable storage capacity. RAID 1 is often thought to be a foolproof method of data protection, but we commonly receive RAID 1 arrays that have failed due to:

  • corrupted mirrors
  • bad data copied from one drive to the other
  • a broken mirror that prevents the system from booting
  • an improper rebuild

The array can only be as big as its smallest member disk. A classic RAID 1 mirrored pair contains two disks, which greatly improves reliability over a single disk because data is lost only if both disks fail, and it is possible to keep more than two copies. Since each member can be addressed independently if the others fail, every additional member is another full copy that must also fail before data is lost. To get the full redundancy benefits of RAID 1, independent disk controllers are recommended, one for each disk, so that a single controller failure cannot take out every copy. Some refer to this practice as splitting or duplexing.
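
As a rough illustration of how additional mirror members affect the chance of losing data, the sketch below uses a deliberately simple model: each drive is assumed to fail independently, with the same probability, over some fixed period. Real drives often share a power supply, controller, age and workload, so their failures are not truly independent. The 5% figure is an arbitrary example value, not a measured failure rate, and `mirror_loss_probability` is a hypothetical helper.

```python
def mirror_loss_probability(p_single, members):
    """Probability that every member of the mirror fails (independence assumed)."""
    return p_single ** members


p = 0.05  # assumed chance that one drive fails during the period
for n in (1, 2, 3):
    print(f"{n} member(s): chance of total loss = {mirror_loss_probability(p, n):.6f}")
```

Under this model each extra member multiplies the chance of total loss by the per-drive failure probability. Correlated failures in practice make the real benefit smaller than the model suggests.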

When reading, both disks can be accessed independently. Because each disk holds exactly the same data, random read requests can be divided between the members, so in the ideal case the average seek time for a two-disk mirror is roughly halved and the transfer rate roughly doubled. With three disks, the ideal seek time would be about a third and the transfer rate about tripled. In practice, the gain is limited by how many disks can be connected to the controller and by its maximum transfer speed. Many older IDE RAID 1 cards read from only one disk in the pair, so their read performance is that of a single disk. Some older RAID 1 implementations would also read both disks simultaneously and compare the data to catch errors. The error detection and correction built into modern disks makes this less useful in environments requiring normal commercial availability.

When writing, the array performs like a single disk, because the data must be written to every mirror.

RAID 1 has several administrative advantages. For instance, in environments that run 24 hours a day, 365 days a year, it is possible to “split the mirror”: declare one disk inactive, back up that disk, and then “rebuild” the mirror. This requires that the application support recovery from the image of the data on the disk at the point of the mirror split. The procedure is less critical where the “snapshot” feature of some filesystems is available, in which some space is reserved for changes, presenting a static point-in-time view of the filesystem. Alternatively, a set of disks can be kept in much the same way as traditional backup tapes are.