Posts Tagged ‘Disk Imaging’

Disk Imaging: Image by selective head(s)

February 8th, 2009 Comments off

Drives presented for recovery sometimes have some heads or surfaces damaged (physical damage). The problem is severe enough for the drive to stop working in its native system. However, with the latest upgraded Data Compass DCEXP utility, it is possible now (before installation of MHA replacement) to create a copy of data using the remaining good surfaces or drive heads; small files can even be recovered directly.

What’s more, this procedure will also be essential in cases:

1. When the drive is having read instability problem: by reading the surfaces one by one, it greatly reduces the seek time & times, smoothes the reading of the data, thus in some cases the original “damaged” head or surface becomes available, and users will then have no need to perform a risky MHA replacement.

2. When users have only a donor MHA with some heads damaged also: we can use the native head to read the available data first and then replace it with the donor one for reading the available data; if these two MHAs happen to be a complement to each other (for example native MHA with head 1 damaged, and donor MHA happens to have head 1 remained good), even in fact we don’t have/use a good MHA at all, we can recover all the data on the drive.

See More: Image by selective heads

Image from the HDD with serious bad sectors

January 16th, 2009 Comments off

Image from the HDD with serious bad sectors in the Data Area by Data Compass.

File Type: PDF
Size: 1630K
Download >> Image from the HDD with serious bad sectors

Solve Disk Imaging Problems (Part 3)

January 4th, 2009 Comments off

Customizing Imaging Algorithms

Consider the following conflicting factors involved in disk imaging:

§§ A high number of read operations on a failed drive increase the chances of recovering all the data, and decrease the number of probable errors in that data.

§§ Intensive read operations increase the rate of disk degradation and increase the chance of catastrophic drive failure during the imaging process.

§§ Imaging a drive can take a long time (for example, one to two weeks) depending on the intensity of the read operations. Customers with time-sensitive needs may prefer to rebuild data themselves rather than wait for recovered data.

Clearly these points suggest the idea of an imaging algorithm that maximizes the probable data recovered for a given total read activity, taking into account the rate of disk degradation and the probability of catastrophic drive failure.

However, no universal algorithm exists. A good imaging procedure depends on such things as the nature of the drive problem, and the characteristics of the vendor-specific drive firmware. Moreover, a client is often interested in a small number of files on a drive and is willing to sacrifice the others to maximize the possibility of recovering those few files.To meet these concerns, the judgment of the imaging tool operator comes into play.

Drive imaging can consist of multiple read passes. A pass is one attempt to read the entire drive, although problem sectors may be read several times on a pass or not at all, depending on the configuration. The conflicting considerations mentioned above suggest that different algorithms, or at least different parameter values, should be used on each pass.

The first pass could be configured to read only error-free sectors. There is a fair possibility that the important files can be recovered faster in this way in just one pass. Moreover, this pass will not be read-intensive since only good sectors are read and the more intensive multiple reads needed to read problem sectors are avoided. This configuration reduces the chances of degrading the disk further during the pass (including the chances of catastrophic drive failure) while having a good chance at recovering much of the data.

Second and subsequent passes can then incrementally intensify the read processes, with the knowledge that the easily-readable data have already been imaged and are safe. For instance, the second pass may attempt multiple reads of sectors with the UNC or AMNF error (Figure 2). Sectors with the IDNF error are a less promising case, since the header could not be read and hence the sector could not be found. However, even in this case multiple attempts at reading the header might result in a success, leading to the data being read. Successful data recovery of sectors with different errors depends on the drive vendor. For example, drives from some vendors have a good recovery rate with sectors with the IDNF error, while others have virtually no recovery. Prior experience comes into play here, and the software should be configurable to allow different read commands and a varying number of reread attempts after encountering a specific error (UNC, AMNF, IDNF, or ABRT).

Drive firmware often has vendor-specific error-handling routines of its own that cannot be accessed directly by the system. While you may want to minimize drive activity to speed up imaging and prevent further degradation, drive firmware increases that activity and slows down the process when faced with read instability. To minimize drive activity, imaging software must implement a sector read timeout, which is a user-specified time before a reset command is sent to the drive to stop processing the current sector.

For example, you notice that good sectors are read in 10 ms. If this is a first pass, and your policy is to skip problem sectors at this point, the read timeout value might be 20 ms. If 20 ms have elapsed and the data has not yet been read, the sector is clearly corrupted in one way or another and the drive firmware has invoked its own error-handling routines. In other words, a sector read timeout can be used to identify problem sectors. If the read timeout is reached, the imaging software notes the sector and sends a reset command. After the drive cancels reading the current sector, the read process continues at the next sector.

By noting the sectors that timeout, the software can build up a map of problem sectors. The imaging algorithm can use this information during subsequent read passes.

In all cases the following parameters should be configurable:

  • Type of sectors read during this pass
  • Type of read command to apply to a sector
  • Number of read attempts
  • Number of sectors read per block
  • Sector read timeout value
  • Drive ready timeout value
  • Error-handling algorithm for problem sectors

Other parameters may also be configurable but this list identifies the most critical ones.

Imaging Hardware Minimizes Damage
In addition to the software described above, data recovery professionals also need specialized hardware to perform imaging in the presence of read instability. Drive firmware is often unstable in the presence of read instability, which may cause the drive to stop responding. To resolve this issue, the imaging system must have the ability to control the IDE reset line if the drive becomes unresponsive to software commands. Since modern computers are equipped with ATA controllers that do not have ability to control the IDE reset line, this functionality must be implemented with a specialized hardware. In cases where a drive does not even respond to a hardware reset, the hardware should also be able to repower the drive to facilitate a reset.

If the system software cannot deal with an unresponsive hard drive, it will also stop responding, requiring you to perform a manual reboot of the system each time in order to continue the imaging process. This issue is another reason for the imaging software to bypass the system software.

Both of these reset methods must be implemented by hardware but should be under software control. They could be activated by a drive ready timeout. Under normal circumstances the read timeout sends a software reset command to the drive as necessary. If this procedure fails and the drive ready timeout value is reached, the software directs the hardware to send a hardware reset, or to repower the drive. A software reset is least taxing on repower method is most taxing. A software reset minimizes drive activity while reading problem sectors, which reduces additional wear. A hardware reset or the repower method deals with an unresponsive hard drive.

Moreover, because reset methods are under software control via the user-configurable timeouts, the process is faster and there is no need for constant user supervision.

The drive ready timeout can also reduce the chances of drive self-destruction due to head-clicks, which is a major danger in drives with read instability. Head-clicks are essentially a firmware exception in which repeated improper head motion occurs, usually of large amplitude leading to rapid drive self-destruction. Head-clicks render the drive unresponsive and thus the drive ready timeout is reached and the software shuts the drive down, hopefully before damage has occurred. A useful addition to an imaging tool is the ability to detect head-clicks directly, so it can power down the drive immediately without waiting for a timeout, thus virtually eliminating the chances of drive loss.

Solve Disk Imaging Problems (Part 2)

January 4th, 2009 Comments off

Disabling Auto-Relocation and SMART Attribute Processing

While the methods outlined in the previous section go a long way to obtaining an image of the data, other problems remain.

When drive firmware identifies a bad sector, it may remap the sector to a reserved area on the disk that is hidden from the user (Figure 3). This remapping is recorded in the drive defects table (G-list). Since the bad sector could not be read, the data residing in the substitute sector in the reserved area is not the original data. It might be null data or some other data in accordance with the vendor-specific firmware policy, or even previously remapped data in the case where the G-list was modified due to corruption.

Moreover, system software is unaware of the remapping process. When the drive is asked to retrieve data from a sector identified as bad, the drive firmware may automatically redirect the request to the alternate sector in the reserved area, without notifying the system before the error is returned. This redirection occurs despite the fact that the bad sector is likely still readable and only contains a small number of bytes with errors.

disk imaging
Figure 3: G-List Remapping
This process performed by drive firmware is known as bad sector auto-relocation. This process can and should be turned off before the imaging process begins. Auto-relocation on a drive with read instability not only obscures instances when non-original data is being read, it is also time-consuming and increases drive wear, possibly leading to increased read instability.

Effective imaging software should be able to turn off auto-relocation so that it can identify problem sectors for itself and take appropriate action, which ensures that the original data is being read.

Unfortunately, the ATA specification does not have a command to turn off auto-relocation. Therefore imaging software should use vendor-specific ATA commands to do this.

A similar problem exists with Self-Monitoring Analysis and Reporting Technology (SMART) attributes. The drive firmware constantly recalculates SMART attributes and this processing creates a large amount of overhead that increases imaging time and the possibility of further drive degradation. Imaging software should be able to disable SMART attribute processing.

Other drive preconfiguration issues exist, but auto-relocation and SMART attributes are among the most important that imaging software should address.

Increasing Transfer Speed with UDMA Mode

Modern computers are equipped with drives and ATA controllers that are capable of the Ultra Direct Memory Access (UDMA) mode of data transfer. With Direct Memory Access (DMA), the processor is freed from the task of data transfers to and from memory. UDMA can be thought of as an advanced DMA mode, and is defined as data transfer occurring on both the rise and fall of the clock pulse, thus doubling the transfer speed compared to ordinary DMA.

Both DMA and UDMA modes are in contrast to the earlier Programmed Input Output (PIO) mode in which the processor must perform the data transfer itself. Faster UDMA modes also require an 80-pin connector, instead of the 40-pin connector required for slower UDMA and DMA.

The advantages are obvious. Not only does UDMA speed up data transfer, but the processor is free to perform other tasks.

While modern system software is capable of using UDMA, imaging software should be able to use this data transfer mode on a hardware-level (bypassing system software) as well. If the source and destination drives are on separate IDE channels, read and write transfers can occur simultaneously, doubling the speed of the imaging process. Also, with the computer processor free, imaged data can be processed on the fly. These two advantages can only be achieved if the imaging software handles DMA/UDMA modes, bypassing system software. Most imaging tools currently available on the market use system software to access the drive and so don’t have these advantages.

Solve Disk Imaging Problems (Part 1)

January 4th, 2009 Comments off

Imaging tools provide custom hardware and software solutions to the challenges of read instability.

Processing All Bytes in Sectors with Errors

While the system software works well with hard disk drives that are performing correctly, read instability must be dealt with using specialized software that bypasses the BIOS and operating system. This specialized software must be able to use ATA read commands that ignore ECC status. (These commands are only present in the ATA specification for LBA28 mode, though, and are not supported in LBA48.) Also, the software should have the capability of reading the drive error register (Figure 2), which the standard system software doesn’t provide access to. Reading the error register allows specialized software to employ different algorithms for different errors. For instance, in the case of the UNC error (“Uncorrectable Data: ECC error in data field, which could not be corrected”), the software can issue a read command that ignores ECC status. In many cases, the AMNF error can be dealt with in a similar way.

disk imaging

Bit 0 – Data Address Mark Not Found: During the read sector command, a data address mark was not found after finding the correct ID field for the requested sector (usually a media error or read instability).

Bit 1 – Track 0 Not Found: Track 0 was not found during drive recalibration.
Bit 2 – Aborted Command: The requested command was aborted due to a device status error.
Bit 3 – Not used (0).
Bit 4 – ID Not Found: The required cylinder, head, and sector could not be found, or an ECC error occurred in the ID field.
Bit 5 – Not used (0).
Bit 6 – Uncorrectable Data: An ECC error in the data field could not be corrected media error or read instability).
Bit 7 – Bad Mark Block: A bad sector mark was found in the ID field of the sector or an Interface CRC error occurred.

Figure 2: ATA Error Register

Only when the sector header has been corrupted (an IDNF error) is it unlikely that the sector can be read, since the drive in this case is unable to find the sector. Experience shows, however, that over 90 percent of all problems with sectors are due to errors in the data, not in the header, because the data constitutes the largest portion of the sector.

Also, the data area is constantly being rewritten, which increases read instability. Header contents usually stay constant throughout the life of the drive. As a result, over 90 percent of   sectors that are unreadable by ordinary means are in fact still readable. Moreover, problem sectors usually have a small number of bytes with errors, and these errors can often be corrected by a combination of multiple reads and statistical methods.

If a sector is read ten times and a particular byte returns the same value eight times out of ten, then that value is statistically likely to be the correct one. More sophisticated statistical techniques can also be useful.

In fact, some data recovery solution providers have encountered situations in which every sector of a damaged drive had a problem. This situation would normally leave the data totally unrecoverable, but with proper disk imaging, these drives were read and corrected in their entirety.

Unfortunately, most imaging tools currently available on the market use system software to access the drive and are therefore extremely limited in their imaging capabilities. These products attempt to read a sector several times in the hope that one read attempt will complete without an ECC error. But this approach is not very successful. For example, if there is even a one percent chance of any byte being read incorrectly, the chances of reading all 512 bytes correctly on one particular read is low (about 0.58 percent). In this case, an average of 170 read attempts would be necessary to yield one successful sector read. As read instability increases, the chances of a successful read quickly become very low. For example, if there were a ten percent chance of reading any byte wrong, an average of 2.7 x 1023 read attempts would be needed for one successful one. If one read attempt took 1 ms, this number of reads would take 8000 billion years, or many times the age of the universe. No practical number of read attempts will give a realistic chance of success.

However, if imaging software can read while ignoring ECC status, a 10 percent probability of a byte being read in error is not an issue. In 10 reads, very few bytes will return less than seven or eight consistent values, making it virtually certain that this value is the correct one. Thus, imaging software that bypasses the system software can deal with read instability much more easily.