Imaging tools provide custom hardware and software solutions to the challenges of read instability.
Processing All Bytes in Sectors with Errors
While the system software works well with hard disk drives that are performing correctly, read instability must be dealt with using specialized software that bypasses the BIOS and operating system. This specialized software must be able to use ATA read commands that ignore ECC status. (These commands are only present in the ATA specification for LBA28 mode, though, and are not supported in LBA48.) Also, the software should have the capability of reading the drive error register (Figure 2), which the standard system software doesn’t provide access to. Reading the error register allows specialized software to employ different algorithms for different errors. For instance, in the case of the UNC error (“Uncorrectable Data: ECC error in data field, which could not be corrected”), the software can issue a read command that ignores ECC status. In many cases, the AMNF error can be dealt with in a similar way.
Bit 0 – Data Address Mark Not Found: During the read sector command, a data address mark was not found after finding the correct ID field for the requested sector (usually a media error or read instability).
Bit 1 – Track 0 Not Found: Track 0 was not found during drive recalibration.
Bit 2 – Aborted Command: The requested command was aborted due to a device status error.
Bit 3 – Not used (0).
Bit 4 – ID Not Found: The required cylinder, head, and sector could not be found, or an ECC error occurred in the ID field.
Bit 5 – Not used (0).
Bit 6 – Uncorrectable Data: An ECC error in the data field could not be corrected media error or read instability).
Bit 7 – Bad Mark Block: A bad sector mark was found in the ID field of the sector or an Interface CRC error occurred.
Figure 2: ATA Error Register
Only when the sector header has been corrupted (an IDNF error) is it unlikely that the sector can be read, since the drive in this case is unable to find the sector. Experience shows, however, that over 90 percent of all problems with sectors are due to errors in the data, not in the header, because the data constitutes the largest portion of the sector.
Also, the data area is constantly being rewritten, which increases read instability. Header contents usually stay constant throughout the life of the drive. As a result, over 90 percent of sectors that are unreadable by ordinary means are in fact still readable. Moreover, problem sectors usually have a small number of bytes with errors, and these errors can often be corrected by a combination of multiple reads and statistical methods.
If a sector is read ten times and a particular byte returns the same value eight times out of ten, then that value is statistically likely to be the correct one. More sophisticated statistical techniques can also be useful.
In fact, some data recovery solution providers have encountered situations in which every sector of a damaged drive had a problem. This situation would normally leave the data totally unrecoverable, but with proper disk imaging, these drives were read and corrected in their entirety.
Unfortunately, most imaging tools currently available on the market use system software to access the drive and are therefore extremely limited in their imaging capabilities. These products attempt to read a sector several times in the hope that one read attempt will complete without an ECC error. But this approach is not very successful. For example, if there is even a one percent chance of any byte being read incorrectly, the chances of reading all 512 bytes correctly on one particular read is low (about 0.58 percent). In this case, an average of 170 read attempts would be necessary to yield one successful sector read. As read instability increases, the chances of a successful read quickly become very low. For example, if there were a ten percent chance of reading any byte wrong, an average of 2.7 x 1023 read attempts would be needed for one successful one. If one read attempt took 1 ms, this number of reads would take 8000 billion years, or many times the age of the universe. No practical number of read attempts will give a realistic chance of success.
However, if imaging software can read while ignoring ECC status, a 10 percent probability of a byte being read in error is not an issue. In 10 reads, very few bytes will return less than seven or eight consistent values, making it virtually certain that this value is the correct one. Thus, imaging software that bypasses the system software can deal with read instability much more easily.