Top 10 Data Recovery Bloopers

1. People Are the Problem, Not Technology
Disk drives today are typically reliable – human beings aren’t. A recent study found that approximately 15 percent of all unplanned downtime occurs because of human error.

2. When Worlds Collide
The company’s high-level IT executives purchased a “Cadillac” system, without knowing much about it. System implementation was left to a young and inexperienced IT team. When the crisis came, neither group could talk to the other about the system.

3. An Almost Perfect Plan
The company purchased and configured a high-end, expensive, and full-featured library for the company’s system backups. Unfortunately, the backup library was placed right beside the primary system. When the primary system got fried, so too did the backup library.

4. When the Crisis Deepens, People Do Sillier Things
When the office of a civil engineering firm was devastated by floods, its owners sent 17 soaked disks from three RAID arrays to a data recovery lab in plastic bags. For some reason, someone had frozen the bags before shipping them. As the disks thawed, even more damage was done.

5. It’s the Simple Things That Matter
The client, a successful business organization, purchased a “killer” UNIX network system, and put 300+ workers in place to manage it. Backups were done daily. Unfortunately, no one thought to put in place a system to restore the data to.

6. Buy Cheap, Pay Dearly
The organization bought an IBM system – but not from IBM. Then the system manager decided to configure the system uniquely, rather than following set procedures. When things went wrong with the system, it was next to impossible to recreate the configuration.

7. Lights Are On, But No One’s Home
A region-wide ambulance monitoring system suffered a serious disk failure, only for its operators to discover that the automated backup hadn’t run for fourteen months. A tape had jammed in the drive, but no one had noticed.

8. Hit Restore and All Will Be Well
After September’s WTC attacks, the company’s IT staff went across town to their backup system. They invoked Restore and proceeded to overwrite the backup with data from the destroyed main system. Of course, all previous backups were lost.

9. In a Crisis, People Do Silly Things
The prime server in a large urban hospital’s system crashed. When minor errors started occurring, system operators, instead of gathering data about the errors, tried anything and everything, including repeatedly invoking a controller function which erased the entire RAID array data.

10. The Truth, and Nothing But the Truth
After a data loss crisis, the company CEO and the IT staffer met with the data recovery team. No progress was made until the CEO was persuaded to leave the room. Then the IT staffer opened up, and solutions were developed.

HTML files and Text Files

After all known compound file formats have been carved, their sectors are bookmarked and removed from consideration as possibly belonging to text, HTML, or any other files. Using the “gather text” feature of X-Ways Forensics (or a similar feature found in a variety of existing forensic tools), text was then extracted from the remaining, unbookmarked sectors.

All .html and .txt files were manually carved and evaluated, since no compound file structure exists to identify the start, end, or location of structures within these files. Any fragmented text or .html files were manually reassembled based on a review of their content.
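
The “gather text” step can be approximated in a few lines. The sketch below is a minimal, illustrative stand-in for that feature, not X-Ways’ implementation: it scans every sector that was not bookmarked during compound-file carving and keeps runs of printable characters. The sector size, run-length threshold, and `bookmarked` bookkeeping are assumptions.

```python
# Minimal "gather text" sketch: extract printable runs from unbookmarked sectors.
SECTOR = 512          # assumed sector size
MIN_RUN = 16          # assumed shortest printable run worth keeping

def is_printable(b: int) -> bool:
    return 0x20 <= b <= 0x7E or b in (0x09, 0x0A, 0x0D)

def gather_text(image_path: str, bookmarked: set) -> dict:
    """Return {sector_number: extracted_text_bytes} for sectors not bookmarked."""
    found = {}
    with open(image_path, "rb") as img:
        sector_no = 0
        while True:
            data = img.read(SECTOR)
            if not data:
                break
            if sector_no not in bookmarked:
                runs, run = [], bytearray()
                for b in data:
                    if is_printable(b):
                        run.append(b)
                    else:
                        if len(run) >= MIN_RUN:
                            runs.append(bytes(run))
                        run = bytearray()
                if len(run) >= MIN_RUN:
                    runs.append(bytes(run))
                if runs:
                    found[sector_no] = b" ".join(runs)
            sector_no += 1
    return found
```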

JPEG Files

Next, we will look at carving JPEG graphic files, as specified in the document “Description of Exif file format.” For complete details of the file format specification, please refer to the hyperlink to the document, listed on page 1 of this paper.

The JPEG graphic file starts with a Start of Image (SOI) signature of “FF D8”. Following the SOI is a series of “marker” blocks of data used for file information. Each of these markers begins with a signature “FF XX”, where “XX” identifies the type of marker. The 2 bytes following each marker header give the size of the marker data. The marker data immediately follows the size, and the next marker header “FF XX” immediately follows the previous marker data. There is no standard for how many markers will exist, but following the markers, the signature “FF DA” indicates the “Start of Stream” (SOS) marker. The SOS marker is followed by a 2-byte value giving the size of the SOS data, which is immediately followed by the image stream that makes up the graphic. The end of the image stream is marked by the signature “FF D9”.
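
As a quick illustration of that structure, the sketch below walks the marker chain of an already-carved JPEG buffer up to the SOS marker. It is a hedged, minimal sketch rather than a full parser; note that in the JPEG format the 2-byte length value counts the length bytes themselves as well as the payload, and a few markers (SOI, EOI, RSTn) carry no length field at all.

```python
import struct

STANDALONE = {0xD8, 0xD9, 0x01} | set(range(0xD0, 0xD8))  # markers with no length field

def walk_jpeg_markers(buf: bytes):
    """Yield (offset, marker_byte, segment_length) for each marker up to SOS.

    Assumes buf starts at the SOI ("FF D8") and that each non-standalone
    marker is followed by a 2-byte big-endian length that includes the
    length bytes themselves.
    """
    pos = 0
    while pos + 1 < len(buf):
        if buf[pos] != 0xFF:
            raise ValueError(f"expected marker at offset {pos:#x}")
        marker = buf[pos + 1]
        if marker in STANDALONE:
            yield pos, marker, 0
            pos += 2
            continue
        (length,) = struct.unpack(">H", buf[pos + 2:pos + 4])
        yield pos, marker, length
        if marker == 0xDA:        # SOS: the entropy-coded image stream follows
            break                 # and runs until the "FF D9" end marker
        pos += 2 + length
```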

In the event that a thumbnail graphic exists within the file, the thumbnail will have the exact same components as the full-size graphic, with “FF D8” indicating the start of the thumbnail and “FF D9” indicating the end of the thumbnail. Since thumbnails are significantly smaller, and less likely to experience fragmentation than their larger full-size parent graphic, they can be used as a comparison tool for evaluating what the entire JPEG graphic is supposed to look like, in the event you must do a manual visual review of the carved graphic.

By searching first for all locations of the “FF D8 FF” signature, you identify the beginning of each JPEG graphic. The reason for searching for “FF D8 FF” is that there are different versions of JPEG files, some that start with “FF D8 FF E0” and some with “FF D8 FF E1”; leaving off the 4th byte in your signature will catch all instances, but may result in some false hits.

Rather than carve a specific length of data, in this case we will start at the beginning signature and carve until we find “FF D9”. In the event of a non-fragmented JPEG graphic without a thumbnail, this will carve the whole file. If we slightly modify our logic by adding the rule “if ‘FF D8’ occurs again before ‘FF D9’, then carve to the second instance of ‘FF D9’” to our search for JPEGs, then we will carve entire files, including their thumbnails, as long as they are not fragmented. Without this “if” logic, the first search would stop carving at the end of the thumbnail and result in an invalid JPEG. In the event of a fragmented JPEG file, the above carving method results in either a partial JPEG file or a complete JPEG file that contains extraneous data in the middle of it.
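
A minimal sketch of that carving rule is shown below. It assumes the raw data is already in memory and only implements the header-to-end-marker logic described above, including the extra step for an embedded thumbnail; it does nothing to detect or repair fragmentation.

```python
SOI, EOI = b"\xFF\xD8\xFF", b"\xFF\xD9"

def carve_jpegs(raw: bytes):
    """Yield (start_offset, carved_bytes) for each candidate JPEG in raw data.

    Carve from "FF D8 FF" to "FF D9"; if another "FF D8" (an embedded Exif
    thumbnail) appears before the first "FF D9", carve to the second "FF D9"
    instead. Fragmented files still come out partial or with extraneous data.
    """
    start = raw.find(SOI)
    while start != -1:
        end = raw.find(EOI, start + 2)
        if end == -1:
            break
        # A nested SOI before this EOI means the first EOI only closes the
        # thumbnail, so extend the carve to the next EOI.
        if raw.find(b"\xFF\xD8", start + 2, end) != -1:
            second = raw.find(EOI, end + 2)
            if second != -1:
                end = second
        yield start, raw[start:end + 2]
        start = raw.find(SOI, end + 2)
```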

After carving all JPEG files based on these rules, we next quickly review which carved JPEG files are complete and which are fragmented and need further analysis. By carving all JPEG files to a folder, you can add that folder to a forensic tool that has partial graphic file viewing capabilities, such as the “Outside In” viewer that is built into many existing forensic tools. Using a gallery view, you can quickly identify which files are not displaying properly, only show a partial image, and require further analysis.

Once all fragmented or partial JPEGs were identified, manual visual inspection of each of these files was used to determine at what point the fragmentation occurred. This was done by approximating the percentage of the file that displayed correctly in the viewer before the corruption appeared. The raw data of the carved file was then reviewed at that approximate percentage of the file to identify where the valid graphic data ended. For this process it was assumed that the extraneous data started at an offset that is a multiple of 512 bytes from the beginning of the file. Once the extraneous data was identified, it was removed from the partial JPEG and re-evaluated as possible sector data for other fragmented files that had previously been identified.
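
To narrow down where to look in the raw data, a helper like the one below can translate the “it displays correctly up to roughly X percent” estimate into a short list of 512-byte-aligned offsets for manual review. The window size is an arbitrary assumption; the function only suggests review points, it does not cut the file.

```python
def candidate_fragment_offsets(carved_size: int, good_fraction: float,
                               window: int = 4, align: int = 512):
    """Return 512-byte-aligned offsets around the estimated corruption point.

    good_fraction is the rough fraction (0.0-1.0) of the image that still
    renders correctly in the viewer; align reflects the assumption above that
    extraneous data starts on a 512-byte boundary.
    """
    estimate = int(carved_size * good_fraction)
    nearest = (estimate // align) * align
    first = max(0, nearest - window * align)
    last = min(carved_size, nearest + window * align)
    return list(range(first, last + 1, align))
```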

MS Compound Document Files

(Includes documents, spreadsheets, templates and other MS office files)

Next, we will look at carving MS Compound Document (and spreadsheet) files, as specified in the document “OpenOffice.org’s Documentation of the Microsoft Compound Document File Format.” For complete details of the file format specification, please refer to the hyperlink to the document, listed on page 1 of this paper.

As quoted from the above referenced document, “Compound document files are used to structure the contents of a document in the file.  It is possible to divide the data into several streams, and to store these streams in different storages in the file.  This way compound document files support a complete file system inside the file, the streams are like files in a real file system, and the storages are like sub-directories.”

All streams of a compound document file are divided into sectors. Sectors may contain internal control data of the compound document or parts of the user data. The entire file consists of a compound document header and a list of all sectors following the header. The size of the sectors can be set in the header and is then fixed for all sectors.

Example:
 HEADER
SECTOR 0
SECTOR 1
SECTOR 2
SECTOR 3
SECTOR 4
SECTOR 5
SECTOR 6
…and so on…

As we discussed in the section on Zip files, if you know what you are looking for, and where you expect to find it within the file, you can determine exactly what data belongs to the file in question and whether or not there is fragmented data within the file.

We start by searching for the compound document header signature, “D0 CF 11 E0 A1 B1 1A E1,” to identify the beginning of each of the MS compound documents. Next, at offset 0x1E from the beginning of the header we find a 2-byte value that identifies the sector size used in the document, stored as a power of two (a value of 9, the most common, means 512 bytes per sector). Now, knowing the size of each sector that makes up the file, we can start looking for document structures and where within the file they should be located. As noted in the Zip file process mentioned earlier in this paper, the difference between the EXPECTED location of a structure and its ACTUAL location is the size of the fragmented data that doesn’t belong to the file.

At file offset 0x2C, we find the number of sectors used by the Sector Allocation Table (SAT). Next, at file offset 0x30 we find the starting sector number (within the file) of the file’s Directory. Another important file structure is the Short-Sector Allocation Table (SSAT), whose starting sector number is located at file offset 0x3C, followed by the number of sectors making up the SSAT, located at file offset 0x40. Not all compound documents utilize a SSAT, in which case you can ignore these 8 bytes. And lastly, we look at the Master Sector Allocation Table (MSAT), whose starting sector number is located at file offset 0x44, followed by the number of sectors making up the MSAT, located at file offset 0x48. The following 436 bytes of data, which make up the rest of the first 512 bytes of the compound document file, contain the first 109 sector IDs (SIDs) of the MSAT and start at file offset 0x4C.
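
The header fields just listed can be pulled out programmatically. The sketch below assumes the 512 bytes starting at the “D0 CF 11 E0 A1 B1 1A E1” signature have already been read into memory; the offsets follow the referenced specification, and all values are little-endian.

```python
import struct

CDF_SIG = bytes.fromhex("D0CF11E0A1B11AE1")

def parse_cdf_header(header: bytes) -> dict:
    """Decode the compound document header fields discussed above (sketch only)."""
    if not header.startswith(CDF_SIG):
        raise ValueError("not a compound document header")
    ssz = struct.unpack_from("<H", header, 0x1E)[0]    # sector size as a power of two
    return {
        "sector_size": 1 << ssz,                        # usually 2**9 = 512 bytes
        "sat_sector_count": struct.unpack_from("<I", header, 0x2C)[0],
        "directory_start_sid": struct.unpack_from("<i", header, 0x30)[0],
        "ssat_start_sid": struct.unpack_from("<i", header, 0x3C)[0],
        "ssat_sector_count": struct.unpack_from("<I", header, 0x40)[0],
        "msat_start_sid": struct.unpack_from("<i", header, 0x44)[0],
        "msat_sector_count": struct.unpack_from("<I", header, 0x48)[0],
        # the first 109 MSAT entries (SIDs of SAT sectors) fill the rest of the header
        "msat_first_109": list(struct.unpack_from("<109i", header, 0x4C)),
    }
```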

So, now that you know where certain items should be located, the next step is to locate them on the disk and find out whether they are at the expected sector number in relation to the start of the document.

First, using the first sector of the MSAT given by the 4-byte value at offset 0x4C, search for “01 00 00 00 02 00 00 00 03 00 00 00 04 00 00 00” to find the beginning of the MSAT, and compare the sector number at which you find the MSAT with the sector number of the start of the document plus the 4-byte value at offset 0x4C. If there is a difference, then fragmentation occurs before the start of the MSAT.

Secondly, search forward for the beginning of the Directory, starting from the document’s header. The signature for the start of the Directory is “52 00 6F 00 6F 00 74 00 20 00 45 00 6E 00 74 00 72 00 79 00” (“Root Entry” in Unicode/UTF-16). There may be leftover instances of Directory entries from previous file edits, so look for more than one instance of “Root Entry”. Once you find the sector number of the start of the Directory, subtract the sector number of the start of the document, and compare the result against the 4-byte value at file offset 0x30. If the result matches your 4-byte value, then no fragmentation exists between the start of the file and the Directory. If there is a difference, the difference is the amount of fragmented data that doesn’t belong to the document.
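
The same expected-versus-actual comparison can be expressed in code. The sketch below works in byte offsets rather than disk sectors and follows the specification’s rule that the sector with SecID n begins 512 bytes (the header) plus n × sector size after the start of the file. The `header_offset` parameter and the single-hit `find` are simplifying assumptions, and stale “Root Entry” strings from earlier edits would still need to be weeded out by hand.

```python
ROOT_ENTRY = "Root Entry".encode("utf-16-le")

def directory_fragmentation(raw: bytes, header_offset: int,
                            directory_start_sid: int, sector_size: int) -> int:
    """Return the number of extraneous bytes between the header and the Directory.

    Expected position: a sector with SecID n begins 512 bytes (the header)
    plus n * sector_size after the start of the compound document. If the
    first "Root Entry" string is found further out than expected, the
    difference is fragmented data that does not belong to the document.
    """
    expected = header_offset + 512 + directory_start_sid * sector_size
    actual = raw.find(ROOT_ENTRY, header_offset)
    if actual == -1:
        raise ValueError("Root Entry not found after this header")
    return actual - expected   # 0 means no fragmentation before the Directory
```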

And lastly, review of the individual Directory Entries for the starting sector numbers and stream size of the objects will assist in determining where, before or after each object, any file fragmentation occurs.

The largest object within the compound document is most likely the “WordDocument” object, or the “Workbook” object for spreadsheets. This means that if fragmentation exists within a large compound document, it is likely that the fragmentation occurs within those streams. As was mentioned earlier, the fragment can then be isolated through a process of elimination and/or a manual review of the carved data for a block, the size of your determined fragment, that doesn’t belong to the document.

The Directory is an array of directory entries. Each directory entry is 128 bytes long, and entries are listed in the order of their appearance in the document. Each entry identifies the starting sector number of its file object, at directory entry offset 0x74, and the size of that object (in bytes) at offset 0x78.
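
Iterating those 128-byte entries is straightforward. In the sketch below, the Directory stream is assumed to have been rebuilt already; the offsets for the start sector (0x74) and stream size (0x78) come from the text above, while the name-length field at offset 0x40 comes from the referenced specification.

```python
import struct

def iter_directory_entries(directory_stream: bytes):
    """Yield (name, start_sid, stream_size) for each 128-byte directory entry.

    The UTF-16 name occupies the first 64 bytes (its length in bytes is at
    offset 0x40), the starting SecID of the object's stream is at offset
    0x74, and the stream size in bytes is at offset 0x78.
    """
    for off in range(0, len(directory_stream) - 127, 128):
        entry = directory_stream[off:off + 128]
        name_len = struct.unpack_from("<H", entry, 0x40)[0]
        if name_len < 2:
            continue  # unused entry
        name = entry[:name_len - 2].decode("utf-16-le", errors="replace")
        start_sid = struct.unpack_from("<i", entry, 0x74)[0]
        size = struct.unpack_from("<I", entry, 0x78)[0]
        yield name, start_sid, size
```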

Zip Files

The first compound file format that we will look at is the Zip file, as specified in the document “APPNOTE.TXT – .ZIP File Format Specification”, revision date January 6, 2006, from PKWARE, Inc. For complete details of the file format specification, please refer to the hyperlink to the document, listed on page 1. The information described below applies to most common Zip files created with current versions of Zip archive utilities, such as WinZip.

A Zip file is broken into specific parts that can be searched for and identified by their separate signatures. The basic layout of a Zip file begins with the individual compressed files within the archive.

These individual files are known as “local files” and start with a local file header of “50 4B 03 04”, followed by the file data for the compressed local file and then by a data descriptor, which can be identified by the signature “50 4B 07 08”. This sequence of local file header, file data, and data descriptor continues for each local file within the archive. The local file header contains the value of the local file’s compressed file size (the size of the compressed data that follows the header), unless bit 3 of the 2-byte general purpose flag located at offset 0x06 in the local file header is set. If this bit is set, then the compressed size is stored in the “data descriptor” that immediately follows the local file’s data, and is also stored in a central directory record for the local file, as part of the central directory located after all individual local files in the archive.

The central directory at the end of each Zip archive can be identified by searching for the signature “50 4B 01 02”, which marks the beginning of each central directory record contained within the central directory. And lastly, the signature “50 4B 05 06” identifies the “End of Central Directory Record”, which records the size in bytes of the central directory and its starting offset in relation to the beginning of the first local file header in the archive.

Upon identifying the signature “50 4B 05 06”, and using the size and starting offset information in the “End of Central Directory Record”, you search backwards from the beginning of the “50 4B 05 06” signature the correct number of bytes (directory size + starting offset) and determine if that leaves you at the signature “50 4B 03 04”, which is the beginning of the first local file and the start of the archive.

The same check can also be performed in a forward manner: starting at the first “50 4B 03 04” you find, search forward to the first “50 4B 05 06” you find, and compare the distance between the two with the sum of the central directory size (the 4-byte value at offset 0x0C of the “End of Central Directory Record”) and the central directory’s starting offset (the 4-byte value at offset 0x10 of that record).
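
A minimal sketch of that forward check, under the assumption that the carved region is already in memory and contains exactly one archive, might look like this; the field offsets are the ones given in APPNOTE.TXT.

```python
import struct

LFH, EOCD = b"\x50\x4B\x03\x04", b"\x50\x4B\x05\x06"

def zip_fragment_size(raw: bytes) -> int:
    """Return the number of extraneous bytes inside a carved Zip region.

    From the first local file header to the End of Central Directory record,
    the span should equal central directory offset (EOCD offset 0x10) plus
    central directory size (EOCD offset 0x0C). Any excess is fragmented data.
    """
    start = raw.find(LFH)
    if start == -1:
        raise ValueError("no local file header found")
    eocd = raw.find(EOCD, start)
    if eocd == -1:
        raise ValueError("no End of Central Directory record found")
    cd_size = struct.unpack_from("<I", raw, eocd + 0x0C)[0]
    cd_offset = struct.unpack_from("<I", raw, eocd + 0x10)[0]
    expected_span = cd_offset + cd_size
    actual_span = eocd - start
    return actual_span - expected_span   # 0 means the archive is contiguous
```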

If the location of the “End of Central Directory Record” is at a further offset than your calculation, then you have a fragmented archive file. The difference between the actual location and your calculation is the size of the fragmented block of data that doesn’t belong to the archive file. The next step is determining where the fragment occurs and distinguishing between the archive data and the fragment(s) that don’t belong to the file.

To do this we next look at the data descriptor, if present, at the end of each local file in the archive, or at the individual central directory records for each local file in the central directory. The compressed size of each local file’s data is located at offset 0x14 of its central directory record, which starts with the signature “50 4B 01 02.”

Once you have determined the starting point of each local file in the archive from its signature “50 4B 03 04”, and you have determined the length of the local file’s data from either the data descriptor at the end of the local file or from the length stored in its central directory record at the end of the archive, you can determine which individual local file(s) contain the portion of the overall archive that is fragmented.

Starting from the first local file header and going forward by the length of that header (a fixed 30 bytes plus the file name and extra field lengths stored in it) plus the “size of compressed file” found in either of the two locations above, we should arrive at the start of the next local file header. If this brings you to the start of the next local file header, then this first local file is not fragmented. Continue with this method until there is a difference between the EXPECTED start of the next local file header and the ACTUAL start of that local file header. The size of the difference is the amount of fragmentation that has occurred. This difference is compared with the overall difference noted earlier, between the expected size of the archive and the location of the “End of Central Directory Record”, to determine whether this is the entire amount of fragmentation within the archive or whether more instances of fragmentation exist in other local files in the archive.
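
The walk described above can be sketched as follows. The header layout (flags at offset 6, compressed size at 18, name and extra field lengths at 26 and 28, fixed 30-byte header) comes from the Zip specification; handling of archives that defer sizes to the data descriptor is deliberately left out, and the single starting offset is assumed to be known from the earlier signature search.

```python
import struct

LFH = b"\x50\x4B\x03\x04"

def find_zip_fragmentation_points(raw: bytes, start: int):
    """Yield (offset, gap_in_bytes) wherever a local file is followed by data
    that does not belong to the archive.

    When bit 3 of the flags is set, the sizes live in the data descriptor or
    the central directory instead, so this simple walk stops there.
    """
    pos = start
    while raw.startswith(LFH, pos):
        flags, = struct.unpack_from("<H", raw, pos + 6)
        csize, = struct.unpack_from("<I", raw, pos + 18)
        nlen, elen = struct.unpack_from("<HH", raw, pos + 26)
        if flags & 0x08:
            return                      # sizes deferred to the data descriptor
        expected_next = pos + 30 + nlen + elen + csize
        actual_next = raw.find(LFH, expected_next)
        if actual_next == -1:
            return                      # central directory (or end of data) follows
        if actual_next != expected_next:
            yield expected_next, actual_next - expected_next
        pos = actual_next
```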

Once all individual local files in the archive that contain fragmentation are identified, and the size of the fragmentation is noted, you then review the sectors of the fragmented local files for a block of data, the size of the identified fragment, that doesn’t belong. Depending on the type of the fragmented data, this can be easier to determine in some cases than in others.

Top 10 Data Recovery Companies

Secure Data Recovery (Recommended, USA)
Data recovery service provider with data recovery lab locations throughout the United States. Our experience in the data recovery industry is unmatched. We have been operating since 1997 and offer world-class service and support. Our team of data recovery professionals is expert in providing advanced data recovery solutions. Our network of data recovery specialists provides fast, friendly, accurate, and reliable service.
http://www.securedatarecovery.com/

1. Kroll Ontrack®
Kroll Ontrack provides technology-driven services and software to help recover, search, analyze and produce data efficiently and cost-effectively. Commonly bridging the gap between technical and business professionals, Kroll Ontrack services a variety of customers in the legal, government, corporate and financial markets around the world.
http://www.ontrackdatarecovery.com/

2. R-Tools Technology Inc.
The leading provider of powerful data recovery, undelete, drive image, data security and PC privacy utilities for the Windows OS family.
http://www.r-tt.com/

3. DTIData
DTI DATA is the industry’s premier data recovery service and recovery software company for both physical and logical hard drive recovery.
http://www.dtidata.com/

4. SalvageData
SalvageData is the first and only US based ISO 9001:2000 certified data salvaging & recovery service lab in North America specializing in advanced data salvaging and recovery from all digital media storage types and formats.
http://salvagedata.com/

5. DriveSavers
DriveSavers is the worldwide leader in data recovery services and provides the fastest, most secure and reliable data recovery service available.
http://www.drivesavers.com/

6. Stellar Information Systems Limited
Stellar Information Systems Limited is an ISO 9001-2000 certified company specializing in data recovery and data protection services and solutions.
http://www.stellarinfo.com/

7. Data Clinic Ltd
Data Clinic Ltd provide you with a professional, cost effective and prompt data recovery service from crashed hard disks and other computer based media.
http://www.dataclinic.co.uk/

8. First Advantage Data Recovery Services (DRS)
With more than 25 years involvement in hard drive data recovery, Data Recovery Services has and will continue to lead the industry in those areas.
http://www.datarecovery.net/

9. CBL
CBL provide data recovery for failed hard drives in laptops, desktop computers, data servers, RAID arrays, tapes and all other data storage media.
http://www.cbltech.com/

10. Adroit Data Recovery Centre (ADRC) Pte Ltd
Adroit Data Recovery Centre (ADRC) Pte Ltd is a data recovery expert established in 1998.
http://www.adrc.net

What is Data Carving?

Data Carving is a technique used in the field of Computer Forensics when data cannot be identified or extracted from media by “normal” means, because the desired data no longer has file system allocation information available to identify the sectors or clusters that belong to the file or data.

Currently the most popular method of Data Carving involves searching through raw data for the file signature(s) of the file types you wish to find and carve out. Since the file system has no information on the size of the file being carved, current methods involve specifying a block size of data to “carve” upon finding the desired signature.
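
For reference, the conventional approach looks roughly like the sketch below. It is shown only to make its assumptions concrete (the block size and the example signature are arbitrary), not as any particular tool’s implementation.

```python
def naive_carve(raw: bytes, signature: bytes, block_size: int):
    """Classic header-signature carving with a fixed block size.

    The header must survive, the signature must be distinctive enough to
    avoid false hits, and the file must be contiguous; otherwise the
    fixed-size block carves too much or too little.
    """
    hits = []
    pos = raw.find(signature)
    while pos != -1:
        hits.append((pos, raw[pos:pos + block_size]))
        pos = raw.find(signature, pos + 1)
    return hits

# e.g. carve 64 KiB blocks at every JPEG header candidate (the size is a guess)
# candidates = naive_carve(disk_image, b"\xFF\xD8\xFF", 64 * 1024)
```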

This current method relies on some assumptions:

1) that the beginning of the file, which is where the signature resides, is still present;

2) the signature you are searching for is not so common that you would find the string of characters in many other files, thereby creating many “false hits”; and

3) that the files identified through the signature search are contiguous and not fragmented.

In addition to the issues listed above, current Data Carving methods also rely on the user making adjustments to the “block size” they are carving out for a specific file signature.

As files are identified through a search, they are typically reviewed manually by opening them in a program capable of viewing the specified file type. This manual review gives the examiner an idea of whether they need to “carve” a larger or smaller block of data for a given file in order to carve the file in its entirety.

This current process is not optimal, as it relies on guesswork and a lot of trial and error on the part of the forensic examiner.

In this paper, submitted for the 2006 DFRWS Data Carving Challenge, I will look at the process of Advanced (Smart) Data Carving, which removes the guesswork when carving certain compound file formats that contain information about the size and layout of the file in question, regardless of the existence of file system allocation information for the file.

The documents below, detailing the various file format specifications, were used to manually carve all files, listed on pages 1-2 of this submission, from the file “dfrws-2006-challenge.raw.”

X-Ways Forensics, which was used to manually carve and hash all files.

http://www.x-ways.net/forensics/index-m.html

Office Document File Format Specification

http://sc.openoffice.org/compdocfileformat.pdf

Exif/Jpg File Format Specification

http://www.media.mit.edu/pia/Research/deepview/exif.html

Zip File Format Specification

http://www.pkware.com/business_and_developers/developer/popups/appnote.txt

Why Does Data Loss Happen?

Physical damage

A wide variety of failures can cause physical damage to storage media. CD-ROMs can have their metallic substrate or dye layer scratched off; hard disks can suffer any of several mechanical failures, such as head crashes and failed motors; and tapes can simply break. Physical damage always causes at least some data loss, and in many cases the logical structures of the file system are damaged as well. This causes logical damage that must be dealt with before any files can be recovered.

Most physical damage cannot be repaired by end users. For example, opening a hard disk in a normal environment can allow dust to settle on the surface, causing further damage to the platters. Furthermore, end users generally do not have the hardware or technical expertise required to make these sorts of repairs; therefore, data recovery companies are consulted. These firms use Class 100 clean room facilities to protect the media while repairs are made, and tools such as magnetometers to manually read the bits off failed magnetic media. The extracted raw bits can be used to reconstruct a disk image, which can then be mounted to have its logical damage repaired. Once that is complete, the files can be extracted from the image.

Logical damage

Far more common than physical damage is logical damage to a file system. Logical damage is primarily caused by power outages that prevent file system structures from being completely written to the storage medium, but problems with hardware (especially RAID controllers) and drivers, as well as system crashes, can have the same effect. The result is that the file system is left in an inconsistent state. This can cause a variety of problems, such as strange behavior (e.g., infinitely recursive directories, drives reporting negative amounts of free space), system crashes, or an actual loss of data. Various programs exist to correct these inconsistencies, and most operating systems come with at least a rudimentary repair tool for their native file systems. Linux, for instance, comes with the fsck utility, and Microsoft Windows provides chkdsk. Third-party utilities are also available, and some can produce superior results by recovering data even when the disk cannot be recognized by the operating system’s repair utility.

Two main techniques are used by these repair programs. The first, consistency checking, involves scanning the logical structure of the disk and checking to make sure that it is consistent with its specification. For instance, in most file systems, a directory must have at least two entries: a dot (.) entry that points to itself, and a dot-dot (..) entry that points to its parent. A file system repair program can read each directory and make sure that these entries exist and point to the correct directories. If they do not, an error message can be printed and the problem corrected. Both chkdsk and fsck work in this fashion. This strategy suffers from a major problem, however; if the file system is sufficiently damaged, the consistency check can fail completely. In this case, the repair program may crash trying to deal with the mangled input, or it may not recognize the drive as having a valid file system at all.

The second technique for file system repair is to assume very little about the state of the file system to be analyzed, and to rebuild it from scratch using any hints that undamaged file system structures might provide. This strategy involves scanning the entire drive and making note of all file system structures and possible file boundaries, then trying to match what was located to the specifications of a working file system. Some third-party programs use this technique, which is notably slower than consistency checking. It can, however, recover data even when the logical structures are almost completely destroyed. This technique generally does not repair the underlying file system, but merely allows the data to be extracted from it to another storage device.

While most logical damage can be either repaired or worked around using these two techniques, data recovery software can never guarantee that no data loss will occur. For instance, in the FAT file system, when two files claim to share the same allocation unit (“cross-linked”), data loss for one of the files is essentially guaranteed.

The increased use of journaling file systems, such as NTFS 5.0, ext3, and XFS, is likely to reduce the incidence of logical damage. These file systems can always be “rolled back” to a consistent state, which means that the only data likely to be lost is what was in the drive’s cache at the time of the system failure. However, regular system maintenance should still include the use of a consistency checker, in case the file system software has an error that causes data corruption. Also, in certain situations even journaling file systems cannot guarantee consistency. For instance, if the physical disk delays writing data back or reorders writes in ways invisible to the file system (some disks report changes as flushed to the platters when they actually haven’t been), a power loss may still cause such errors to occur. (This is usually not a problem if the delay or reordering is done by the file system software’s own caching mechanisms.) The solution is to use hardware that doesn’t report data as written until it actually is, or to use disk controllers equipped with a battery backup so that waiting data can be written when power is restored. Alternatively, the entire system can be equipped with a battery backup (UPS) that may make it possible to keep the system running in such situations, or at least give it enough time to shut down properly.

And backing up your data is, of course, a good way to protect it.

But backup technology and practices can still fail to adequately protect data. Most computer users rely on backups and redundant storage technologies as their safety net in the event of data loss. For many users, these backups and storage strategies work as planned. Others, however, are not so lucky. Many people back up their data, only to find their backups useless at that crucial moment when they need to restore from them. These systems are designed for, and rely upon, a combination of technology and human intervention for success. For example, backup systems assume that the hardware is in working order. They assume that the user has the time and the technical expertise necessary to perform the backup properly. They also assume that the backup tape or CD-RW is in working order, and that the backup software is not corrupted. In reality, hardware can fail. Tapes and CD-RWs do not always work properly. Backup software can become corrupted. Users accidentally back up corrupted or incorrect information. Backups are not infallible and should not be relied upon absolutely.

Solve Disk Imaging Problems (Part 3)

Customizing Imaging Algorithms

Consider the following conflicting factors involved in disk imaging:

  • A high number of read operations on a failed drive increases the chances of recovering all the data and decreases the number of probable errors in that data.

  • Intensive read operations increase the rate of disk degradation and increase the chance of catastrophic drive failure during the imaging process.

  • Imaging a drive can take a long time (for example, one to two weeks) depending on the intensity of the read operations. Customers with time-sensitive needs may prefer to rebuild data themselves rather than wait for recovered data.

Clearly these points suggest the idea of an imaging algorithm that maximizes the probable data recovered for a given total read activity, taking into account the rate of disk degradation and the probability of catastrophic drive failure.

However, no universal algorithm exists. A good imaging procedure depends on such things as the nature of the drive problem and the characteristics of the vendor-specific drive firmware. Moreover, a client is often interested in a small number of files on a drive and is willing to sacrifice the others to maximize the possibility of recovering those few files. To meet these concerns, the judgment of the imaging tool operator comes into play.

Drive imaging can consist of multiple read passes. A pass is one attempt to read the entire drive, although problem sectors may be read several times on a pass or not at all, depending on the configuration. The conflicting considerations mentioned above suggest that different algorithms, or at least different parameter values, should be used on each pass.

The first pass could be configured to read only error-free sectors. There is a fair possibility that the important files can be recovered faster in this way in just one pass. Moreover, this pass will not be read-intensive since only good sectors are read and the more intensive multiple reads needed to read problem sectors are avoided. This configuration reduces the chances of degrading the disk further during the pass (including the chances of catastrophic drive failure) while having a good chance at recovering much of the data.

Second and subsequent passes can then incrementally intensify the read processes, with the knowledge that the easily-readable data have already been imaged and are safe. For instance, the second pass may attempt multiple reads of sectors with the UNC or AMNF error (Figure 2). Sectors with the IDNF error are a less promising case, since the header could not be read and hence the sector could not be found. However, even in this case multiple attempts at reading the header might result in a success, leading to the data being read. Successful data recovery of sectors with different errors depends on the drive vendor. For example, drives from some vendors have a good recovery rate with sectors with the IDNF error, while others have virtually no recovery. Prior experience comes into play here, and the software should be configurable to allow different read commands and a varying number of reread attempts after encountering a specific error (UNC, AMNF, IDNF, or ABRT).

Drive firmware often has vendor-specific error-handling routines of its own that cannot be accessed directly by the system. While you may want to minimize drive activity to speed up imaging and prevent further degradation, drive firmware increases that activity and slows down the process when faced with read instability. To minimize drive activity, imaging software must implement a sector read timeout, which is a user-specified time before a reset command is sent to the drive to stop processing the current sector.

For example, you notice that good sectors are read in 10 ms. If this is a first pass, and your policy is to skip problem sectors at this point, the read timeout value might be 20 ms. If 20 ms have elapsed and the data has not yet been read, the sector is clearly corrupted in one way or another and the drive firmware has invoked its own error-handling routines. In other words, a sector read timeout can be used to identify problem sectors. If the read timeout is reached, the imaging software notes the sector and sends a reset command. After the drive cancels reading the current sector, the read process continues at the next sector.

By noting the sectors that timeout, the software can build up a map of problem sectors. The imaging algorithm can use this information during subsequent read passes.
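
As an illustration of that bookkeeping, the sketch below images only the sectors that read cleanly on a first pass and records everything else in a problem-sector map for later passes. `read_sector` is a hypothetical stand-in for the hardware-level read routine described in this article (in practice it sits below the operating system), and the timeout value is an assumption.

```python
import time

def first_pass_good_sectors(read_sector, total_sectors: int,
                            timeout_s: float = 0.020):
    """Image easy sectors and build a map of problem sectors for later passes.

    read_sector(lba) is a hypothetical callable assumed to raise TimeoutError
    (after sending a reset) when the configured sector read timeout expires.
    """
    image = {}           # lba -> recovered sector data
    problem_map = []     # lbas to retry on later, more aggressive passes
    for lba in range(total_sectors):
        started = time.monotonic()
        try:
            image[lba] = read_sector(lba)
        except TimeoutError:
            problem_map.append(lba)
            continue
        if time.monotonic() - started > timeout_s:
            # slow-but-successful reads are also worth flagging for review
            problem_map.append(lba)
    return image, problem_map
```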

In all cases the following parameters should be configurable:

  • Type of sectors read during this pass
  • Type of read command to apply to a sector
  • Number of read attempts
  • Number of sectors read per block
  • Sector read timeout value
  • Drive ready timeout value
  • Error-handling algorithm for problem sectors

Other parameters may also be configurable but this list identifies the most critical ones.
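
Those parameters map naturally onto a per-pass configuration structure. The sketch below is only illustrative; the field names and defaults are assumptions, not those of any particular imaging tool.

```python
from dataclasses import dataclass, field

@dataclass
class PassConfig:
    """User-configurable parameters for one imaging pass (names are illustrative)."""
    sector_selection: str = "good-only"      # which sectors this pass reads
    read_command: str = "read-dma"           # type of read command to issue
    read_attempts: int = 1                   # rereads after an error
    sectors_per_block: int = 256             # sectors read per block
    sector_read_timeout_ms: int = 20         # software reset after this delay
    drive_ready_timeout_ms: int = 2000       # hardware reset / repower after this
    error_handling: dict = field(default_factory=lambda: {"UNC": "retry",
                                                          "AMNF": "retry",
                                                          "IDNF": "skip",
                                                          "ABRT": "skip"})

# Example: a gentle first pass followed by a more aggressive second pass.
first_pass = PassConfig()
second_pass = PassConfig(sector_selection="problem-map", read_attempts=5,
                         sector_read_timeout_ms=200)
```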

Imaging Hardware Minimizes Damage

In addition to the software described above, data recovery professionals also need specialized hardware to perform imaging in the presence of read instability. Drive firmware is often unstable when faced with read instability, which may cause the drive to stop responding. To resolve this issue, the imaging system must have the ability to control the IDE reset line if the drive becomes unresponsive to software commands. Since modern computers are equipped with ATA controllers that do not have the ability to control the IDE reset line, this functionality must be implemented with specialized hardware. In cases where a drive does not even respond to a hardware reset, the hardware should also be able to repower the drive to facilitate a reset.

If the system software cannot deal with an unresponsive hard drive, it will also stop responding, requiring you to perform a manual reboot of the system each time in order to continue the imaging process. This issue is another reason for the imaging software to bypass the system software.

Both of these reset methods must be implemented in hardware but should be under software control. They could be activated by a drive ready timeout. Under normal circumstances the read timeout sends a software reset command to the drive as necessary. If this procedure fails and the drive ready timeout value is reached, the software directs the hardware to send a hardware reset, or to repower the drive. A software reset is the least taxing option, while the repower method is the most taxing. A software reset minimizes drive activity while reading problem sectors, which reduces additional wear. A hardware reset or the repower method deals with an unresponsive hard drive.

Moreover, because reset methods are under software control via the user-configurable timeouts, the process is faster and there is no need for constant user supervision.

The drive ready timeout can also reduce the chances of drive self-destruction due to head-clicks, which is a major danger in drives with read instability. Head-clicks are essentially a firmware exception in which repeated improper head motion occurs, usually of large amplitude leading to rapid drive self-destruction. Head-clicks render the drive unresponsive and thus the drive ready timeout is reached and the software shuts the drive down, hopefully before damage has occurred. A useful addition to an imaging tool is the ability to detect head-clicks directly, so it can power down the drive immediately without waiting for a timeout, thus virtually eliminating the chances of drive loss.

Solve Disk Imaging Problems (Part 2)

Disabling Auto-Relocation and SMART Attribute Processing

While the methods outlined in the previous section go a long way to obtaining an image of the data, other problems remain.

When drive firmware identifies a bad sector, it may remap the sector to a reserved area on the disk that is hidden from the user (Figure 3). This remapping is recorded in the drive defects table (G-list). Since the bad sector could not be read, the data residing in the substitute sector in the reserved area is not the original data. It might be null data or some other data in accordance with the vendor-specific firmware policy, or even previously remapped data in the case where the G-list was modified due to corruption.

Moreover, system software is unaware of the remapping process. When the drive is asked to retrieve data from a sector identified as bad, the drive firmware may automatically redirect the request to the alternate sector in the reserved area, without notifying the system that anything has been substituted. This redirection occurs despite the fact that the bad sector is likely still readable and only contains a small number of bytes with errors.

Figure 3: G-List Remapping

This process performed by drive firmware is known as bad sector auto-relocation. This process can and should be turned off before the imaging process begins. Auto-relocation on a drive with read instability not only obscures instances when non-original data is being read, it is also time-consuming and increases drive wear, possibly leading to increased read instability.

Effective imaging software should be able to turn off auto-relocation so that it can identify problem sectors for itself and take appropriate action, which ensures that the original data is being read.

Unfortunately, the ATA specification does not have a command to turn off auto-relocation. Therefore imaging software should use vendor-specific ATA commands to do this.

A similar problem exists with Self-Monitoring Analysis and Reporting Technology (SMART) attributes. The drive firmware constantly recalculates SMART attributes and this processing creates a large amount of overhead that increases imaging time and the possibility of further drive degradation. Imaging software should be able to disable SMART attribute processing.

Other drive preconfiguration issues exist, but auto-relocation and SMART attributes are among the most important that imaging software should address.

Increasing Transfer Speed with UDMA Mode

Modern computers are equipped with drives and ATA controllers that are capable of the Ultra Direct Memory Access (UDMA) mode of data transfer. With Direct Memory Access (DMA), the processor is freed from the task of data transfers to and from memory. UDMA can be thought of as an advanced DMA mode, and is defined as data transfer occurring on both the rise and fall of the clock pulse, thus doubling the transfer speed compared to ordinary DMA.

Both DMA and UDMA modes are in contrast to the earlier Programmed Input Output (PIO) mode in which the processor must perform the data transfer itself. Faster UDMA modes also require an 80-pin connector, instead of the 40-pin connector required for slower UDMA and DMA.

The advantages are obvious. Not only does UDMA speed up data transfer, but the processor is free to perform other tasks.

While modern system software is capable of using UDMA, imaging software should be able to use this data transfer mode at the hardware level (bypassing system software) as well. If the source and destination drives are on separate IDE channels, read and write transfers can occur simultaneously, doubling the speed of the imaging process. Also, with the computer processor free, imaged data can be processed on the fly. These two advantages can only be achieved if the imaging software handles DMA/UDMA modes itself, bypassing system software. Most imaging tools currently available on the market use system software to access the drive and so don’t have these advantages.
