JPEG Files
Next we will look at carving JPEG graphic files, as specified in the document “Description of Exif file format.” For complete details of the file format specification, please refer to the hyperlink to the document, listed on page 1 of this paper.
The JPEG graphic file starts with a Start of Image (SOI) signature of “FF D8”. Following the SOI are a series of “Marker” blocks of data used for file information. Each of these “Markers” begin with a signature “FF XX”, where “XX” identifies the type of marker. The 2 bytes following each marker header is the size of the marker data. The marker data immediately follows the size and then the next marker header “FF XX” immediately follows the previous marker data. There is no standard as to how many markers will exist, but following the markers, the signature “FF DA” indicates the “Start of Stream” marker. The SOS marker is followed by a 2-byte value of the size of the SOS data and is immediately followed by the Image stream that makes up the graphic. The end of the image stream is marked by the signature “FF D9”.
In the event that a thumbnail graphic exists within the file, the thumbnail graphic will have the exact same components as the full-size graphic, with “FF D8” indicating the start of the thumbnail and “FF D9”, indicating the end of the thumbnail. Since thumbnails are significantly smaller and less likely to experience fragmentation than their larger parent full-size graphic, they can be used as a comparison tool for evaluating what the entire jpeg graphic is supposed to look like, in the event you must do a manual visual review of the carved graphic.
By searching first for all locations of the “FF D8 FF” signature, you identify the beginning of each jpeg graphic. The reason for searching for “FF D8 FF” is that there are different versions of jpeg files, some that start with “FF D8 FF E0” and some with “FF D8 FF E1”, and leaving off the 4th byte in your signature will catch all instances, but may result in some false hits.
Rather than carve a specific length of data, in this case we will start at the beginning signatureand carve until we find “FF D9”. In the event of a non-fragmented jpeg graphic, without a thumbnail, this will carve the whole file. If we slightly modify our logic, by including a “if “FF D8” occurs again before “FF D9”, then carve to the 2nd instance of “FF D9″” statement in our search for jpegs, then we will carve entire files including their thumbnail as long as they are not fragmented. Without this “if” logic, the first search would stop carving at the end of the thumbnail and result in an invalid jpeg. In the event of a fragmented jpeg file, the above carving method results in either a partial jpeg file or a complete jpeg file that contains extraneous data in the middle of it.
After carving all jpeg files based on these rules, we next quickly review which carved jpeg files are complete, versus which ones are fragmented and need further analysis. By carving all jpeg files to a folder, you next add that folder to your forensic tool that has partial graphic file viewing capabilities, such as the “Outside In” viewer that is built into many existing forensic tools. Using a gallery view, you can quickly identify which files are not displaying properly, only showing a partial file, and require further analysis.
Once all fragmented or partial jpegs are identified, manual visual inspection of each of these files was used to determine at what point the fragmentation occurred. This was done by approximating the percentage of the file that displayed correctly in the viewer before displaying corruptly. The raw data of the carved file was then reviewed at the data at that percentage of the file to attempt to identify where the valid graphic data ended. For this process it was assumed that the extraneous data started at an offset that was a multiple of 512-bytes from the beginning of the file. Once the extraneous data was identified, it was then removed from the partial jpeg and re-evaluated as possible sector data for other fragmented files that had previously been identified