What is Data Carving?

Data Carving is a technique used in the field of Computer Forensics when data cannot be identified or extracted from media by “normal” means, because the file system allocation information that identifies the sectors or clusters belonging to the file or data is no longer available.

Currently, the most popular method of Data Carving searches the raw data for the file signature(s) of the file types you wish to find and carve out. Since the file system has no information on the size of the file being carved, current methods require you to specify a block size of data to “carve” upon finding the desired signature.
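
The signature search described above can be sketched in a few lines of Python. This is only an illustration of the general approach, not the code of any particular forensic tool; the function name carve_fixed_blocks and its parameters are hypothetical, and the raw image is assumed to fit in memory.

```python
from typing import List, Tuple


def carve_fixed_blocks(raw: bytes, signature: bytes,
                       block_size: int) -> List[Tuple[int, bytes]]:
    """Return (offset, carved_bytes) for every occurrence of `signature`.

    Hypothetical helper: each hit is carved as a block of at most
    `block_size` bytes starting at the signature, with no knowledge
    of where the file really ends.
    """
    hits = []
    pos = raw.find(signature)
    while pos != -1:
        # Carve a block starting at the signature; the slice is
        # shorter only when the end of the raw data is reached.
        hits.append((pos, raw[pos:pos + block_size]))
        pos = raw.find(signature, pos + 1)
    return hits


if __name__ == "__main__":
    # Tiny synthetic "image": two JPEG-style signatures (FF D8 FF)
    # separated by filler bytes.
    jpeg_sig = b"\xff\xd8\xff"
    image = b"\x00" * 10 + jpeg_sig + b"A" * 20 + jpeg_sig + b"B" * 5
    for offset, block in carve_fixed_blocks(image, jpeg_sig, 8):
        print(offset, len(block))
```

Note that the block size is chosen by the user and has no relation to the real length of the file, so a carved block may be cut short or may run into unrelated data.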

This current method relies on some assumptions:

1) that the beginning of the file, which is where the signature resides, is still present;

2) that the signature you are searching for is not so common that the same string of characters appears in many other files, thereby creating many “false hits”; and

3) that the files identified through the signature search are contiguous and not fragmented.

In addition to the assumptions listed above, the current Data Carving methods also rely on the user making adjustments to the “block size” they are carving out for a specific file signature.

As files are identified through a search, they are typically reviewed manually by opening them in a program capable of viewing the specified file type. This manual review tells the examiner whether they need to “carve” a larger or smaller block of data in order to carve a given file in its entirety.

This current process is not optimal, as it relies on guesswork and a lot of trial and error on the part of the forensic examiner.

In this paper, submitted for the 2006 DFRWS Data Carving Challenge, I will look at the process of Advanced (Smart) Data Carving, which removes the guesswork when carving certain compound file formats that contain information about the size and layout of the file in question, regardless of whether file system allocation information exists for the file.
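
As a minimal illustration of how a file format can describe its own size, consider the Zip format. Its End of Central Directory record stores the size and offset of the central directory and the length of the archive comment, which is enough to compute the total length of the archive. The sketch below is not the procedure used in this submission; it assumes a contiguous, single-disk, non-Zip64 archive whose central directory offset is measured from the start of the archive, and the function name zip_length is hypothetical.

```python
import struct
from typing import Optional

LOCAL_SIG = b"PK\x03\x04"  # local file header signature
EOCD_SIG = b"PK\x05\x06"   # End of Central Directory signature
EOCD_LEN = 22              # fixed part of the EOCD record, in bytes


def zip_length(raw: bytes, start: int) -> Optional[int]:
    """Return the length of a contiguous Zip archive beginning at `start`.

    Hypothetical helper: returns None if no End of Central Directory
    record consistent with `start` is found.
    """
    if raw[start:start + 4] != LOCAL_SIG:
        return None
    eocd = raw.find(EOCD_SIG, start)
    while eocd != -1:
        record = raw[eocd:eocd + EOCD_LEN]
        if len(record) == EOCD_LEN:
            # Bytes 12-21: central directory size, central directory
            # offset, and comment length (all little-endian).
            cd_size, cd_offset, comment_len = struct.unpack(
                "<IIH", record[12:22])
            # In a contiguous archive the central directory ends
            # exactly where the EOCD record begins.
            if start + cd_offset + cd_size == eocd:
                return eocd + EOCD_LEN + comment_len - start
        eocd = raw.find(EOCD_SIG, eocd + 1)
    return None
```

Given the offset of a Zip signature hit, raw[start:start + zip_length(raw, start)] would then carve exactly the archive, with no block size to guess. A fragmented archive fails the consistency check and returns None under these assumptions.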

The documents below, which detail the various file format specifications, were used to manually carve all files listed on pages 1-2 of this submission from the file “dfrws-2006-challenge.raw.”

X-Ways Forensics, which was used to manually carve and hash all files

http://www.x-ways.net/forensics/index-m.html

Office Document File Format Specification

http://sc.openoffice.org/compdocfileformat.pdf

Exif/Jpg File Format Specification

http://www.media.mit.edu/pia/Research/deepview/exif.html

Zip File Format Specification

http://www.pkware.com/business_and_developers/developer/popups/appnote.txt