In a recent decision, Keaton v. Hannum (S.D. Ind. Apr. 29, 2013), the court determined that it was unreasonable for the defendant to refuse to produce Gmail emails in native format, because she had previously produced emails in what the court called "a 'native' file for Gmail emails."
The court's use of quotation marks around the word "native" in this context indicates that it refers to a file produced in its original format, which usually retains all related metadata (data about data, including when the file was created, by whom, how, and so forth). Thus, a "native" spreadsheet created in Microsoft Excel would be produced in the .xls or .xlsx format, not as a TIFF or PDF file. A TIFF is essentially a screenshot of the file as it would appear on a computer monitor, akin to scanning an old hard-copy photograph and thereby eliminating any notes or details written on the back of the original photo. When you look at the electronic copy of the photo, you have only the image, with nothing to remind you who the people in the picture are or where and when it was taken, other than the date and time on which the image was scanned into the computer. A PDF produced by imaging a document works in much the same way. While both TIFF and PDF images can be made searchable for keywords, this requires the additional step of running the file through an optical character recognition ("OCR") program. Further, conversion to TIFF or PDF strips out the underlying metadata of the original file, information that is often critical to the requesting party. For example, when viewing a Word document on your computer, you can determine which user created the document, when it was created, when it was last edited, and sometimes how it was edited. Just like the scanned photograph, a PDF of the same document will give you no such information.
In a case that relies on evidence uncovered in email, it is important to know not only what was said between the parties, but also when they said it and from what devices it was sent. Metadata paints a clearer picture of the situation and often helps a party argue what motivated the people involved. In a money laundering investigation, for example, the prosecution would need to develop a clear timeline of events, which is possible only with email headers and metadata showing who sent what information to which recipient, and in what order.
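As a rough illustration of how email headers support a timeline, the sketch below reads the From, To, Date, and Subject headers of a few messages and sorts them by the time they were sent. It uses only Python's standard email library. The messages, addresses, and the build_timeline helper are hypothetical and invented for this example; they are not drawn from any case discussed here.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical raw messages, invented for illustration only.
# They are deliberately listed out of chronological order.
RAW_MESSAGES = [
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Date: Tue, 02 Apr 2013 09:15:00 -0400\n"
    "Subject: Re: transfer\n"
    "\n"
    "Done.\n",
    "From: bob@example.com\n"
    "To: alice@example.com\n"
    "Date: Mon, 01 Apr 2013 17:30:00 -0400\n"
    "Subject: transfer\n"
    "\n"
    "Please send it today.\n",
]


def build_timeline(raw_messages):
    """Return (sent_time, sender, recipient, subject) tuples, oldest first."""
    entries = []
    for raw in raw_messages:
        msg = message_from_string(raw)
        # The Date header carries the send time and its UTC offset.
        sent = parsedate_to_datetime(msg["Date"])
        entries.append((sent, msg["From"], msg["To"], msg["Subject"]))
    # Sort on the timestamp only, so ties never compare the other fields.
    return sorted(entries, key=lambda entry: entry[0])


timeline = build_timeline(RAW_MESSAGES)
for sent, sender, recipient, subject in timeline:
    print(sent.isoformat(), sender, "->", recipient, "|", subject)
```

A printout or image of the same messages that omitted these headers would leave the order of the exchange to be reconstructed by hand, if it could be reconstructed at all.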
While it is certainly possible to go on at great length about the meaning of a word such as "native" and its many possible uses within the legal profession, the precise definition matters little in this case. The broader lesson is that courts have a duty to balance the interests of cost, time, and fairness. It is crucially important for investigators to see the whole picture, metadata and all, but it is equally important that they understand the cost of searching for files and the implications of requesting a particular type of file. With each new technology or software application that becomes evidence, the court must determine the best way to balance these three interests.
In this case, the court desired a particular end result (production of the defendant's Gmail emails with basic metadata in place) and determined an acceptable method for producing that data (downloading the messages to Outlook). The defendant had previously produced Gmail emails in a PST file (a "Personal Storage Table" file, the format Microsoft Outlook uses to store emails and related items), and the court found this to be a reasonable method for producing further emails.
The court could have taken many directions in this case. The simplest solution, which is the one the court chose, was to repeat prior actions that had satisfied the necessary production requests. The court could also have required the defendant to purchase software to extract the relevant emails, but installing and operating that software could have required the defense to hire an outside expert or vendor who specializes in email extraction. Further, the court could have allowed the defendant to produce the emails in another format, such as PDF or TIFF, or even as hard copies. Such a production might have been acceptable if it included the mail headers, but data is easier to alter in those formats.
New technology requires flexibility. The court and the parties must determine what information they are seeking: is metadata necessary? Parties must also be able to document the production process so that a third party could replicate it if authenticity comes into question. Here, the court understood that there are ways to supply the necessary information to the requesting party without unduly burdening the producing party. Downloading the messages to Outlook was functionally the same as production through a professional vendor, and was clearly practical for the producing party.
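One common way to document a production so that it can be checked later is to record a cryptographic hash of each produced file. The sketch below is only an illustration of that idea and is not something the court ordered or discussed: it assumes SHA-256 as the hash, keeps the hypothetical files in memory rather than on disk, and uses invented helper names (build_manifest, verify).

```python
import hashlib


def build_manifest(files):
    """Map each file name to the SHA-256 hex digest of its bytes."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def verify(files, manifest):
    """Return the names whose current contents no longer match the manifest."""
    current = build_manifest(files)
    return sorted(name for name in manifest if current.get(name) != manifest[name])


# Hypothetical produced files, invented for illustration only.
produced = {
    "msg001.eml": b"Subject: transfer\n\nPlease send it today.\n",
    "msg002.eml": b"Subject: Re: transfer\n\nDone.\n",
}

# Recorded at the time of production and shared with the requesting party.
manifest = build_manifest(produced)

# Later, one message is altered; its hash no longer matches the record.
altered = dict(produced)
altered["msg002.eml"] = b"Subject: Re: transfer\n\nNot done.\n"
mismatches = verify(altered, manifest)
print(mismatches)
```

Anyone holding the manifest can rerun the same check against the files they received, which is one way a third party could confirm that a production has not changed since it was made.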
As new technologies and new software continue to emerge on the market, there will be hurdles to creating exact replicas of the underlying data for production in court. Tools that extract unique file types, and tools that present those files in a readable format, will become the weapons in the e-discovery vendor's arsenal. Whether the service is Instagram, YouTube, Twitter, or another new and popular web application, sooner or later information from it will make a court appearance, and making that evidence digestible to the audience is of critical importance.
Already, courts have ordered production of pages and accounts from the Facebook and MySpace social networking sites. In Romano v. Steelcase Inc., 907 N.Y.S.2d 650 (2010), the defendant requested production of the plaintiff's Facebook and MySpace accounts, and the court ordered that they be produced. The Romano plaintiff claimed that her injuries had prevented her from enjoying certain activities, but the defendant contended that the plaintiff had posted pictures on social networking sites of herself enjoying these activities after the alleged injury. While the court did not discuss the format for production, the issues parallel those of email production: what matters is not just what was said or done (the email or the photo itself), but also the metadata (when the information in question was originally posted).
Just as it has been an uphill battle to explain physical forensic information to judges and juries, it will not be easy to present evidence effectively at trial if it is not visually digestible to the fact finder. Counsel and the court will have to determine which reproduction methodology for these new technologies acceptably balances completeness of information, cost, and fairness.