Processing Electronically-Stored Information for E-Discovery – Podcast Series 1

Last week Edwards Law kicked off a series of blogs and podcasts on electronic discovery. In our first blog we covered the basics: the definition of electronically-stored information, or ESI, and the unique issues with culling and producing electronic discovery. We also touched on the importance of having a records management program in place to properly manage ESI. Today, on our podcast, we will dive into the details of processing electronically-stored information. The processing phase of e-discovery is highly technical and often considered the exclusive domain of litigation technologists, but it shouldn’t be.

E-discovery processing decisions can have a huge impact on your attorney’s review, production and presentation options later in your case. So, you and your attorney must understand the key aspects of processing this information and adopt a thoughtful approach to the processing workflow. Let’s go through the basics.

Processing refers to the methods counsel and litigation technologists use to prepare collected or produced electronically-stored information, or ESI, for attorney review and analysis. Any native ESI can be processed, regardless of whether your attorney receives the ESI from you, from an opposing party in response to requests for production, or from a third party in response to a subpoena.

Your attorney must make informed decisions at the processing stage of discovery because these decisions have important downstream effects. In particular, counsel should understand the technical basics of common processing methods; whether and to what extent ESI should be processed; how to best work with opposing counsel and other parties on processing issues; and the logistical elements of a processing workflow. Today, we will cover the first two points on that list, and we will discuss the other two points in our next podcast.

Understanding ESI processing functions. Processing native ESI generally involves extracting metadata from the ESI and generating or extracting searchable text from the ESI. These core processing elements enable your attorney to eliminate duplicates and irrelevant ESI from the universe of ESI that counsel reviews in a document review platform. Also, your attorney can use processing to narrow and efficiently navigate the ESI that remains in the universe of documents.
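As a rough illustration of how duplicates can be eliminated, here is a minimal Python sketch. It assumes that "duplicate" means a byte-for-byte identical file and that files are compared by a SHA-256 hash; both are assumptions for the example, and real processing tools may define and detect duplicates differently. The file names and contents are made up.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return a SHA-256 digest of a file's bytes (the hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def deduplicate(documents: dict[str, bytes]) -> dict[str, bytes]:
    """Keep only the first document seen for each distinct content hash."""
    seen: set[str] = set()
    unique: dict[str, bytes] = {}
    for name, data in documents.items():
        digest = content_hash(data)
        if digest not in seen:
            seen.add(digest)
            unique[name] = data
    return unique


# Hypothetical collected files: the second is a byte-identical copy of the first.
collected = {
    "memo_v1.docx": b"Quarterly results memo",
    "memo_copy.docx": b"Quarterly results memo",
    "budget.xlsx": b"Budget figures",
}
unique_docs = deduplicate(collected)
```

In this sketch only one copy of the memo stays in the review set, so the reviewer sees each distinct document once.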

Extracting Metadata.  Metadata refers to information about a document that is not visible on its face.  That is, you can’t see it from a hard copy of the document. All ESI has metadata, but the types of metadata available differ by ESI type.  For example, an email has “Email To”, “Email From”, and “Date Sent” metadata, and the email system automatically populates those metadata fields when the sender creates and sends an email.  In contrast, Microsoft Excel, for example, populates different metadata fields including “Date Created” and “Date Last Modified”.  Most, if not all, ESI processing tools extract the metadata from native ESI.  When processing the native ESI, your attorney can use the processing tool to extract the metadata that your attorney finds relevant to the case or helpful to the review process.  If your attorney intends to produce select metadata as part of his or her document productions, he or she must make sure they extract and preserve those metadata fields during processing.  Once the processing tool extracts metadata, your attorney can use this metadata to filter the ESI.  Filtering with processing tools permits counsel to narrow down the scope of ESI that they ultimately upload to a document review platform for review.
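To give a feel for what extracting email metadata looks like, here is a small Python sketch using the standard library's email parser. The sample message and the field labels in the output are illustrative assumptions; a commercial processing tool will handle many more formats and fields.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# A made-up email in plain RFC 5322 form, used only for illustration.
RAW_MESSAGE = """From: alice@example.com
To: bob@example.com
Date: Mon, 03 Mar 2014 09:15:00 -0500
Subject: Contract draft

Please see the attached draft.
"""


def extract_email_metadata(raw_message: str) -> dict:
    """Pull the sender, recipient and sent date out of the message headers."""
    msg = message_from_string(raw_message)
    return {
        "Email From": msg["From"],
        "Email To": msg["To"],
        "Date Sent": parsedate_to_datetime(msg["Date"]),
    }


metadata = extract_email_metadata(RAW_MESSAGE)
```

Once the fields are pulled out like this, they can be used to filter the ESI, for example by keeping only messages sent within a given date range.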

Your attorney can also export the metadata into a load file.  The processing tool can organize the extracted metadata into a load file.  Your attorney can then export the load file from the processing tool and upload it to a document review platform.  Once your attorney uploads both the ESI and the load file to the document review platform, they can sort or search the ESI by metadata within that platform.
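A load file is essentially a table with one row per document and one column per metadata field. The Python sketch below writes and reads a simple comma-separated version. The column names and the comma-separated layout are assumptions for illustration only; actual load file formats depend on the processing tool and the review platform.

```python
import csv
import io

# Hypothetical field list; real load files use whatever fields were extracted.
FIELDS = ["DocID", "Email From", "Email To", "Date Sent"]


def build_load_file(records: list[dict]) -> str:
    """Write one row per document, with a header row naming each metadata field."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def read_load_file(text: str) -> list[dict]:
    """Read the load file back into one dictionary per document."""
    return list(csv.DictReader(io.StringIO(text)))


records = [
    {"DocID": "DOC002", "Email From": "bob@example.com",
     "Email To": "alice@example.com", "Date Sent": "2014-03-04"},
    {"DocID": "DOC001", "Email From": "alice@example.com",
     "Email To": "bob@example.com", "Date Sent": "2014-03-03"},
]
load_file_text = build_load_file(records)

# Sorting by a metadata field, as a review platform might after upload.
by_date = sorted(read_load_file(load_file_text), key=lambda row: row["Date Sent"])
```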

Building a searchable index. Most processing tools extract or generate text from ESI and use the collected text to build a comprehensive index that enables your attorney to search multiple documents simultaneously. Specifically, processing tools build a searchable index by extracting text and by using optical character recognition, or OCR. Some file types, such as Microsoft Word and Microsoft Excel, contain text that processing tools can extract and save. The processing tool saves the text from any given file as a document-specific, document-level text file.

Some file types, such as TIFF images and unreadable PDFs, do not contain extractable text. To search the text of unreadable files, your attorney must use the ESI processing tool or a separate OCR tool to read the face of a document to determine its content and save that content as text in a document-level text file.

Some processing tools permit your attorney to search the ESI for key terms or phrases while the ESI is housed in the processing tool. In these instances, your attorney can use the extracted and OCR’d text to filter the ESI. In addition to using the extracted and OCR’d text to filter within the processing tool, your attorney can export the document-level text files from the processing tool and upload them to the document review platform. This makes sure the document review platform can access and search the collective content of all ESI in the review set when your attorney searches within the platform.
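For listeners who want to see the idea behind a searchable index, here is a minimal Python sketch of one common approach, an inverted index that maps each word to the documents containing it. The document IDs and text are invented, and the sketch assumes the document-level text has already been extracted or OCR'd; it is not how any particular processing tool is built.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(text_files: dict[str, str]) -> dict[str, set[str]]:
    """Map each term to the set of document IDs whose text contains it."""
    index: dict[str, set[str]] = defaultdict(set)
    for doc_id, text in text_files.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical document-level text files, keyed by document ID.
text_files = {
    "DOC001": "Draft merger agreement for review",
    "DOC002": "Lunch agreement: pizza on Friday",
    "DOC003": "Merger timeline and open issues",
}
index = build_index(text_files)
hits = search(index, "merger agreement")
```

Because the index is built once across all documents, a single search can check every document in the set at the same time.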

Now, before you spend money on processing, you should first decide with your attorney whether it makes sense to process the ESI at all.

For example, you don’t need to process any ESI for which your company has corresponding document-level text files and load files, because that ESI already has been processed and the extracted text and metadata have been provided. In this situation, your attorney can simply upload the ESI, text files and load files to a document review platform and begin their review based on the searchable text and metadata already in their possession. However, if this is not an option and there is ESI that is not accompanied by text and load files, your attorney should understand the nature of the ESI that your company does house and whether processing tools are effective for that particular type of ESI. The extent to which your attorney can leverage the benefits of processing in light of the document review method is also critical.

Native versus non-native ESI. This is an important consideration in determining whether or not to process ESI. Generally, you should process only native ESI, that is, ESI that still exists in its original file format.

Non-native ESI does not exist in its original native form, but instead has been converted to another file format, such as when a Microsoft Word document is saved as a PDF. When counsel or document custodians convert native files to non-native formats, the metadata from the native files often is lost, so counsel should not process a non-native file to extract or leverage metadata because no native metadata can be extracted from that file. In other words, don’t have your attorney convert Word documents into PDF during the process.

If you process the native Word file, for example, the metadata extracted by the processing tool would indicate that the custodian created that file on the correct date. Based on this metadata and the relevant date range, counsel would properly include the document in their review set. However, if a custodian converts the Microsoft Word file into a PDF, the metadata extracted by the processing tool would indicate that the custodian created the file on the conversion date, rather than the date the custodian created the document. Based on this metadata and the relevant date range, your attorney could improperly exclude that document from a review set. So whether you have native documents that warrant processing is an important consideration in determining whether or not to process ESI.
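To make the Word-to-PDF example concrete, here is a short Python sketch. The dates and the relevant date range are invented for illustration and do not come from any real matter.

```python
from datetime import date


def in_date_range(created: date, start: date, end: date) -> bool:
    """Return True when the creation date falls inside the relevant range."""
    return start <= created <= end


# Hypothetical relevant date range for the case.
RANGE_START = date(2013, 1, 1)
RANGE_END = date(2013, 12, 31)

# "Date Created" as extracted from the native Word file.
native_created = date(2013, 6, 10)
# "Date Created" as extracted from the PDF, which reflects the conversion date.
pdf_created = date(2015, 2, 2)

native_included = in_date_range(native_created, RANGE_START, RANGE_END)
pdf_included = in_date_range(pdf_created, RANGE_START, RANGE_END)
```

With these made-up dates, the native file passes the date filter and the converted PDF does not, even though both contain the same document.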

Another important consideration on whether or not to use ESI processing is metadata preservation.  Do the documents you have contain metadata?  Not all native files contain original metadata.  For example, a Microsoft Excel file can retain its native format, that is, an Excel file, but contain altered metadata.  If your attorney and custodians fail to take the proper precautions for preserving and collecting the file, metadata can be altered without changing the file format when counsel or your custodians of the documents modify and re-save a file, or open and save a file, even if the user has made no changes.

Your attorney and your custodians of the documents also might create copies of files, either directly or by attaching a file to an email, and expect the copy to contain the same data as the original file. Copies of files, however, generally do not contain all of the original file’s metadata. A processing tool cannot extract metadata that no longer exists. If your attorney or the custodians alter a file’s metadata in a manner that matters to the case, for example, by re-saving a file in a manner that changes its last modified date, your attorney may lose the ability to use the metadata to filter, search or otherwise organize the production. This is an important consideration for you and your attorney when determining whether or not to use ESI processing tools.

Review method. This is also an important consideration in determining whether or not to use an ESI processing tool. The value of ESI processing depends on you and your attorney’s plans for reviewing the ESI. Counsel should process ESI if they intend to structure or prioritize their document review based on the ESI’s text or metadata. Unless counsel processes the ESI, or the ESI is accompanied by a metadata load file, counsel cannot extract, access or search data or metadata in a document review platform. If your counsel intends to review each collected document, which is called a linear review, there are fewer benefits to using an ESI processing tool. Counsel conducting a linear review might review documents within a document review platform or just open each file in its native application for review.

This is an effective way to review documents, although time-consuming, and it’s an option that your attorney might consider if they either performed a focused collection and reasonably expect that all of the collected ESI is relevant, or they lack access to document review tools and have no means to sort, cluster, or search the ESI by metadata. When your attorney decides to review every document, there is no need to rely on metadata searches or clustering to identify a subset of the ESI. Instead, a linear review will offer you and your attorney the assurance that they will find and review all relevant documents, regardless of the documents’ metadata or the manner in which they are grouped.

In some circumstances, counsel conducting a linear review can still leverage processing tools to narrow the set of ESI included in the review. For example, before beginning the linear review, counsel could filter the ESI by date range or de-NIST the ESI, which involves identifying and removing software application and system files from the collected ESI to isolate a more focused review set.
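De-NISTing is commonly done by comparing each file's hash against a list of known software files; the name comes from a reference list published by the National Institute of Standards and Technology. The Python sketch below shows the general idea under simplifying assumptions: the "known" hash list is a tiny stand-in built inside the example, the file names and contents are made up, and SHA-256 is an assumed hash choice.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of a file's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def de_nist(collected: dict[str, bytes], known_hashes: set[str]) -> dict[str, bytes]:
    """Drop every file whose hash appears in the known software-file list."""
    return {
        name: data
        for name, data in collected.items()
        if sha256_hex(data) not in known_hashes
    }


# Stand-in for a known-file hash list; a real list would be obtained separately.
system_file_bytes = b"pretend operating system library"
known_hashes = {sha256_hex(system_file_bytes)}

# Hypothetical collection mixing a system file with user-created documents.
collected = {
    "system32/example.dll": system_file_bytes,
    "contracts/agreement.docx": b"Signed agreement text",
    "notes/meeting.txt": b"Meeting notes",
}
review_set = de_nist(collected, known_hashes)
```

Here the system file drops out and only the two user-created documents remain for review.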

Next week we will post a podcast on working with opposing counsel on processing ESI and organizing the processing logistics, and we will touch some more on the importance of having a records management program in place. Thank you for listening.
