Distinguish between Native and Processed Data

Everlaw supports uploading both processed and native datasets.

The first step of any upload is to select the upload type.

This article helps you determine the type of upload you should select for your data. Data uploaded correctly is optimized for search and review, whereas data that is not uploaded according to its correct type often needs to be deleted from Everlaw and then re-uploaded properly. 

For additional information on uploading, read articles on Native Uploads and Processed Uploads. For additional help in identifying what type of data you have, please contact our Support team at support@everlaw.com.

Identify native data

Native data are files in their original, or raw, state that have not been processed or produced previously. Typically, these files are collected directly from a custodian's device, like a computer, cloud service, or cell phone. 

Here are some characteristics of native data:

  • Post-collection, native data sets might comprise many different file types that require different applications to open
  • Folders are typically not highly structured, and files keep the names they had on the custodian's device rather than following a consistent naming convention (such as being named after their Bates numbers)
  • Native data is not Bates-stamped and doesn't come with a load file
  • The data can range from individual files to container files that hold several different files (e.g., .ZIP, .PST)

Check our Supported Native Data Type article to see if your native file types are supported.

When you upload native data, Everlaw processes it so that you can search through and review the documents. During processing, the native uploader:

  • Assigns each file a unique ID number, called a control number (e.g., #1.1)
  • Extracts metadata from the file so you can search on it
  • Creates PDF images for all files in a native upload, and placeholder images for file types that don’t image well (like spreadsheets). 
  • Extracts text so you can search the contents of the file. The text is stored as its own view of the document on Everlaw.
  • Stores a copy of the original file, called the native file. The native file is accessible as its own view of the document on Everlaw.

Identify processed data

Processed data are files that have already been loaded into, and then produced out of, a processing tool, such as Everlaw.

In contrast to native data, processed data typically consists of files named after their Begin Bates number. Processed data usually has a well-defined folder structure: there are separate folders for the images, text, and any natives included in the data set. 

The files are also accompanied by a load file, which contains identifying information and metadata associated with the documents. Each page of the image files in a processed dataset may be stamped with its unique Bates number; this mark is typically referred to as a Bates stamp.

Because processed data has already been through a processing tool and the native processing steps, it does not go through additional native processing upon upload. 

Instead, the processed uploader uses the load file to link the associated metadata with the image, text, and native (if present) files for each document. On Everlaw you can search for and access these files as one document, represented by its unique Begin Bates number.
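As a rough illustration of that linking step, the following Python sketch reads a small load file and builds one record per Begin Bates number. It is not Everlaw's implementation. The column names (BegBates, ImagePath, TextPath, NativePath), the file paths, and the Concordance-style delimiters (þ around each value, the control character 0x14 between fields) are assumptions chosen for the example; your load file may use different fields and delimiters.

```python
import csv
import io

# Hypothetical load file contents. Field names, paths, and delimiters are
# assumptions for illustration, not a format specified by this article.
SAMPLE_LOAD_FILE = (
    "þBegBatesþ\x14þImagePathþ\x14þTextPathþ\x14þNativePathþ\n"
    "þABC001þ\x14þIMAGES/ABC001.pdfþ\x14þTEXT/ABC001.txtþ\x14þNATIVES/ABC001.xlsxþ\n"
    "þABC002þ\x14þIMAGES/ABC002.pdfþ\x14þTEXT/ABC002.txtþ\x14þþ\n"
)


def link_documents(load_file_text):
    """Return one record per Begin Bates number, linking its files."""
    reader = csv.DictReader(
        io.StringIO(load_file_text),
        delimiter="\x14",  # assumed field separator
        quotechar="þ",     # assumed text qualifier
    )
    documents = {}
    for row in reader:
        documents[row["BegBates"]] = {
            "image": row["ImagePath"] or None,
            "text": row["TextPath"] or None,
            # A native file is optional in processed data.
            "native": row["NativePath"] or None,
        }
    return documents


if __name__ == "__main__":
    for begin_bates, views in link_documents(SAMPLE_LOAD_FILE).items():
        print(begin_bates, views)
```

In this sample, ABC002 has no native path, so its record carries only an image and a text view.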

Summary of native and processed uploads 

The native uploader ingests unreviewed, varied data. Each file is processed as an individual document and assigned its own control number so that its unique contents and metadata can be searched for and reviewed. You can learn more in the section on identifying native data.

The processed uploader uses the information from the load file to make sure that image, text, and native (if present) files are organized and accessed on Everlaw as a single document represented by its Begin Bates number. You can learn more in the section on identifying processed data.
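If you want a quick first pass over a folder listing before choosing an upload type, the characteristics described in this article can be turned into a simple heuristic. The Python sketch below is only that: the Bates-style filename pattern, the 80% threshold, and the assumption that a load file ends in .dat are all illustrative choices, and a .dat file is not necessarily a load file. Treat the result as a hint, not a determination.

```python
import re
from pathlib import PurePosixPath

# Assumed pattern for a Bates-style name: a letter prefix followed by at
# least three digits (for example, ABC001). Real conventions vary.
BATES_NAME = re.compile(r"^[A-Za-z]+[-_ ]?\d{3,}$")

# Assumed load file extension, based on the "Load file.DAT" example.
LOAD_FILE_EXTENSIONS = {".dat"}


def guess_upload_type(paths):
    """Guess whether a list of relative file paths looks native or processed."""
    files = [PurePosixPath(p) for p in paths]
    has_load_file = any(f.suffix.lower() in LOAD_FILE_EXTENSIONS for f in files)
    others = [f for f in files if f.suffix.lower() not in LOAD_FILE_EXTENSIONS]
    bates_named = sum(1 for f in others if BATES_NAME.match(f.stem))
    # The 0.8 threshold is arbitrary and chosen only for this sketch.
    mostly_bates = bool(others) and bates_named / len(others) >= 0.8

    if has_load_file and mostly_bates:
        return "likely processed"
    if not has_load_file and not mostly_bates:
        return "likely native"
    return "ambiguous"


if __name__ == "__main__":
    print(guess_upload_type(
        ["Load file.DAT", "IMAGES/ABC001.pdf", "TEXT/ABC001.txt", "NATIVES/ABC001.xlsx"]
    ))
    print(guess_upload_type(
        ["Inbox/Q3 budget.xlsx", "Desktop/notes.docx", "mail/archive.pst"]
    ))
```

Anything the sketch reports as "ambiguous", such as Bates-named PDFs with no load file, still needs a closer look.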

Improperly uploading processed data as native data

If processed data is mistakenly uploaded as native data, the files representing one document end up completely separate. This means that each file is processed to generate a new image and text file and assigned a unique control number, rather than being recognized as a unified document on Everlaw.

We can illustrate this using an example data set. In our example, the document with Begin Bates ABC001 is represented by three separate files (1 image, 1 native, 1 text). The metadata and the information that links these files together are in the load file (Load file.DAT).

When these files are properly loaded as processed data, the separate image (PDF), text, and native files are accessed as a single document. You can toggle between these format views in the document review window.

When these files are improperly loaded as native data, they are processed as separate documents, assigned separate control numbers, and accessed individually from a results table.
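The difference between the two outcomes can be sketched in a few lines of Python. This is a simplified model, not how Everlaw works internally: grouping by file name stands in for the linking that the load file provides, the file paths and extensions are invented for the ABC001 example, and the sequential control numbering (#1.1, #1.2, ...) is an assumption made for illustration.

```python
from pathlib import PurePosixPath

# Hypothetical files for the example document with Begin Bates ABC001.
FILES = ["IMAGES/ABC001.pdf", "TEXT/ABC001.txt", "NATIVES/ABC001.xlsx"]


def as_processed(files):
    """Group files into documents keyed by Begin Bates.

    Simplification: the file stem stands in for the load file's linking.
    """
    documents = {}
    for path in files:
        documents.setdefault(PurePosixPath(path).stem, []).append(path)
    return documents


def as_native(files, upload_number=1):
    """Treat every file as its own document with its own control number.

    The numbering scheme here is assumed for illustration only.
    """
    return {
        f"#{upload_number}.{index}": path
        for index, path in enumerate(files, start=1)
    }


if __name__ == "__main__":
    print("Processed upload:", as_processed(FILES))
    print("Native upload:", as_native(FILES))
```

Loaded as processed data, the three files collapse into one ABC001 document; loaded as native data, the same three files become three unrelated documents.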

Some of the consequences of mistakenly uploading processed data with a load file as native include:

  • The metadata in your load file is not connected to the documents, meaning you can't search based on key aspects of your data
  • Rather than being identified by their Begin Bates, your documents are assigned arbitrary new control numbers that the producing party didn't intend
  • Since each file is processed separately, the total size of your upload is larger, which has billing implications
  • Documents intended as standalone documents may have attachments extracted from them as additional documents. Conversely, documents intended as attachments will not be correctly attached to their parent as per the load file.

Improperly uploading native data as processed data

You usually can't load a native dataset as a processed upload with a load file at all. This is because the uploader prompts for a specifically structured load file, which native datasets do not have.

Everlaw supports uploading processed PDFs without a load file. Be sure that if you use this tool to upload your data, the PDFs are processed PDFs and not original, native PDFs. If you improperly upload native PDFs as processed PDFs, some of the consequences may include:

  • Files are assigned Bates numbers with the prefix EVER rather than control numbers
  • Metadata extraction is limited when native PDFs are uploaded as processed PDFs
  • No native file is stored: there is no native view when reviewing the document and you cannot export the native PDF from Everlaw

Dealing with ambiguous data sets

Not all data sets are clearly distinguishable as native or processed data. You may find that you have a mix of both types, or that processed files (such as PDFs) are not named after their Bates numbers. If you are unsure how to upload your data, please contact our Support team at support@everlaw.com for help identifying it.
