To read more about uploading documents on Everlaw, please refer to the articles in our Uploads section.
Everlaw has a cloud-processing system that automatically processes and ingests native files into the platform. Documents processed on Everlaw will go through deNISTing, deduping, OCR, AV transcription, and language detection, as appropriate. Additionally, Everlaw will generate text, metadata, and PDFs (if requested) for your native data.
A native upload can contain one file, or multiple files. For more information about preparing your files for upload, please see this article.
Creating a native data upload
- Click the Data Transfer icon in the navigation bar
- Select “Uploads” from the dropdown menu
- Select “+ New Upload” on the left-hand sidebar
- Select "Native" as your type of upload
- Click "Start upload"
Here, you can add the data you want to upload. Drag and drop files into the box, or click "Browse" to select files from anywhere on your computer. Please see this article for information on how to prepare your files.
Uploading via cloud-based apps
At the bottom of the screen, you will see options to upload via popular cloud-based storage apps. You can upload directly from the following apps:
- Google Drive
- Google Vault
- You can also upload exported Google Vault files
You can also collect from the following cloud applications:
Once you select an app, you will be asked to log in via a separate dialog box. If you do not see this dialog box, check your pop-up settings. After logging in, you will be able to select your files directly from the app's platform. Once you hit "enter" or "submit" within that app, you will be taken to the next step in the upload process.
Uploading via direct link
Direct links are URLs that point straight to a file without any password protection. In practice, if you paste a URL into your browser and the browser starts downloading a file instead of loading a webpage, you have a direct link. You can convert a Google Drive link into a direct link by amending it:
- Google Drive link: https://drive.google.com/file/d/FILE_ID (FILE_ID is the hash automatically inserted to link you to the document)
- Amended direct link version: https://drive.google.com/uc?export=download&id=FILE_ID
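The amendment above can be scripted if you have many links to convert. The following is a minimal sketch, not an Everlaw tool: the function name `to_direct_link` is hypothetical, and the handling of a trailing `/view?usp=sharing` segment is an assumption about how Google Drive share links are commonly formatted.

```python
import re

# Captures FILE_ID from links such as
#   https://drive.google.com/file/d/FILE_ID
#   https://drive.google.com/file/d/FILE_ID/view?usp=sharing  (assumed variant)
_DRIVE_FILE_RE = re.compile(r"^https://drive\.google\.com/file/d/([^/?#]+)")


def to_direct_link(share_url: str) -> str:
    """Rewrite a Google Drive file link into the direct-download form."""
    match = _DRIVE_FILE_RE.match(share_url.strip())
    if match is None:
        raise ValueError(f"Not a recognised Google Drive file link: {share_url!r}")
    file_id = match.group(1)
    return f"https://drive.google.com/uc?export=download&id={file_id}"
```

For example, `to_direct_link("https://drive.google.com/file/d/FILE_ID")` returns `https://drive.google.com/uc?export=download&id=FILE_ID`.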
Once you’ve made your selection, a wizard will appear where you can specify settings for the upload:
Step 1: Dataset details
In this step, you can specify the configuration for your upload. By default, your upload settings will be inherited from your last upload.
Name: You are required to give the dataset a unique name. The name (and date of upload) will appear as the name of the associated uploads card on the homepage, which you can rename later if you have upload permissions.
Deduplication: Upon upload, you have the following options.
- Global: Deduplicate against all of the existing documents in your database.
- By custodian: Deduplicate against all of the existing documents by custodian.
- This means that if two duplicates have different custodians, they will both be uploaded. Conversely, if a document has the same custodian as another duplicate document that already exists on the platform, the duplicate file will not be uploaded.
- None: No deduplication.
Even if you choose to deduplicate globally, Everlaw will preserve a record of the deduplicated document in the All Custodians and All Paths fields that are populated for the existing document on the database. This means that if a document with custodian Sam is deduplicated against a document with custodian Jenny, the existing document on the database will now list both Sam and Jenny in its All Custodians metadata.
To learn more about the definition of duplicates and how upload deduplication handles documents (and families), visit this help article on duplicates. Please note that Google files (e.g., Google Documents, Google Sheets) will not get deduplicated in the same way as other file types, because they undergo their own conversion process within Google.
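To make the difference between the three deduplication options concrete, here is a simplified sketch. It is an illustration only: it assumes that two files are duplicates when their SHA-256 content hashes match, which is a stand-in for Everlaw's actual duplicate definition, and the names `StoredDoc` and `upload` are hypothetical. It also ignores families and All Paths.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredDoc:
    digest: str
    custodian: str
    all_custodians: list[str] = field(default_factory=list)


def upload(database: list[StoredDoc], content: bytes, custodian: str, mode: str) -> bool:
    """Return True if the file is added, False if it is deduplicated.

    mode is "global", "custodian", or "none".
    """
    if mode not in {"global", "custodian", "none"}:
        raise ValueError(f"Unknown deduplication mode: {mode!r}")
    digest = hashlib.sha256(content).hexdigest()  # assumed duplicate test
    if mode != "none":
        for doc in database:
            if doc.digest != digest:
                continue
            # Global: any match counts. By custodian: only a match that
            # shares the incoming file's custodian counts.
            if mode == "global" or doc.custodian == custodian:
                # Record the dropped copy's custodian on the existing document.
                if custodian not in doc.all_custodians:
                    doc.all_custodians.append(custodian)
                return False
    database.append(StoredDoc(digest, custodian, [custodian]))
    return True
```

With global deduplication, uploading the same file first for Jenny and then for Sam keeps one document whose `all_custodians` lists both names. With deduplication by custodian, both copies are kept.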
From Legal Holds: Users with both upload and legal hold permissions will see a “From Legal Holds” section on the Details page when running a native upload.
By default, documents will not be connected to a legal hold. If the documents in the native upload should be associated with a legal hold in the database, click Yes here, then select that legal hold from the dropdown. Once you’ve selected the legal hold, continue to the Custodians step of the native upload wizard. On the Custodians step of the native upload wizard, the custodian dropdown will automatically include all custodians on that legal hold under the header “From legal hold”. To learn more about legal holds, visit the legal hold documentation center.
Advanced Settings (create PDFs, timezone, OCR language, and email image attachments) can be viewed and edited by clicking the caret icon next to Advanced Settings, which are collapsed by default.
Create PDFs: By default, Everlaw creates PDF images for all files in an upload, and placeholder images for file types that don’t image well (like spreadsheets). If desired, you can choose to image the file types that don’t image well, or choose to not image any of the files in an upload.
In addition to Excel files, the following file types will also not be imaged by default:
- LibreOffice Calc
- Empty files
- Container files
- .txt files that are greater than 1 MB
Default Timezone: The selected timezone is assumed for any datetime metadata lacking an explicit timezone. In addition, email headers printed at the top of PDF images generated during processing show datetimes in the selected timezone. Choosing UTC will leave PDF email header metadata timezones unchanged.
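The first rule above (a default is assumed only when a datetime carries no timezone of its own) can be sketched as follows. This is an illustration, not Everlaw's implementation: the function name `assume_timezone` is hypothetical, and a fixed UTC offset is used in place of a named timezone to keep the example self-contained, so daylight saving time is not modelled.

```python
from datetime import datetime, timedelta, timezone


def assume_timezone(value: datetime, default_tz: timezone) -> datetime:
    """Attach default_tz to a datetime that has no timezone of its own."""
    if value.tzinfo is None:
        # No explicit timezone in the metadata: assume the selected default.
        return value.replace(tzinfo=default_tz)
    # An explicit timezone is always kept as-is.
    return value


# Hypothetical default: a fixed UTC-08:00 offset.
selected = timezone(timedelta(hours=-8))
sent = assume_timezone(datetime(2022, 6, 3, 9, 0), selected)
```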
OCR Language: This step allows you to specify particular languages for Optical Character Recognition (OCR). OCR will be automatically run on TIFFs and PDF pages with little or no extractable text. By default, OCR language detection is set to Autodetect. Autodetect can extract all Latin-alphabet languages (such as French and German) as well as Chinese, Japanese, and Korean (CJK). If your document has a combination of these languages, Autodetect will also be able to OCR them automatically as long as there is only one language per page. Autodetect will not reliably OCR multiple languages within one page.
You can also select a single language to target for OCR. In this mode, OCR will only detect that language and English. There are two scenarios where you would want to select something other than Autodetect:
- If all of your documents are in a language that is neither Latin-alphabet nor CJK (e.g., Russian, Greek)
- In this case, you must select that language from the dropdown menu in order for OCR to work for that document.
- If the quality of your scanned document(s) is low, and you know there is only one language in the document (in addition to English)
- This will improve the quality of OCR, but only for that language and English. It will prevent the detection of any other languages in that document, so you should be sure that the document has only one non-English language before selecting that option.
If your upload includes multiple documents, each with different foreign languages, then you’ll want to select Autodetect. However, this means that non-Latin, non-CJK language documents will not get properly OCRed. For example, if one document is entirely in Arabic (a non-Latin, non-CJK language), and another is in French (a Latin-alphabet language), then only the French document will be properly OCRed. In this situation, you can separate those documents into different uploads so that you can select the appropriate OCR language setting for each, or, after processing, select specific subsets of documents from the results page for reprocessing with a different OCR language.
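If you already know the main language of each document, the splitting strategy described above can be planned ahead of time. The sketch below is illustrative only: `AUTODETECT_LANGUAGES` is a small hand-picked subset of Latin-alphabet and CJK languages, not Everlaw's actual list, and `plan_ocr_uploads` is a hypothetical helper name.

```python
# Illustrative subset of languages that Autodetect is described as covering
# (Latin-alphabet languages plus Chinese, Japanese, and Korean).
AUTODETECT_LANGUAGES = {
    "English", "French", "German", "Spanish",
    "Chinese", "Japanese", "Korean",
}


def plan_ocr_uploads(doc_languages: dict[str, str]) -> dict[str, list[str]]:
    """Group documents by the OCR language setting to choose for their upload."""
    plan: dict[str, list[str]] = {}
    for doc, language in doc_languages.items():
        # Languages outside the Autodetect set need their own upload with
        # that language selected explicitly.
        setting = "Autodetect" if language in AUTODETECT_LANGUAGES else language
        plan.setdefault(setting, []).append(doc)
    return plan
```

Each key in the returned plan corresponds to one upload (or one reprocessing batch) with that OCR language setting.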
You can also use the OCR language field to specify transcription of Spanish files with extractable audio. Select Spanish from the OCR language dropdown. You cannot transcribe Spanish files and OCR other documents with non-English languages in one upload. You can always reprocess Spanish media files and transcribe them in Spanish once they’re uploaded.
Page Size: Everlaw will generate PDFs in the selected size for documents that do not have a described size (e.g. emails). Documents with an explicit size (e.g. PDFs, Word documents, and images) will remain in their original sizes. Documents exported, printed or produced from Everlaw will respect the size of the pages on the platform.
Email image attachments: There are three options for deciding whether image attachments should be displayed inline, or treated as separate attachments. If you would like every image in the email to be displayed inline within the PDF, you can choose “Inline all images found in emails.” If you would like email image attachments to be extracted as children of the parent email, select “Extract all images found in emails.” Finally, if you choose smart determination, then Everlaw will dynamically determine which images are likely to be attachments, and which ones are (or are intended to be) inlined images (e.g., signature icons). Factors influencing this smart determination include an image's dimensions, overall size, and content ID.
Decryption Keys: Everlaw can store private keys used to decrypt S/MIME encrypted emails. To learn more, read this article. You can follow this link to manage your decryption keys on a new tab.
Passwords: Inaccessible files will not be processed. If any of your files/folders are password-protected, input the password(s) into the password box (one password per line) to enable Everlaw to image and extract text based on your processing options. The native view will not be available for password-protected documents on Everlaw.
Step 2: Select custodians
The custodians step allows you to specify what custodian value to associate with the documents you’re uploading. You can specify a default custodian for all documents in an upload and/or set custom custodian values for particular files or folders. If your data belongs to multiple custodians, please read this article to learn how to prepare your data accordingly before uploading.
If you selected a legal hold in the previous step, you will see custodians from that legal hold in the custodian dropdown under the header “Legal hold”. Select the appropriate custodian from the legal hold to link the documents in the upload to the custodian from the hold. Users can select different custodians for each subfolder.
To set a default custodian, input the custodian name into the “default custodian” box at the top of the table. If your project already has custodians from previous uploads, you can also select one from the dropdown list.
To set custom custodian values for particular files or folders, find the file/folder on the table and input the custodian name into the custodian box on the right. If there is a default custodian, it’ll be overridden for that particular file/folder. Files that have a black caret symbol in the far left can be expanded to display the individual sub-folders/files they contain. Click on the caret icon to expand or collapse.
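The interaction between the default custodian and custom values can be pictured with a short sketch. It assumes that a custom value set on a folder applies to everything inside it and that the most specific (nearest) custom value wins. That precedence is an assumption made for illustration, and `resolve_custodian` is a hypothetical name.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_custodian(
    path: str,
    default: Optional[str],
    overrides: dict[str, str],
) -> Optional[str]:
    """Return the custodian for a file: nearest custom value, else the default."""
    current = PurePosixPath(path)
    # Check the file itself first, then each enclosing folder, nearest first.
    for candidate in (current, *current.parents):
        custom = overrides.get(str(candidate))
        if custom is not None:
            return custom
    return default


overrides = {
    "export/sam": "Sam",
    "export/sam/shared/jenny.docx": "Jenny",
}
```

Here `export/sam/inbox.pst` resolves to Sam, `export/sam/shared/jenny.docx` resolves to Jenny, and any file outside `export/sam` falls back to the default custodian.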
Step 3: Uploading into partial projects
Aside from uploading the documents into the current project you’re on, you can also add the documents to any partial project you have the Partial Project Document Management permission on. No matter what, documents you upload will automatically be added to all complete projects in the database (i.e., projects that contain all documents in the database). To select or deselect a project, click on the checkbox. You cannot deselect complete projects.
Once you click ‘Upload’, your data will begin transferring to our servers. The upload details overlay will appear on the transferring tab showing you the status of the transfer. From the overlay, you can add additional documents to the upload by clicking on the “+ Add files” button.
A status card will be added to the native data page corresponding to your upload. A time estimate will appear on this status card to indicate approximately how long it will be until processing is complete. As your upload progresses, you can start reviewing completed docs. You do not need to stay on this page for the upload to continue processing. Once all your files are successfully processed, you will see a document icon with a green checkmark in the status card. Clicking the icon will take you to a results table of your processed documents.
Native uploads will each be assigned a control number, indicated by a # prefix.
To learn how to view upload status, delete, rename, and take other actions, please see the “Managing native uploads” section.
Managing native uploads
Uploads will appear as cards in the “Native Data” section of the Uploads page.
You can take the following actions on an upload from its card:
1. Rename the upload and/or add a description: Click the upload name and enter a new name. You can add a text description by clicking "Add a description..." Both of these changes will affect the upload across projects in the database.
2. View uploaded documents: Click the document count (to the right of the document icon) to open the uploaded documents in the results table. You can also access your uploaded documents from the homepage under the Document Sets column.
3. View upload information and errors: Click "View Report" to see the files currently being uploaded or processed, information about deduplicated and deNISTed files, as well as other information related to the upload (e.g., upload errors and issues). See the "Upload Report" section below for more information.
More options (accessible via the three-dot menu in the top right corner of the upload card):
- View upload details: View the progress of your native files' transfer to Everlaw's servers, download or add source files, and access a results table of documents from each source file individually. See section "Upload details: Transferring tab" below for more information about adding additional files to your upload.
- View configuration: View the timezone applied to your documents and the projects the data was uploaded to, as well as the settings for creating PDFs, OCR language, inline images in emails, and page size.
- Delete: The documents in the upload, including all files generated during processing (e.g., image, text), will be removed from all projects in the database and the database itself. This option is only available to users with the Delete permission.
View the details of your upload
After starting an upload, you can open upload details to see more information about a particular upload card. The date an upload started can be seen under the Name and Description on the upload card. Hovering over the date will display the exact time it was initiated based on your device's timezone.
There are three tabs on the upload details dialog: Transferring, Processing, and Report. In these tabs, you can manage your source files, look at the files that are currently being processed, and view a report of the files uploaded. To view the upload details, click View Report at the bottom of the upload card.
Upload details: Transferring tab
For organizational purposes, it can be helpful to add additional files to an upload after the initial upload is complete. For example, you may want to keep all files from a single custodian in the same upload. If so, you can add additional files to the custodian's upload as they become available to you. To do so, open the upload details dialog by either clicking on “View report” or by clicking the three-dot menu in the top right corner of the upload card and choosing "View upload details."
From here, click on the Transferring tab in the top left corner. Next, click "Add files" in the top right corner. Then, choose where the new files should come from (e.g., local, cloud-based app).
You will then be asked to enter any passwords for the new files, and to associate the files with a custodian. The other configuration settings (e.g., deduplication, timezone) will be pulled from the initial upload, unless you are uploading from cloud applications. If you choose to upload additional files from a cloud application, you cannot dedupe or modify your custodian names like you can from the native uploads page.
Upload details: Processing tab
The processing tab shows the files that are currently being processed for the native upload card. It also shows the number of files that have completed processing, the number of files queued for processing, and the number of files currently being processed by Everlaw. The table of currently processing documents displays the document control number, processing status, size, and file name and path for each document.
This view updates every 5 seconds and is sorted in descending order of time since processing started (files that have been processing the longest are at the top). It also shows any additional processing steps running on the files that are currently processing. If more than 100 files are being processed at the same time, the table shows the details of the 100 longest-running files.
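The ordering and the 100-row cap described above amount to a sort followed by a slice. The sketch below is only a model of that behavior: `ProcessingFile` and `visible_rows` are hypothetical names, and start times are plain numbers of seconds.

```python
from dataclasses import dataclass


@dataclass
class ProcessingFile:
    control_number: str
    started_at: float  # seconds on an arbitrary clock


def visible_rows(
    files: list[ProcessingFile], now: float, limit: int = 100
) -> list[ProcessingFile]:
    """Return the longest-running files first, capped at `limit` rows."""
    longest_first = sorted(files, key=lambda f: now - f.started_at, reverse=True)
    return longest_first[:limit]
```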
The possible processing steps and file types that have these steps are as follows:
| Processing step | File types |
|---|---|
| OCR | PDFs, TIFF files, and other image files |
| Transcribing/Transcoding | Audio/video files, any file with audio |
Upload details: Upload report tab
Clicking on the upload report tab will open a visualization of your upload.
At the top of the report, you can see the number and size of documents that were deduplicated or deNISTed. To download a report of the documents that were deduplicated upon upload, click "Download deduped info." The downloaded CSV contains three columns:
- Original Path: Native path for the "original" document (for each set of duplicates, the single instance of the document that was uploaded to Everlaw)
- Original Bates: Begin Bates numbers for the original documents
- Duplicate Path: Native path for the deduplicated document associated with the original document
If there are multiple duplicates for one original path, there will be multiple rows with the same original path (and same original Bates) included in the CSV.
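Because each duplicate gets its own row, it can be handy to regroup the CSV by original document. The sketch below assumes the CSV header names match the column names listed above exactly (`Original Path`, `Original Bates`, `Duplicate Path`), and `duplicates_by_original` is a hypothetical helper name. The sample values in the usage note are invented.

```python
import csv
import io
from collections import defaultdict


def duplicates_by_original(csv_text: str) -> dict[tuple[str, str], list[str]]:
    """Map (original path, original Bates) to the list of duplicate paths."""
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["Original Path"], row["Original Bates"])
        groups[key].append(row["Duplicate Path"])
    return dict(groups)
```

For a CSV in which `inbox/memo.docx` appears as the original on two rows, the result has a single key for that document with both duplicate paths in its list.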
The pie chart on the right of the report shows a breakdown of the file types that have been processed and uploaded. You can also see the absolute number of documents for each file type using the list on the left and easily create a search for a particular document type present in the upload by clicking on the blue document counts. The “Documents with issues” and “Flags” sections show the number of documents that registered errors during processing, broken down by processing stage. After files are examined, converted to PDF, and OCR’d to create searchable text files, native files will be scanned for viruses or other malware. Any number in blue is clickable, and will bring you to a results table with the set of documents which yielded errors (or were flagged as malicious). For more information, please see this article on identifying and troubleshooting native upload errors.
You are able to download a detailed processing report for all files uploaded onto the platform. You can download the report for a single upload card by clicking on the “Download detailed report” button on the upload report tab of upload details. To download the report for multiple cards, filter native upload cards by name/date filters and click “Download detailed report” on the Native Data Uploads page to download the processing report for all currently visible upload cards that have completed processing. Uploads that are still in progress or are being reprocessed are not included in the report.
When you click the download button, a task is created (similar to exports) and the processing report is generated. You can monitor its progress in the Batches & Exports column on the homepage, and download the generated CSV from the homepage card or from the toast notification when the report is ready. The maximum size of the report is capped at 10,000,000 rows or 10 GB.
The CSV file includes an entry for every file in the upload, including those that are deduplicated or deNISTed on upload.
The CSV report contains the following fields for each file:
- Document id (Control #) - This is blank for Deduplicated or DeNISTed files
- Upload dataset name - This can be changed by renaming the upload card
- Upload dataset id - This is fixed regardless of changes to upload dataset name
- Upload date (UTC)
- File path
- Custodian - Blank if no custodian was assigned at upload
- Processing/Production Flags - the current contents of the “Processing/Production Flags” Special Column, if any. (This is any of the flags available in the Uploaded document search term’s “flags” parameter.)
- Processing error - true/false
- OCR - true/false
- Reprocess date - The most recent reprocessing date and time if any
- Reprocessed by - First and Last Name of user
- Deduplicated - true/false
- DeNISTed - true/false
- Uploaded by - First and Last Name of user
⚠️ Please note: Upload time and user tracking were added in the June 3, 2022 release. For documents uploaded before that date, the Reprocess date, Reprocessed by, Upload date, and Uploaded by fields are always blank, because Everlaw did not previously store this information. New sources added to these uploads will correctly show the uploading/reprocessing user and time in the report CSV.
Additionally, for legacy users without First/Last Names, their email or username will appear in the Uploaded by or Reprocessed by field instead.
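Once downloaded, the detailed report can be summarized with a few lines of scripting. The sketch below assumes the CSV header names match the field names listed above and that boolean fields contain the text `true` or `false`. Both are assumptions about the file layout, and `summarize_report` is a hypothetical helper name.

```python
import csv
import io


def _is_true(value: str) -> bool:
    """Interpret a true/false report cell, ignoring case and whitespace."""
    return value.strip().lower() == "true"


def summarize_report(csv_text: str) -> dict[str, int]:
    """Count files, deduplicated files, deNISTed files, and processing errors."""
    summary = {"files": 0, "deduplicated": 0, "denisted": 0, "errors": 0}
    for row in csv.DictReader(io.StringIO(csv_text)):
        summary["files"] += 1
        if _is_true(row["Deduplicated"]):
            summary["deduplicated"] += 1
        if _is_true(row["DeNISTed"]):
            summary["denisted"] += 1
        if _is_true(row["Processing error"]):
            summary["errors"] += 1
    return summary
```

Because the report has one row per file, including deduplicated and deNISTed files, `files` minus `deduplicated` minus `denisted` gives the number of files that received a control number.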
Native data processing settings
- The orientation of each document is preserved from its native version (e.g., a document that is in landscape orientation will remain that way upon upload)
- Embedded files: Everlaw will extract all embedded files, including audio and video files, in a Microsoft 365 file (e.g., an Excel file embedded in a PowerPoint) and any file embedded in a PDF.
- Emails containing URL links to other supported documents will be recognized. See the context panel article to learn more.
- The children of container files are extracted with no limitation on depth. For example, a Word document attached to an email that’s attached to another email that’s in a Zip file that’s in another Zip file will be extracted.
- Hidden columns in Excel are displayed.
- Notes are extracted and presented in the PDF/Image view for Word documents and the Native view for spreadsheets.
- Sometimes, you may try to upload an entire hard drive or a folder with personal files mixed in with system/software files. Some of these files have no user-specific content and can be removed upon processing. This process is called deNIST (removing NIST files). Any files that are on the NIST list will qualify for deNISTing automatically upon upload. Binary files, and virtually all containers, are not part of this list and will not be removed.
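Extraction "with no limitation on depth" is naturally expressed as recursion. The sketch below illustrates the idea for nested ZIP files only, using Python's standard `zipfile` module. It is not Everlaw's processing code, `walk_container` is a hypothetical name, and emails and other container types are out of scope. Note that Office files are themselves ZIP-based, so this simple check would also descend into them.

```python
import io
import zipfile


def walk_container(data: bytes, prefix: str = "") -> list[str]:
    """List every non-container file in a ZIP, descending into nested ZIPs."""
    paths: list[str] = []
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = f"{prefix}{info.filename}"
            payload = archive.read(info)
            if zipfile.is_zipfile(io.BytesIO(payload)):
                # A container inside a container: recurse, however deep it goes.
                paths.extend(walk_container(payload, prefix=path + "/"))
            else:
                paths.append(path)
    return paths
```

A ZIP that contains `inner.zip` (holding `memo.txt`) and `readme.txt` yields `inner.zip/memo.txt` and `readme.txt`.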