To read more about uploading documents on Everlaw, please refer to the articles in our Uploads section.
Everlaw has a cloud-processing system that automatically processes and ingests native files into the platform. Documents processed on Everlaw go through deNISTing, deduplication, OCR, AV transcription, and language detection, as appropriate. Additionally, Everlaw generates text, metadata, and PDFs (if requested) for your native data.
A native upload can contain one file, or multiple files. For more information about preparing your files for upload, please see this article.
Creating a native data upload
- Click the Data Transfer icon in the navigation bar
- Select “Uploads” from the dropdown menu
- Select “+ New Upload” on the left-hand sidebar
- Select "Native" as your type of upload
- Click "Start upload"
Here, you can add the data you want to upload. Drag and drop files into the box, or click "Browse" to select files from anywhere on your computer. Please see this article for information on how to prepare your files.
Uploading via cloud-based apps
At the bottom of the screen, you will see options to upload via popular cloud-based storage apps. You can upload directly from the following apps:
- Google Drive
- Google Vault
- You can also upload exported Google Vault files
You can also collect from the following cloud applications:
Once you select an app, you will be asked to log in via a separate dialog box. If you do not see this dialog box, check your pop-up settings. After logging in, you will be able to select your files directly from the app's platform. Once you hit "enter" or "submit" within that app, you will be taken to the next step in the upload process.
Uploading via direct link
Direct links are URLs that point straight to a file without any password protection. Essentially, if you paste a URL into your browser and the browser starts downloading a file instead of loading a webpage, you have a direct link. You can convert a Google Drive link into a direct link by amending it:
- Google Drive link: https://drive.google.com/file/d/FILE_ID (FILE_ID is the identifier that Google Drive inserts to link you to the document)
- Amended direct link version: https://drive.google.com/uc?export=download&id=FILE_ID
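If you need to amend many links, the rewrite above can be scripted. The following is a minimal sketch, assuming the share link has the `https://drive.google.com/file/d/FILE_ID` shape shown above (optionally followed by a suffix such as `/view`). The function name `to_direct_link` is hypothetical and not part of Everlaw or Google Drive.

```python
import re

# Matches the share-link shape shown above. Anything after FILE_ID
# (for example "/view?usp=sharing") is ignored.
_DRIVE_FILE_LINK = re.compile(r"^https://drive\.google\.com/file/d/([^/?#]+)")


def to_direct_link(share_url: str) -> str:
    """Rewrite a Google Drive file link into the direct-download form."""
    match = _DRIVE_FILE_LINK.match(share_url.strip())
    if match is None:
        raise ValueError(f"Not a recognized Google Drive file link: {share_url!r}")
    file_id = match.group(1)
    return f"https://drive.google.com/uc?export=download&id={file_id}"


print(to_direct_link("https://drive.google.com/file/d/FILE_ID/view?usp=sharing"))
```

The file must still be accessible without a password for the resulting link to behave as a direct link.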
Once you’ve made your selection, a wizard will appear where you can specify settings for the upload:
Step 1: Dataset details
In this step, you can specify the configuration for your upload. By default, your upload settings will be inherited from your last upload.
Name: You are required to give the dataset a unique name. The name (and date of upload) will appear as the name of the associated uploads card on the homepage, which you can rename later if you have upload permissions.
Deduplication: Upon upload, you have the following options.
- Global: Deduplicate against all of the existing documents in your database.
- By custodian: Deduplicate against all of the existing documents by custodian.
- This means that if two duplicates have different custodians, they will both be uploaded. Conversely, if a document has the same custodian as another duplicate document that already exists on the platform, the duplicate file will not be uploaded.
- None: No deduplication.
Even if you choose to deduplicate globally, Everlaw will preserve a record of the deduplicated document in the All Custodians and All Paths fields that are populated for the existing document on the database. This means that if a document with custodian Sam is deduplicated against a document with custodian Jenny, the existing document on the database will now list both Sam and Jenny in its All Custodians metadata.
To learn more about the definition of duplicates and how upload deduplication handles documents (and families), visit this help article on duplicates. Please note that Google files (e.g., Google Documents, Google Sheets) will not get deduplicated in the same way as other file types, because they undergo their own conversion process within Google.
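To make the difference between the three deduplication options concrete, here is a simplified illustration. It is not Everlaw's implementation: it assumes duplicates can be identified by a single `content_key` value (a stand-in for Everlaw's actual definition of a duplicate), it ignores families, and it assumes documents within the same upload are also compared against each other. The names `StoredDoc` and `ingest` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StoredDoc:
    content_key: str   # assumption: stand-in for "is a duplicate of"
    custodian: str
    all_custodians: set[str] = field(default_factory=set)


def ingest(existing: list[StoredDoc], incoming: list[tuple[str, str]], mode: str) -> list[StoredDoc]:
    """Return the documents that would be added under the given mode.

    incoming holds (content_key, custodian) pairs.
    mode is "global", "custodian", or "none".
    """
    if mode not in {"global", "custodian", "none"}:
        raise ValueError(f"Unknown deduplication mode: {mode}")
    added: list[StoredDoc] = []
    for content_key, custodian in incoming:
        pool = existing + added  # assumption: earlier files in this upload count too
        match = None
        if mode == "global":
            match = next((d for d in pool if d.content_key == content_key), None)
        elif mode == "custodian":
            match = next(
                (d for d in pool
                 if d.content_key == content_key and d.custodian == custodian),
                None,
            )
        if match is None:
            added.append(StoredDoc(content_key, custodian, {custodian}))
        else:
            # The duplicate is not added, but its custodian is recorded
            # on the document that already exists.
            match.all_custodians.add(custodian)
    return added


database = [StoredDoc("memo-v1", "Jenny", {"Jenny"})]
print(ingest(database, [("memo-v1", "Sam")], "global"))  # nothing is added
print(sorted(database[0].all_custodians))                # Jenny and Sam
```

With "custodian" mode, the same incoming document from Sam would be added, because the existing copy belongs to a different custodian.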
From Legal Holds: Users with both upload and legal hold permissions will now see a “From Legal Holds” section on the Details page when running a native upload.
By default, documents will not be connected to a legal hold. If the documents in the native upload should be associated with a legal hold in the database, click Yes here, then select that legal hold from the dropdown. Once you’ve selected the legal hold, continue to the Custodians step of the native upload wizard. On the Custodians step of the native upload wizard, the custodian dropdown will automatically include all custodians on that legal hold under the header “From legal hold”. To learn more about legal holds, visit the legal hold documentation center.
Advanced Settings (Create PDFs, timezone, OCR language, and email image attachments) are collapsed by default. To view and edit them, click the caret icon next to Advanced Settings.
Create PDFs: By default, Everlaw creates PDF images for all files in an upload, and placeholder images for file types that don’t image well (like spreadsheets). If desired, you can choose to image the file types that don’t image well, or choose to not image any of the files in an upload.
In addition to Excel files, the following file types will also not be imaged by default:
- LibreOffice Calc
- Empty files
- Container files
- .txt files that are greater than 1 MB
Default Timezone: The selected timezone is assumed for any datetime metadata lacking an explicit timezone. In addition, email headers printed at the top of PDF images generated during processing show datetimes in the selected timezone. Choosing UTC will leave PDF email header metadata timezones unchanged.
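The rule for datetime metadata can be pictured as follows. This is only a sketch of the general idea, not Everlaw's processing code: a value that already carries a timezone is left alone, and a value without one is interpreted in the selected default. The function name `apply_default_timezone` and the fixed UTC-5 offset are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def apply_default_timezone(value: datetime, default_tz: tzinfo) -> datetime:
    """Attach default_tz to a datetime only when it has no timezone of its own."""
    if value.tzinfo is None:
        # Same wall-clock time, now interpreted in the default timezone.
        return value.replace(tzinfo=default_tz)
    return value


# Hypothetical default: a fixed UTC-5 offset chosen for this example.
default_tz = timezone(timedelta(hours=-5))

naive = datetime(2023, 2, 3, 9, 30)                        # no timezone in the metadata
explicit = datetime(2023, 2, 3, 9, 30, tzinfo=timezone.utc)

print(apply_default_timezone(naive, default_tz).isoformat())
print(apply_default_timezone(explicit, default_tz).isoformat())
```

A named zone (for example, a `zoneinfo.ZoneInfo` object) can be passed in place of the fixed offset where time zone data is available.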
OCR Language: This step allows you to specify particular languages for Optical Character Recognition (OCR). OCR will be automatically run on TIFFs and PDF pages with little or no extractable text. By default, OCR language detection is set to Autodetect. Autodetect can extract all Latin-alphabet languages (such as French and German) as well as Chinese, Japanese, and Korean (CJK). If your document has a combination of these aforementioned languages, Autodetect will also be able to OCR them automatically as long as there is only one language per page. Autodetect will not reliably OCR multiple languages within one page.
You can also select a single language to target for OCR. In this mode, OCR will only detect that language and English. There are two scenarios where you would want to select something other than Autodetect:
- If all of your documents are in a language that is neither Latin-alphabet nor CJK (e.g., Russian, Greek)
- In this case, you must select that language from the dropdown menu in order for OCR to work for that document.
- If the quality of your scanned document(s) is low, and you know there is only one language in the document (in addition to English)
- This will improve the quality of OCR, but only for that language and English. It will prevent the detection of any other languages in that document, so you should be sure that the document has only one non-English language before selecting that option.
If your upload includes multiple documents, each with different foreign languages, then you’ll want to select Autodetect. However, this means that non-Latin, non-CJK language documents will not get properly OCRed. For example, if one document is entirely in Arabic (a non-Latin, non-CJK language), and another is in French (a Latin-alphabet language), then only the French document will be properly OCRed. In this situation, you can separate those documents into different uploads so that you can select the appropriate OCR language setting for each, or, after processing, select specific subsets of documents from the results page for reprocessing with a different OCR language.
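If you already know the language of each document, splitting them into separate uploads can be planned with a short script. The sketch below is an assumption-laden illustration, not an Everlaw feature: `AUTODETECT_LANGUAGES` is a small, incomplete sample of languages covered by Autodetect (Latin-alphabet languages and CJK), and the function name `plan_uploads` and the file names are hypothetical.

```python
from collections import defaultdict

# Illustrative subset only: Autodetect covers Latin-alphabet languages and CJK.
AUTODETECT_LANGUAGES = {"English", "French", "German", "Chinese", "Japanese", "Korean"}


def plan_uploads(doc_languages: dict[str, str]) -> dict[str, list[str]]:
    """Group file paths by the OCR language setting each upload should use."""
    plan: dict[str, list[str]] = defaultdict(list)
    for path, language in doc_languages.items():
        # Languages outside Autodetect's coverage need their own upload
        # with that language selected in the dropdown.
        setting = "Autodetect" if language in AUTODETECT_LANGUAGES else language
        plan[setting].append(path)
    return dict(plan)


print(plan_uploads({
    "contract_fr.pdf": "French",
    "letter_ar.pdf": "Arabic",
    "invoice_ja.pdf": "Japanese",
}))
```

Each key in the result corresponds to one upload and the OCR language setting to choose for it.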
You can also use the OCR language field to specify transcription of Spanish files with extractable audio. Select Spanish from the OCR language dropdown. You cannot transcribe Spanish files and OCR other documents with non-English languages in one upload. You can always reprocess Spanish media files and transcribe them in Spanish once they’re uploaded.
Page Size: Everlaw will generate PDFs in the selected size for documents that do not have a described size (e.g., emails). Documents with an explicit size (e.g., PDFs, Word documents, and images) will remain in their original sizes. Documents exported, printed, or produced from Everlaw will respect the size of the pages on the platform.
Email image attachments: There are three options for deciding whether image attachments should be displayed inline or treated as separate attachments. If you would like every image in the email to be displayed inline within the PDF, choose “Inline all images found in emails.” If you would like email image attachments to be extracted as children of the parent email, select “Extract all images found in emails.” Finally, if you choose smart determination, Everlaw will dynamically determine which images are likely to be attachments and which ones are (or are intended to be) inlined images (e.g., signature icons). Factors influencing this smart determination include an image's dimensions, overall size, and content ID.
Decryption Keys: Everlaw can store private keys used to decrypt S/MIME encrypted emails. To learn more, read this article. You can follow this link to manage your decryption keys on a new tab.
Passwords: Inaccessible files will not be processed. If any of your files/folders are password-protected, input the password(s) into the password box (one password per line) to enable Everlaw to image and extract text based on your processing options. The native view will not be available for password-protected documents on Everlaw.
Step 2: Select custodians
The custodians step allows you to specify what custodian value to associate with the documents you’re uploading. You can specify a default custodian for all documents in an upload and/or set custom custodian values for particular files or folders. If your data belongs to multiple custodians, please read this article to learn how to prepare your data accordingly before uploading.
If you selected a legal hold in the previous step, you will see custodians from that legal hold in the custodian dropdown under the header “Legal hold”. Select the appropriate custodian from the legal hold to link the documents in the upload to the custodian from the hold. Users can select different custodians for each subfolder.
To set a default custodian, input the custodian name into the “default custodian” box at the top of the table. If your project already has custodians from previous uploads, you can also select one from the dropdown list.
To set custom custodian values for particular files or folders, find the file/folder on the table and input the custodian name into the custodian box on the right. If there is a default custodian, it’ll be overridden for that particular file/folder. Files that have a black caret symbol in the far left can be expanded to display the individual sub-folders/files they contain. Click on the caret icon to expand or collapse.
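The relationship between the default custodian and custom values can be sketched as a simple lookup. This is an illustration only: it assumes that the custom value set on the nearest file or folder takes precedence over values set higher up, which this article does not spell out. The function name `resolve_custodian` and the paths are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional


def resolve_custodian(
    file_path: str,
    default_custodian: Optional[str],
    overrides: dict[str, str],
) -> Optional[str]:
    """Return the custodian for a file: the nearest custom value, else the default."""
    path = PurePosixPath(file_path)
    # Check the file itself first, then each parent folder, nearest first.
    for candidate in [path, *path.parents]:
        custodian = overrides.get(str(candidate))
        if custodian:
            return custodian
    return default_custodian


overrides = {"mail/sam": "Sam", "mail/sam/shared/jenny.pst": "Jenny"}
print(resolve_custodian("mail/sam/inbox.pst", "Default Custodian", overrides))
print(resolve_custodian("mail/sam/shared/jenny.pst", "Default Custodian", overrides))
print(resolve_custodian("docs/notes.docx", "Default Custodian", overrides))
```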
Step 3: Uploading into partial projects
Aside from uploading the documents into the current project you’re on, you can also add the documents to any partial project you have the Partial Project Document Management permission on. No matter what, documents you upload will automatically be added to all complete projects in the database (i.e., projects that contain all documents in the database). To select or deselect a project, click on the checkbox. You cannot deselect complete projects.
Once you click ‘Upload’, your data will begin transferring to our servers. The upload details overlay will appear on the transferring tab showing you the status of the transfer. From the overlay, you can add additional documents to the upload by clicking on the “+ Add files” button.
A status card will be added to the native data page corresponding to your upload. A time estimate will appear on this status card to indicate approximately how long it will be until processing is complete. As your upload progresses, you can start reviewing completed docs. You do not need to stay on this page for the upload to continue processing. Once all your files are successfully processed, you will see a document icon with a green checkmark in the status card. Clicking the icon will take you to a results table of your processed documents.
Native uploads will each be assigned a control number, indicated by a # prefix.
To learn how to view upload status, delete, rename, and take other actions, please see the “Viewing and managing native uploads” section.
Viewing and managing native uploads
Uploads will appear as cards in the “Native Data” section of the Uploads page. Upload cards are organized by month and ordered by date uploaded, with the most recent uploads at the top of the page.
You are able to filter upload cards by any combination of name, month of upload, or custodian.
Upload card components
On the upload card you can see:
- The upload name, which can be edited by clicking on it.
- A description of the upload, which can be edited by clicking on it.
- The upload date.
- The total number of documents in an upload. Clicking on this number will open a results table with the documents in the upload.
- The number of documents that resulted in processing errors in either the examine, PDF, or text phases of processing. Clicking on these numbers will open a results table with the errored documents.
- The user who created the upload. Hovering over the user badge will display the full name of the user and allow you to easily message the user.
If the upload is still processing, you will also see status and progress information about the ongoing processing.
Viewing upload configuration settings
To see a summary of the configuration settings used to process the documents in an upload, select the “View configuration” option under the three-dot menu icon.
Deleting an upload
To delete an upload, click on the three-dot menu icon and select the “Delete from database” option. You will be asked to confirm this action in a separate dialog. Keep in mind that deleting an upload will delete the documents in the upload from all projects in the database that the upload was added to, including those that you may not be part of. In addition, all review work associated with those documents will be lost. This option is only available to users with the Delete permission.
Adding documents to an upload
For organizational purposes, it can be helpful to add additional files to an upload after the initial upload is complete. For example, you may want to keep all files from a single custodian in the same upload, even if documents are received in a rolling manner. Or, you may want to ensure that certain documents are all processed in the same way upon upload.
You can add additional documents to an existing upload by associating a new source file to that upload. This is accomplished through the upload details dialog, which can be opened by either clicking the “View report” button on the upload card or the “View upload details” option under the three-dot menu icon.
We will discuss this dialog in detail in the “Viewing upload details” section below. For now, navigate to the “Transferring” tab on the far left.
On this tab, you can see and download existing source files, and add new ones. Each source file represents a discrete document collection that was added to an upload. To add a new source file to an upload, click the “Add files” button at the top right of the source files table. You will be asked to select the location of the file (local or cloud). Then, follow the instructions in the dialog to complete your addition of a new source file.
Because the new source file is being added to an existing upload, document(s) in the file will be processed according to the configuration settings of the initial upload. There may be some exceptions to this rule depending on the source of the file, which will be reflected in the modifiable fields you can set as you complete the steps to add a new source file.
Viewing upload details
An upload details report is available to monitor and review the details of an upload. This report can be accessed by:
- Clicking the “View report” button at the bottom of the upload card (which will be called “View progress” if the upload is not yet complete), or
- Selecting the “View upload details” option under the three-dot menu icon.
An upload to Everlaw goes through two general phases: (1) transferring the upload file(s) to Everlaw for processing and (2) the processing itself. These phases are reflected in the tabs at the top of the upload details report. As an upload progresses, each tab will become active as the corresponding upload phase is reached. Once the upload is complete, the final tab, “Upload report”, will become active.
Let’s review what information you can expect to see in each tab of the report.
Upload details: Transferring tab
While a transfer is in progress, you can visit this tab to monitor the progress of the transfer.
Once a transfer is complete or errored, you can visit this tab to:
- See the success or failure status of the transfer
- See the source file(s) associated with the upload. You can also add new source files from this tab (see the “Adding documents to an upload” section above for more information)
- Download any of the source files.
Cloud upload configuration
When performing native cloud collections, the collection configuration is saved and displayed on each group of sources. For example, in a Slack upload, information about the custodians, workspaces, channels and date range of collection can be viewed during and after uploads.
For cloud upload sources, you can view the collection configuration by clicking the right arrow icon below the upload source and then clicking “Show cloud configuration”.
Note that performing another cloud collection with the same configuration is not guaranteed to collect the same files since the cloud files may have changed in the meantime. For example, if OneDrive files have since been deleted, they might not be collected if you perform a subsequent collection with the same parameters.
In the case that multiple cloud sources are added to an upload card, the collection configuration for each source can be displayed in the same way by expanding and clicking “Show cloud configuration”. To learn more about adding files to an upload, see the previous section on adding additional files.
⚠️ Please note: Cloud upload configuration was added in the release of Feb 3, 2023. Only cloud uploads made after that release display the collection configuration (custodians, date range, etc.); uploads made before this date do not.
Upload details: Processing tab
While processing is ongoing, you can visit this tab to see processing status for each document included in an upload. You can also see summary counts of documents that have completed processing, documents being currently processed, and documents that are still queued for processing.
The table on this tab displays all documents that are currently being processed, along with the control number, processing status, size, and file name/path of those documents.
This view updates every 5 seconds and is sorted in descending order of time since processing started on the files (files that take the longest to process are at the top). Additional information is provided on the type of processing the files are undergoing. If there are more than 100 files being processed at the same time, the table shows the details of the 100 longest-running files.
The possible processing steps and file types that have these steps are as follows:
|Processing step||File types|
|OCR||PDFs, TIFF files and other image files|
|Transcribing/Transcoding||Audio/video files, any file with audio|
Upload details: Upload report tab
The upload report tab includes a breakdown of various aspects of your upload and a visualization of the filetypes included in your upload.
If you want to download this information for use outside of Everlaw, you can:
- “Download detailed report”, which will result in a CSV with information about each document in the upload
- “Print upload report”, which will generate a PDF version of the exact view you see while on the upload report tab
Taking each section of the upload report in turn:
The “Upload size” section displays: (1) the size of the source file(s), (2) the size and count of deduped and deNISTed documents, and (3) the billable size of the uploaded documents.
If you want a report of the deduplicated documents, you can click on the “download deduped info” report. This report will show the original path, original Bates, and duplicate path of each deduplicated document:
- Original Path: Native path for the "original" document (for each set of duplicates, the single instance of the document that was uploaded to Everlaw)
- Original Bates: Begin Bates numbers for the original documents
- Duplicate Path: Native path for the deduplicated document associated with the original document
If there are multiple duplicates for one original path, there will be a row included for each duplicate. This means that multiple rows can have the same original path and original Bates values.
The “Filetypes” section shows the count of each type of file processed in the upload and added to the project. It therefore excludes documents that were deduped or deNISTed. Clicking on any of the counts will open a results table with all documents in the project of that type. The pie graph visualization on the right allows you to get a quick sense of the relative proportion of document types in the upload, as well as a quick way to access all documents in the upload.
Documents with issues
The “Documents with issues” section shows the number of documents that registered errors during processing, broken down by processing stage. You can review these errored files by clicking on any of the counts.
Flagged as malicious
After files are examined, converted to PDF, and OCR’d to create searchable text files, native files will be scanned for viruses or other malware. The total number of documents that have a virus or malware will be flagged and identified here. You can click on the count to open a results table with these documents. These documents are safe to review on Everlaw, but you will want to be cautious about downloading them to your personal computer or local environment. For more information, please see this article on identifying and troubleshooting native upload errors.
OCRed and Imaged
The final section displays counts for the total number of pages either OCRed or imaged as a result of processing. You can access the underlying documents by clicking on the counts.
Cross-upload reports
Cross-upload reports allow you to easily download information about multiple uploads. There are two types of cross-upload reports:
- A per upload report for each selected upload
- A per document report for each document in the selected uploads
You can download these reports by clicking on the “Download reports” button next to the filter boxes at the top of the native upload page and making a selection.
Once you’ve selected a report, a task will be started to generate the report. You can monitor the status of such tasks – and download the resulting reports – on the “Batches and Exports” column of the homepage.
By default, the report will span all uploads. However, you can specify a smaller subset of uploads or documents by using the filter boxes. Once you’ve filtered your uploads:
- The upload report will only include information about the uploads captured in the filter
- The document report will only include documents in the uploads captured by the filter. If a custodian filter is used, the documents will be further filtered such that only documents with that particular custodian value will be included.
Uploads and documents that are still in progress or are being reprocessed are not included in the report.
The per upload report generates a CSV that has 1 row for each upload (taking into account filters). The included fields are:
|Upload id||Unique number identifier for an upload|
|Upload name||Name of the upload|
|Upload date||Datetime of the upload in UTC|
|Upload description||Description of the upload|
|Project name(s)||Name(s) of the projects the upload is associated with|
|Project id(s)||Unique identifier(s) of the associated project(s)|
|Uploader||Name of the user performing the upload|
|Custodian(s)||Custodian(s) associated with the upload|
|Associated with Everlaw legal hold||Whether the upload is associated with a legal hold in Everlaw|
|Name of legal hold||Name of the associated legal hold|
|Upload source(s)||The local or cloud source(s) of source files in the upload|
|Size of source(s)||The summed size of all source files associated with an upload|
|Default timezone||Default timezone configuration set upon upload|
|Begin date filter(s)||The start datetime of any filter used to retrieve the source file(s) in UTC. The number of values in this field should correspond to the number of values in the "Upload source(s)" field, and can be mapped to the appropriate source by order.|
|End date filter(s)||The end datetime of any filter used to retrieve the source file(s) in UTC. The number of values in this field should correspond to the number of values in the "Upload source(s)" field, and can be mapped to the appropriate source by order.|
|PDF image setting||Image setting configuration set upon upload|
|Hyperlink images||Fetch hyperlinked image configuration set upon upload|
|OCR language setting||OCR language configuration set upon upload|
|Page size setting||Page size configuration set upon upload|
|Inline image setting||Inline image configuration set upon upload|
|Deduplication setting||Deduplication configuration set upon upload|
|Size of deNisted documents||The file size of the documents deNISTed during processing|
|Count of deNisted documents||The number of documents deNISTed during processing|
|Size of deduplicated documents||The file size of the documents deduplicated during processing|
|Count of deduplicated documents||The number of documents deduplicated during processing|
|Total number of documents post-processing (current)||The number of processed documents from the upload. This field is based on the current state of the database, and will reflect any subsequent document deletions.|
|Number of attachments (current)||The number of processed documents from the upload that are attachments. This field is based on the current state of the database, and will reflect any subsequent document deletions.|
|Number of password-protected documents (current)||The number of processed documents from the upload that were password-protected. This field is based on the current state of the database, and will reflect any subsequent document deletions.|
|Number of OCRed pages (current)||The number of OCRed pages across all the documents from the upload. This field is based on the current state of the database, and will reflect any subsequent document deletions.|
|Number of imaged pages (current)||The number of imaged pages across all the documents from the upload. This field is based on the current state of the database, and will reflect any subsequent document deletions.|
|Internal control number ranges (current)||The control numbers assigned to the documents in the upload. If there are consecutive assignments, the range will be reported as [start of range] - [end of range]. If there are more than 20 separate ranges associated with an upload, the min and max numbers will be reported instead as: [min control number] ~ [max control number]. Documents in the upload are guaranteed to fall within this min-max range, but the range could also include control numbers that are not assigned to documents in the upload. This field is based on the current state of the database, and will reflect any subsequent document deletions.|
|Number of documents with processing errors (current)||The number of documents in the upload that had processing errors. This field is based on the current state of the database, and will reflect any subsequent document deletions.|
|Number of documents with examine errors (current)||The number of documents in the upload that had errors at the examine stage of processing. This field is based on the current state of the database, and will reflect any subsequent document deletions.|
|Number of documents with PDF errors (current)||The number of documents in the upload that had errors at the PDF image generation stage of processing. This field is based on the current state of the database, and will reflect any subsequent document deletions.|
|Number of documents with text errors (current)||The number of documents in the upload that had errors at the text stage of processing. This field is based on the current state of the database, and will reflect any subsequent document deletions.|
|Number of documents with container errors (current)||The number of documents in the upload that had container errors. This field is based on the current state of the database, and will reflect any subsequent document deletions.|
|Number of documents flagged as malicious (current)||The number of documents in the upload that were flagged as malicious during processing. This field is based on the current state of the database, and will reflect any subsequent document deletions.|
|Number of password-protected documents that cannot be opened (current)||The number of password-protected documents in the upload that could not be opened during processing. This field is based on the current state of the database, and will reflect any subsequent document deletions.|
|* Document count by type||1 column for each document type. These fields are based on the current state of the database, and will reflect any subsequent document deletions.|
|Billable size (current)||The billable size of the documents in the upload. This field is based on the current state of the database, and will reflect any subsequent document deletions.|
|* Document set links||Links back into the platform for: documents in the upload, documents with missing passwords, documents with examine errors, documents with PDF errors, documents with text errors, documents with container errors, and documents flagged as malicious.|
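The “Internal control number ranges” field described in the table above follows a rule that is easier to see in code. The sketch below reproduces that rule as described, under two assumptions that the table does not specify: separate ranges are joined with a comma, and a control number with no consecutive neighbor is shown on its own. The function name `format_control_ranges` is hypothetical, and this is not the code Everlaw uses to build the report.

```python
from typing import Iterable


def format_control_ranges(control_numbers: Iterable[int], max_ranges: int = 20) -> str:
    """Collapse control numbers into ranges, or min ~ max when there are too many."""
    numbers = sorted(set(control_numbers))
    if not numbers:
        return ""
    ranges = []
    start = prev = numbers[0]
    for n in numbers[1:]:
        if n == prev + 1:          # still consecutive: extend the current range
            prev = n
            continue
        ranges.append((start, prev))
        start = prev = n
    ranges.append((start, prev))

    if len(ranges) > max_ranges:
        # Too many separate ranges: report only the overall min and max.
        return f"{numbers[0]} ~ {numbers[-1]}"
    # Assumption: comma-separated ranges, lone numbers shown as-is.
    return ", ".join(str(a) if a == b else f"{a} - {b}" for a, b in ranges)


print(format_control_ranges([1, 2, 3, 7, 8, 10]))
print(format_control_ranges(range(1, 100, 2)))  # 50 separate ranges
```

The second call shows why the min ~ max form can include control numbers that are not in the upload: the even numbers between 1 and 99 fall inside the reported span even though none of them were passed in.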
The per document report generates a CSV that has 1 row for each document in the upload(s), including those that were deduplicated or deNISTed. The included fields are:
- Document id (Control #) - This is blank for Deduplicated or DeNISTed files
- Upload dataset name - This can be changed by renaming the upload card
- Upload dataset id - This is fixed regardless of changes to upload dataset name
- Upload date (UTC)
- File path
- Custodian - Blank if no custodian was assigned at upload
- Processing/Production Flags - the current contents of the “Processing/Production Flags” Special Column, if any. (This is any of the flags available in the Uploaded document search term’s “flags” parameter.)
- Processing error - true/false
- OCR - true/false
- Reprocess date - The most recent reprocessing date and time if any
- Reprocessed by - First and Last Name of user
- Deduplicated - true/false
- DeNISTed - true/false
- Uploaded by - First and Last Name of user
⚠️ Please note: Upload time and user tracking was added in the June 3, 2022 release. For documents uploaded before that date, the fields Reprocess date, Reprocessed by, Upload date, and Uploaded by are blank, because Everlaw did not previously store this information. New sources added to these uploads will correctly show the uploading/reprocessing user and time in the report CSV.
Additionally, for legacy users without First/Last Names, their email or username will appear in the Uploaded by or Reprocessed by field instead.
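Once downloaded, the per document report can be summarized with a few lines of scripting. The following is a minimal sketch, assuming the CSV column headers match the field names listed above ("Deduplicated", "DeNISTed", "Processing error") and that the true/false values appear as text. The sample rows and the function name `summarize_document_report` are made up for illustration, so check the headers in your own export before relying on it.

```python
import csv
import io
from collections import Counter

# Made-up sample rows; a real report contains more columns.
SAMPLE_REPORT = (
    "Document id (Control #),File path,Custodian,Processing error,Deduplicated,DeNISTed\n"
    "#1,a/report.docx,Sam,false,false,false\n"
    "#2,a/scan.pdf,Sam,true,false,false\n"
    ",b/report.docx,Jenny,false,true,false\n"
    ",b/system.dll,,false,false,true\n"
)


def is_true(value: str) -> bool:
    return value.strip().lower() == "true"


def summarize_document_report(csv_text: str) -> Counter:
    """Count rows that were deduplicated, deNISTed, added, or errored."""
    summary: Counter = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if is_true(row["Deduplicated"]):
            summary["deduplicated"] += 1
        elif is_true(row["DeNISTed"]):
            summary["denisted"] += 1
        else:
            summary["in_project"] += 1
            if is_true(row["Processing error"]):
                summary["processing_errors"] += 1
    return summary


print(summarize_document_report(SAMPLE_REPORT))
```

To run this against a downloaded report, read the file's contents into a string and pass it to the function in place of `SAMPLE_REPORT`.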
Native data processing settings
- The orientation of documents is preserved from their native versions (e.g., a document that is in landscape orientation will remain that way upon upload)
- Embedded files: Everlaw will extract all embedded files, including audio and video files, in a Microsoft 365 file (e.g., an Excel file embedded in a PowerPoint) and any file embedded in a PDF.
- Emails containing URL links to other supported documents will be recognized. See the context panel article to learn more.
- The children of container files are extracted with no limitation on depth. For example, a Word document attached to an email that’s attached to another email that’s in a Zip file that’s in another Zip file will be extracted.
- Hidden columns in Excel are displayed.
- Notes are extracted and presented in the PDF/Image view for Word documents and the Native view for spreadsheets.
- Sometimes, you may try to upload an entire hard drive or a folder with personal files mixed in with system/software files. Some of these files have no user-specific content and can be removed upon processing. This process is called deNIST (removing NIST files). Any files that are on the NIST list will qualify for deNISTing automatically upon upload. Binary files, and virtually all containers, are not part of this list and will not be removed.
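To picture what extracting container children "with no limitation on depth" means, here is a small sketch that walks nested Zip files recursively. It only illustrates the idea and is not how Everlaw processes containers: it handles Zip archives only, and it treats anything that looks like a Zip as a container (which would include Zip-based Office formats). The function name `list_nested_files` is hypothetical.

```python
import io
import zipfile


def list_nested_files(zip_bytes: bytes, prefix: str = "") -> list[str]:
    """List every non-container file inside a Zip, descending into nested Zips."""
    paths: list[str] = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            path = f"{prefix}{info.filename}"
            data = archive.read(info)
            if zipfile.is_zipfile(io.BytesIO(data)):
                # A container inside a container: recurse, however deep it goes.
                paths.extend(list_nested_files(data, prefix=f"{path}/"))
            else:
                paths.append(path)
    return paths


def make_zip(files: dict[str, bytes]) -> bytes:
    """Helper for the example: build a Zip archive in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    return buffer.getvalue()


inner = make_zip({"memo.txt": b"hello"})
outer = make_zip({"inner.zip": inner, "readme.txt": b"notes"})
print(list_nested_files(outer))
```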