Introduction
This article covers the steps to create a new predictive coding model and to edit an existing model. Predictive coding is a great tool to have at your disposal if you want to facilitate an efficient review of large data sets. You can use a model to prioritize the documents assigned to reviewers or to support a more automated review workflow.
If you have never worked with a predictive coding model before, first read our beginner’s guide to predictive coding. You can also use Predictive Coding Terms and FAQs as a glossary of predictive coding terms and for answers to commonly asked questions about predictive coding.
To view all of Everlaw's predictive coding-related articles, please see our predictive coding section.
Predictive Coding Permissions
To create a predictive coding model, you must have Create or Admin permissions on predictive coding.
Once a model has been created, additional users can access the model, depending on their permissions. Here is a description of each permission type, which a Project Admin can update in Project Settings:
- Receive: Can receive prediction models shared by others, but cannot create new prediction models. When a prediction model is shared with a user in this group, they can be given View, Edit, or Full Access permissions on the individual prediction model.
- Create: Can create prediction models, and can edit and delete the prediction models they have created. Permission levels on Everlaw are additive, so groups with the Create permission are also able to receive prediction models shared by others.
- Admin: Able to view, edit, share and delete all prediction models in the project, regardless of whether the prediction model was shared with them or not.
If you are a Project Administrator, you have Admin access to predictive coding by default. Project Administrators can also give Prediction Model access to specific groups in the Permissions page.
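Because the permission levels are additive, they can be pictured as an ordered scale. The following is a minimal illustrative sketch only, not Everlaw's implementation; the names `PredictionPermission`, `can_receive`, `can_create`, and `can_manage_all_models`, and the `NONE` level, are hypothetical and introduced just for this example.

```python
from enum import IntEnum


class PredictionPermission(IntEnum):
    """Hypothetical ordering of the project-level permission types."""
    NONE = 0      # assumed placeholder for groups with no predictive coding access
    RECEIVE = 1
    CREATE = 2
    ADMIN = 3


def can_receive(level: PredictionPermission) -> bool:
    # Permissions are additive, so Create and Admin also include Receive.
    return level >= PredictionPermission.RECEIVE


def can_create(level: PredictionPermission) -> bool:
    # Creating a model requires Create or Admin.
    return level >= PredictionPermission.CREATE


def can_manage_all_models(level: PredictionPermission) -> bool:
    # Only Admin can view, edit, share, and delete every model in the project.
    return level == PredictionPermission.ADMIN
```

Under this sketch, a group with Create passes both `can_receive` and `can_create`, while a group with Receive passes only `can_receive`.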
Create a Predictive Coding model
Overview
Creating a predictive coding model happens in three steps:
First, you identify the documents that count as reviewed. These are the documents that the model will analyze and learn from. Typically, the Reviewed documents are those that have been coded within a specific coding category. If you are looking for responsive documents, the reviewed documents would be documents coded within the Responsiveness category, including those coded both responsive and not responsive.
Second, you identify the documents that are relevant for this model. These are the documents that you want to find more of. For example, if you want the model to identify responsive documents, you should choose documents that have been coded Responsive in the Responsiveness category as the relevant ones.
Third, you identify any documents you want to exclude from the model. Excluded documents are not included in training the model, even if they meet your Reviewed criteria.
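The relationship between the three criteria can be sketched with plain sets of document IDs. This is a simplified illustration under stated assumptions, not how Everlaw builds its training data; the function name `training_sets` and the document IDs are hypothetical.

```python
def training_sets(
    reviewed: set[str], relevant: set[str], excluded: set[str]
) -> tuple[set[str], set[str], set[str]]:
    """Derive the sets a model would learn from, given sets of document IDs."""
    # Excluded documents are dropped even if they meet the Reviewed criteria.
    training = reviewed - excluded
    # Relevant documents only count if they are also reviewed and not excluded.
    relevant_training = relevant & training
    # Everything else in the training set is reviewed but not relevant.
    irrelevant_training = training - relevant_training
    return training, relevant_training, irrelevant_training


# Hypothetical example data.
reviewed = {"doc1", "doc2", "doc3", "doc4"}  # coded under Responsiveness
relevant = {"doc1", "doc2"}                  # coded Responsive
excluded = {"doc2"}                          # e.g. already produced
training, rel, irr = training_sets(reviewed, relevant, excluded)
```

In this example, `doc2` is left out of training entirely, `doc1` is the only relevant training document, and `doc3` and `doc4` serve as the reviewed-but-not-relevant examples.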
Access Predictive Coding
To get started building a model:
- Go to Document Analytics > Predictive Coding.
- Select + New model from the left side menu.
Users who (A) belong to the organization that the current project belongs to, (B) have Admin permissions on Predictive Coding, and (C) have at least one multi-matter model available will have the option to create a new multi-matter model. To learn more, read our article on Multi-matter models.
Otherwise, you are taken directly to creating a new model.
- The first page of the predictive coding wizard is an introduction to predictive coding and a link to the Everlaw predictive coding beginner’s guide. Select Next to begin building your model.
Reviewed Docs
In the Reviewed docs step, you specify which documents the model should learn from. These documents are considered “reviewed” for the purposes of the model.
As an example, let’s say your team is reviewing documents for responsiveness, using the codes Responsive and Not Responsive under the coding category Responsiveness. To teach the model which types of documents are responsive and which are not, the model needs to be pointed towards documents your team has already reviewed for responsiveness.
To do this, set your criteria for reviewed documents to be “Coded under Responsiveness”. The model will look at all documents that have been coded in that category to help it understand the characteristics of documents that are responsive, as well as the characteristics of documents that are not responsive. The model needs to learn from both types of information to have a good understanding and to target your desired documents.
To set your Reviewed criteria:
- Use the query builder to identify the documents that should be considered reviewed.
- When you're done, select Next.
Relevant docs
In the Relevant docs step, specify which types of documents you want the model to find. These documents are considered "relevant" to the model. For our responsiveness model, responsive documents are relevant to the model. In other words, we want the model to find responsive documents.
To specify relevant documents, build a query that captures only those documents that you want to find more of. In our responsiveness example, we would build a query that captures documents coded Responsive.
Important
Documents that you define as Relevant must also fit into your Reviewed criteria. For example, if the Reviewed criteria is a coding category, the Relevant criteria must be a code (or codes) within that category.
To set your Relevant criteria:
- Use the query builder to identify the criteria for documents that should be considered relevant.
- When you're done, select Next.
Excluded docs
The exclusion step allows you to specify documents you want to exclude from your model, even if they have been reviewed in a way that matches the model’s criteria. The model will NOT:
- Use excluded documents for training and evaluation purposes
- Generate predictions for excluded documents
By default, documents produced in Everlaw are excluded from models. This is meant to prevent redundant training and duplicative document suggestions. However, you can remove this default exclusion criterion if desired.
One common reason to exclude documents is if you know certain classes of documents have either atypical content or insufficient text for reliable predictions. For example, you may want to exclude documents primarily in a non-English language, as well as documents with little textual content, like spreadsheets and audio files. If you do not exclude any documents, all documents with adequate text (including transcribed audio and video files) will receive prediction scores.
To specify excluded documents:
- Specify the criteria that should cause a document to be excluded from the model.
Important
If you specify more than one criterion for exclusion, connect the criteria using an OR logical operator. This ensures that a document meeting any one of the criteria you specify is excluded.
- Select Next.
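The effect of joining exclusion criteria with OR can be sketched as follows. This is an illustration of the logic only; the `Document` record, its `produced` and `language` fields, and the function `is_excluded` are hypothetical and do not correspond to Everlaw's query builder.

```python
from typing import Callable

# Hypothetical document record, e.g. {"produced": False, "language": "fr"}.
Document = dict


def is_excluded(doc: Document, criteria: list[Callable[[Document], bool]]) -> bool:
    # OR semantics: a document is excluded if ANY single criterion matches.
    return any(criterion(doc) for criterion in criteria)


# Hypothetical exclusion criteria for illustration.
EXAMPLE_CRITERIA = [
    lambda d: d.get("produced", False),      # already produced
    lambda d: d.get("language") != "en",     # primarily non-English
]
```

If the same criteria were joined with AND instead (`all` rather than `any` in this sketch), a document would have to be both produced and non-English to be excluded, which is usually narrower than intended.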
Name your model
The final step is to give your model a name. The model name defaults to a description of the relevant and reviewed criteria, but you can change this.
To name your model:
- Click into the text entry bar, delete the default name, and type your desired model name.
- When you're done, select Submit. This finalizes your model.
Your complete model
Generally, your model will begin making predictions once you have met the review threshold. To meet the review threshold:
- Review at least 200 qualified documents (i.e., sufficient text, unique, and not in conflict).
- At least 50 of those qualified documents must be relevant
- At least 50 must be irrelevant (reviewed, but not meeting the relevant criteria)
- Irrelevant documents that are near duplicates of relevant documents, or emails in the same thread as relevant documents, do not count towards the 50 irrelevant documents needed to meet the review threshold. These documents are considered conflicts.
If it seems that you have hit the review threshold but are not seeing an update to your model, it's likely that you need to review more qualified documents in your training set or that too many of your irrelevant documents are conflicts. Learn more about kicking off your predictive coding model here.
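The review threshold arithmetic can be sketched as a simple check. This is a minimal sketch based only on the numbers stated above; the `ReviewedDoc` class, its fields, and `meets_review_threshold` are hypothetical names, and the actual qualification logic in Everlaw may involve more than is modeled here.

```python
from dataclasses import dataclass


@dataclass
class ReviewedDoc:
    relevant: bool          # meets the Relevant criteria
    qualified: bool = True  # assumed flag: sufficient text and unique
    conflict: bool = False  # e.g. irrelevant near duplicate of a relevant doc


def meets_review_threshold(docs: list[ReviewedDoc]) -> bool:
    # Unqualified documents and conflicts do not count toward any total.
    counted = [d for d in docs if d.qualified and not d.conflict]
    relevant = sum(d.relevant for d in counted)
    irrelevant = len(counted) - relevant
    # 200 qualified documents, with at least 50 relevant and 50 irrelevant.
    return len(counted) >= 200 and relevant >= 50 and irrelevant >= 50
```

For example, 160 relevant and 40 irrelevant documents total 200 but fail the check because fewer than 50 are irrelevant. Likewise, irrelevant documents flagged as conflicts reduce both the qualified total and the irrelevant count.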
Share a model
You can share your model before the model has generated any predictions. To share a model:
- Select the share button in the top right of the model's page. This opens a dialog to share the model.
- Choose the recipient(s).
- Choose their permission on the model. See the Predictive Coding Permissions section above for a description of the permissions.
- [Optional] Write a message.
- Select Share.
Edit a predictive coding model
Permissions required: You must have created the model, have Admin permissions on predictive coding, or have had the model shared with you with Edit or Full Access permissions. If you have edit permissions on a model, a pencil appears towards the top right corner of the model’s page.
At any point after you create a predictive coding model, you can edit your model’s name and criteria for reviewed, relevant, and excluded documents.
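The edit-permission rule above can be expressed as a small check. This is an illustrative sketch only; the function `can_edit_model`, its parameters, and the `shared_access` mapping are hypothetical and not part of any Everlaw API.

```python
def can_edit_model(
    user: str,
    creator: str,
    is_predictive_coding_admin: bool,
    shared_access: dict[str, str],
) -> bool:
    """Return True if `user` may edit the model under the rule described above."""
    # The model's creator and predictive coding Admins can always edit.
    if user == creator or is_predictive_coding_admin:
        return True
    # Otherwise the model must be shared with Edit or Full Access;
    # a View-only share does not grant edit rights.
    return shared_access.get(user) in {"Edit", "Full Access"}
```

Here `shared_access` is assumed to map a user name to the permission chosen when the model was shared with them.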
Edit model name
You can edit a predictive coding model’s name on the model’s page. To edit a model’s name:
- Access the page of the model whose name you want to edit.
- Click the name of the model at the top of the model’s page.
- Start making edits.
- When you're done making your edits, press Enter on your keyboard or click out of the name area. Your model’s name should immediately update at the top of the model page and in the left side menu.
You can also edit your model’s name along with your model’s criteria in the predictive coding wizard as described in the next section.
View and edit model criteria
To view and edit the criteria of an existing model:
- Select the pencil on the model’s page to open the predictive coding wizard. In the predictive coding wizard, you can view your model’s current criteria and name.
- Select Next to access the Reviewed page.
- Make any desired changes to the reviewed criteria.
- Select Next.
- Make any desired changes to the relevant criteria.
- Select Next.
- Make any desired changes to the excluded criteria.
- Select Next.
- Make any desired changes to the model's name.
- Select Submit. This queues your model for an update.
Note that it may take a few hours for the model update to complete and implement changes to your model’s criteria.
Once the model update is complete, assuming that the review threshold has been met:
- New prediction scores are generated
- Any existing workflows referencing the edited model using the Predictive search term (e.g. inclusion criteria for assignment groups) are updated accordingly.
View model edit history
Users with at least Receive permissions can view a model’s edit history. To do so, select the activity button located at the top right of the model’s page. All edits to a model are captured in the model's edit history.
To read more about analyzing your model’s results, see the Predictive Coding Model Interpretation article.