Improving Your Predictive Coding Model's Performance

There are multiple ways to improve the accuracy of your predictive coding models. This article walks through several methods for improving model performance.

Weighting Terms

As you review documents, your predictive coding model learns which types of documents should be considered relevant or irrelevant. For example, default Rating models look at the documents that have been rated Hot or Warm in order to predict which other documents are likely to be relevant. When a document is rated Hot or Warm, the Rating model looks at and learns from the entire document. However, some documents contain portions of text that are more relevant to the model than others. To teach the model which parts of your documents are especially relevant, you can add them to the model as weighted terms.

To weight terms and teach your model which parts of a document are especially relevant, navigate to the text view of the relevant document. Then, make sure the Predictive Coding tab is added to your review window sidebar.

Once the Predictive Coding tab is active, select the relevant portions of text and choose which model(s) they are relevant to. You can dissociate text from a model by clicking the X next to the model’s name.

[Animated GIF: highlight_and_remove_pc.gif]

You can scroll through each model’s weighted selections from the Predictive Coding tab.

[Animated GIF: scroll_through_hits_new.gif]

When you weight text in a document and associate it with a model, all of the weighted text in the document is treated as a single relevant document for the purposes of training the model. Please note that the model will be trained on the individual terms (i.e., words) that are selected in the document, but the order of the selected terms (i.e., phrases) will not be considered during training.  
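To make this concrete, here is a minimal, purely illustrative Python sketch of how weighted selections might be reduced to individual terms. It is not the platform's actual implementation, and the tokenization shown is an assumption:

    import re
    from collections import Counter

    def selections_to_training_terms(selections):
        """Pool all weighted selections from one document into a single bag of
        individual terms; word order (phrases) is discarded."""
        terms = Counter()
        for selection in selections:
            # Assumed tokenization: lowercase and split on non-word characters.
            terms.update(re.findall(r"\w+", selection.lower()))
        return terms

    # Example: two highlighted passages from the same document
    weighted = selections_to_training_terms([
        "secret meeting about the payoff",
        "schedule the payoff before Friday",
    ])
    # The model would see term counts such as {'payoff': 2, 'the': 2, ...},
    # not the phrase "secret meeting about the payoff".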

You can view all of the weighted terms for a model under Weighted Terms in the Training section of the model’s page.

[Image: weighted_table.png]

You can click a row of weighted text to open the document, or dissociate the selection from the model by clicking the trash can icon. If the weighted terms are associated with any other models, those associations will remain.

Return to table of contents

Creating Training Sets

Anytime you review documents in your project according to your model’s review criteria (e.g., rating, responsiveness), you are teaching your model what types of documents are and are not relevant. However, training sets are a helpful tool for organizing documents for review and give you an easy way to intentionally train your model on different types of documents. Creating a training set pulls a random sample of unreviewed documents that have sufficient text and are unique (i.e., no duplicates) into a binder for your team to review. Reviewing documents from a training set helps the model learn whether various types of documents (e.g., emails from person X, documents containing the word “California,” chat messages to person Y) tend to be relevant to your matter or not. Otherwise, the model will only learn about the types of documents you frequently review in your typical review workflow, which might not be representative of all the documents in your project.
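As an illustration of the filtering and sampling described above (not the platform's actual logic), the following Python sketch uses hypothetical document fields such as 'reviewed', 'text', and 'content_hash':

    import random

    def build_training_set(documents, sample_size, seed=None):
        """Sample unreviewed, unique documents with sufficient text (illustrative)."""
        rng = random.Random(seed)
        seen_hashes = set()
        eligible = []
        for doc in documents:
            if doc["reviewed"]:
                continue  # already reviewed
            if len(doc["text"].split()) < 10:
                continue  # insufficient text (threshold is an assumption)
            if doc["content_hash"] in seen_hashes:
                continue  # duplicate
            seen_hashes.add(doc["content_hash"])
            eligible.append(doc)
        return rng.sample(eligible, min(sample_size, len(eligible)))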

To create a training set, navigate to “Training sets” under the Training section of your predictive coding model’s page and click the “+New Training Set” button.

Enter a name for your training set, and then select whether you want the documents in your training set to be randomly sampled from your project, or if you want the documents to come from a specific search.

Using randomly selected documents for your training set means your model is trained on documents that are more representative of your project as a whole. However, if there is a particular subset of documents you’d like to train your model on, you can seed your set from a specific search. This option may be helpful if there’s a search that you’re confident will return relevant documents; that way, you can make sure the model sees some relevant documents during its training. Even so, to prevent biased training, a portion of your training set will still be randomly sampled from your project.

Training sets will not contain duplicate documents, ineligible documents (i.e., those with insufficient text), holdout set documents, or documents excluded from the model. This ensures that every document reviewed from a training set will be used to train the model. Note that older prediction models may contain legacy training sets, indicated with a red asterisk; see the Legacy Training Sets section below to learn how legacy training sets were created differently.
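Continuing the illustrative sketch above, a search-seeded training set might be assembled roughly as follows. The 50/50 split between search hits and random documents, the field names, and the text-length threshold are all assumptions, not the platform's actual behavior:

    import random

    def build_seeded_training_set(search_hits, project_docs, total_size,
                                  excluded_ids, holdout_ids, seed=None):
        """Mix documents from a search with a random project sample (illustrative)."""
        rng = random.Random(seed)

        def eligible(doc):
            return (doc["id"] not in excluded_ids
                    and doc["id"] not in holdout_ids
                    and not doc["reviewed"]
                    and len(doc["text"].split()) >= 10)  # assumed text threshold

        seed_pool = [d for d in search_hits if eligible(d)]
        n_seed = min(len(seed_pool), total_size // 2)  # assumed 50/50 split
        picked = rng.sample(seed_pool, n_seed)

        picked_ids = {d["id"] for d in picked}
        random_pool = [d for d in project_docs
                       if eligible(d) and d["id"] not in picked_ids]
        picked += rng.sample(random_pool, min(total_size - n_seed, len(random_pool)))
        return picked  # duplicate (content-hash) filtering omitted for brevity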

After your training set has been generated, access and begin reviewing the documents by clicking the blue numbers. If the “Generating” status on a newly created training set does not automatically change after the task is complete, refresh your page.

Training sets are created for each specific model and will only be displayed within the “Training sets” section of the associated predictive coding model page. Their review progress will only take into account whether documents have been reviewed according to the associated model’s “Reviewed” criteria.

Legacy Training Sets 

Please note that training sets created before the July 28, 2023 release, referred to as “legacy training sets” and indicated by a red asterisk, are shared across all predictive coding models in the same project. This means that any legacy training sets created within a model will also be displayed in the Training Sets section of any other models in the same project. Legacy training sets do not contain duplicate documents, but they may contain ineligible documents (i.e., those with insufficient text), holdout set documents, and documents excluded from the model. As a reminder, ineligible and excluded documents are not used to train the model or evaluate its performance and do not receive model prediction scores. Although reviewed holdout set documents receive prediction scores and are used to evaluate the model’s performance, they are not used to train the model.

Return to table of contents

Improving Document Coverage

Coverage is a measure of how well the features of a document (e.g., words, metadata) are represented in the documents that the model has been trained on. For example, let’s imagine a model that has been trained on many documents containing the words “secret,” “meeting,” and “payoff.” When the model sees a new document containing those words, it will be able to make a more accurate prediction about the document than a model that has never been trained on documents containing those words.

To improve your model’s predictions, you can review documents that are not covered well (i.e., documents with features the model has not encountered during training). There are two places on your model’s page where you can find poorly covered documents. First, documents that are not well covered (i.e., those with a coverage score of 20% or below) are listed under Action Items at the top of your model’s page. Click Review to open them in the results table.
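As a simplified illustration of what a coverage score captures (the platform's actual calculation is more sophisticated and is not shown here), consider the following Python sketch:

    def coverage_score(doc_terms, training_vocabulary):
        """Fraction of a document's distinct terms also seen in training (simplified)."""
        doc_terms = set(doc_terms)
        if not doc_terms:
            return 0.0
        return len(doc_terms & training_vocabulary) / len(doc_terms)

    # Toy example: vocabulary seen during training vs. two new documents
    training_vocabulary = {"secret", "meeting", "payoff", "schedule"}
    docs = {
        "doc_a": ["secret", "meeting", "payoff"],                         # fully covered
        "doc_b": ["quarterly", "forecast", "meeting", "budget", "memo"],  # 20% covered
    }
    needs_review = [name for name, terms in docs.items()
                    if coverage_score(terms, training_vocabulary) <= 0.20]
    # needs_review == ['doc_b']; reviewing it teaches the model the new terms.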

Second, you can see the coverage scores for all of your project’s documents at the bottom of your model’s page under Coverage (in the Training section), and select documents with low coverage scores (denoted by the y-axis) for review.

As you review more documents with low coverage scores, your model will make better predictions.  

Return to table of contents

Tracking Your Model’s Performance

Your model’s performance metrics are its recall, precision, and F1 scores:

  • Recall: estimates what percentage of the relevant documents in your project your model is finding;

  • Precision: estimates what percentage of the documents your model predicts to be relevant are actually relevant; and

  • F1: combines recall and precision into a single score (the harmonic mean of the two).
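For reference, here is a minimal Python sketch of the standard formulas behind these three scores; the numbers in the example are invented for illustration:

    def precision_recall_f1(true_positives, false_positives, false_negatives):
        """Standard formulas:
        precision = TP / (TP + FP): of the documents predicted relevant,
                                    the share that are actually relevant
        recall    = TP / (TP + FN): of the actually relevant documents,
                                    the share the model found
        F1        = harmonic mean of precision and recall
        """
        precision = true_positives / (true_positives + false_positives)
        recall = true_positives / (true_positives + false_negatives)
        f1 = 2 * precision * recall / (precision + recall)
        return precision, recall, f1

    # Invented example: 80 relevant documents found, 20 false positives, 40 missed
    print(precision_recall_f1(80, 20, 40))  # approximately (0.8, 0.667, 0.727)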

You can view your model’s performance metrics under Performance in the Results section.

As your model makes predictions, its performance metrics can give you a sense of the model’s confidence in its results. If you notice that the model’s performance is poor on one or multiple dimensions, you can take steps to help improve your model.

If the model’s precision score is low (i.e., it is returning many false positives), there are two things you can do to improve precision:

1) Review documents that are predicted relevant, but have not yet been reviewed. If the documents that have been predicted relevant are not actually relevant, you can train the model by reviewing them as irrelevant. Find these documents under Prioritize in the Action Items section of your model’s page.

2) QA documents that were originally reviewed as relevant. If the model is incorrectly predicting many documents to be relevant, it is possible that the original input (i.e., training) was incorrect. To see these documents, simply do a search for your model’s “reviewed” criteria.
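Purely as an illustration of these two review pools (the real workflow uses the platform's search interface, and the field names below are hypothetical), the documents involved can be thought of like this:

    def precision_review_pools(documents):
        """The two precision-focused pools (illustrative; field names are assumed).
        'reviewed_relevant' is True/False once reviewed, None if unreviewed."""
        # 1) Predicted relevant but not yet reviewed: review these first.
        predicted_but_unreviewed = [
            d for d in documents
            if d["predicted_relevant"] and d["reviewed_relevant"] is None
        ]
        # 2) Already reviewed as relevant: QA these for mistaken input.
        reviewed_relevant_to_qa = [
            d for d in documents if d["reviewed_relevant"] is True
        ]
        return predicted_but_unreviewed, reviewed_relevant_to_qa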

On the other hand, if the model’s recall score is low (i.e., it is missing many relevant documents in the project), there are also two things you can do:

1) Review documents that have not been reviewed, but were predicted irrelevant. If you find and review any documents that are relevant (despite being predicted otherwise), this will help correct the model and improve its predictions. You can find these documents by constructing a search that captures documents that (a) do NOT satisfy the model’s “reviewed” criteria and (b) have a prediction score at or just below the threshold where the model’s F1 score is maximized (and are thus categorized as irrelevant). A sketch of both of these searches appears after the next step.

2) QA documents that were originally reviewed as irrelevant. If the model is incorrectly predicting many documents to be irrelevant, it is possible that the original input (i.e., training) was incorrect. To QA these documents, run a search that captures documents that have been reviewed according to your model’s “reviewed” criteria, but were NOT reviewed as relevant (i.e., were reviewed irrelevant).
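Here is an illustrative sketch of these two recall-focused pools; the max-F1 threshold parameter and field names are hypothetical stand-ins for the criteria you would build in the search interface:

    def recall_review_pools(documents, max_f1_threshold):
        """The two recall-focused pools (illustrative; field names are assumed)."""
        # 1) Not yet reviewed and scored at or below the threshold where F1 is
        #    maximized, i.e. currently categorized as irrelevant.
        predicted_irrelevant_unreviewed = [
            d for d in documents
            if d["reviewed_relevant"] is None
            and d["prediction_score"] <= max_f1_threshold
        ]
        # 2) Already reviewed, but marked irrelevant: QA these for mistaken input.
        reviewed_irrelevant_to_qa = [
            d for d in documents if d["reviewed_relevant"] is False
        ]
        return predicted_irrelevant_unreviewed, reviewed_irrelevant_to_qa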

Please note that rigorous performance metrics will provide the most accurate measure of your model’s performance across all documents in your project. Therefore, even if your model’s basic performance metrics are good, it may be helpful to view the model’s rigorous performance metrics, as well.

Return to table of contents