Improving Your Predictive Coding Model's Performance

There are multiple ways to improve the accuracy of your predictive coding models. This article walks through several methods for improving model performance.

Weighting Terms

As you review documents, your predictive coding model learns which types of documents should be considered relevant or irrelevant. For example, default Rating models look at the documents that have been rated Hot or Warm in order to predict which other documents are likely to be relevant. When a document is rated Hot or Warm, the Rating model looks at and learns from the entire document. However, some documents contain portions of text that are more relevant to the model than others. To teach the model which parts of your documents are especially relevant, you can add them to the model as weighted terms.

To weight terms and teach your model which parts of a document are especially relevant, navigate to the text view of the relevant document. Then, make sure the Predictive Coding tab is added to your review window sidebar.

Once the Predictive Coding tab is active, select the relevant portions of text and choose which model(s) they are relevant to. You can dissociate text from a model by clicking the X next to the model’s name.

[Animated GIF: highlight_and_remove_pc.gif]

You can scroll through each model’s weighted selections from the Predictive Coding tab.

[Animated GIF: scroll_through_hits_new.gif]

When you weight text in a document and associate it with a model, all of the weighted text in the document is treated as a single relevant document for the purposes of training the model. Please note that the model will be trained on the individual terms (i.e., words) that are selected in the document, but the order of the selected terms (i.e., phrases) will not be considered during training.  
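To make this concrete, here is a minimal, purely illustrative Python sketch of how weighted selections might be reduced to individual terms. It is not the platform's actual implementation, and the tokenization shown is an assumption:

    import re
    from collections import Counter

    def selections_to_training_terms(selections):
        """Pool all weighted selections from one document into a single bag of
        individual terms; word order (phrases) is discarded."""
        terms = Counter()
        for selection in selections:
            # Assumed tokenization: lowercase and split on non-word characters.
            terms.update(re.findall(r"\w+", selection.lower()))
        return terms

    # Example: two highlighted passages from the same document
    weighted = selections_to_training_terms([
        "secret meeting about the payoff",
        "schedule the payoff before Friday",
    ])
    # The model would see term counts such as {'payoff': 2, 'the': 2, ...},
    # not the phrase "secret meeting about the payoff".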

You can view all of the weighted terms for a model under Weighted Terms in the Training section of the model’s page.

[Image: weighted_table.png]

You can click a row of weighted text to open the document, or dissociate the selection from the model by clicking the trash can icon. If the weighted terms are associated with any other models, those associations will remain.

Return to table of contents

Creating Training Sets

Anytime you review documents in your project according to your model’s review criteria (e.g., rating, responsiveness), you are teaching your model what types of documents are and are not relevant. However, training sets are a helpful tool for organizing documents for review and give you an easy way to intentionally train your model on different types of documents. Creating a training set pulls a random sample of unreviewed documents that have sufficient text and are unique (i.e., no duplicates) into a binder for your team to review. Reviewing documents from a training set helps the model learn whether various types of documents (e.g., emails from person X, documents containing the word “California,” chat messages to person Y) tend to be relevant to your matter or not. Otherwise, the model will only learn about the types of documents you frequently review in your typical review workflow, which might not be representative of all the documents in your project.
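As an illustration of the filtering and sampling described above (not the platform's actual logic), the following Python sketch uses hypothetical document fields such as 'reviewed', 'text', and 'content_hash':

    import random

    def build_training_set(documents, sample_size, seed=None):
        """Sample unreviewed, unique documents with sufficient text (illustrative)."""
        rng = random.Random(seed)
        seen_hashes = set()
        eligible = []
        for doc in documents:
            if doc["reviewed"]:
                continue  # already reviewed
            if len(doc["text"].split()) < 10:
                continue  # insufficient text (threshold is an assumption)
            if doc["content_hash"] in seen_hashes:
                continue  # duplicate
            seen_hashes.add(doc["content_hash"])
            eligible.append(doc)
        return rng.sample(eligible, min(sample_size, len(eligible)))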

To create a training set, navigate to “Training sets” under the Training section of your predictive coding model’s page and click the “+New Training Set” button.

Enter a name for your training set, and then select whether you want the documents in your training set to be randomly sampled from your project, or if you want the documents to come from a specific search.

Using randomly selected documents for your training set means your model is trained on documents that are more representative of your project as a whole. However, if there is a particular subset of documents you’d like to train your model on, you can seed your set from a specific search. This option may be helpful if there’s a search that you’re confident will return relevant documents; that way, you can make sure the model sees some relevant documents during its training. Even so, to prevent biased training, a portion of your training set will still be randomly sampled from your project.

Training sets will not contain duplicate documents, ineligible documents (i.e., those with insufficient text), holdout set documents, or documents excluded from the model. This ensures that every document reviewed from a training set will be used to train the model. Note that older prediction models may contain legacy training sets, indicated with a red asterisk; see the Legacy Training Sets section below to learn how legacy training sets were created differently.
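Continuing the illustrative sketch above, a search-seeded training set might be assembled roughly as follows. The 50/50 split between search hits and random documents, the field names, and the text-length threshold are all assumptions, not the platform's actual behavior:

    import random

    def build_seeded_training_set(search_hits, project_docs, total_size,
                                  excluded_ids, holdout_ids, seed=None):
        """Mix documents from a search with a random project sample (illustrative)."""
        rng = random.Random(seed)

        def eligible(doc):
            return (doc["id"] not in excluded_ids
                    and doc["id"] not in holdout_ids
                    and not doc["reviewed"]
                    and len(doc["text"].split()) >= 10)  # assumed text threshold

        seed_pool = [d for d in search_hits if eligible(d)]
        n_seed = min(len(seed_pool), total_size // 2)  # assumed 50/50 split
        picked = rng.sample(seed_pool, n_seed)

        picked_ids = {d["id"] for d in picked}
        random_pool = [d for d in project_docs
                       if eligible(d) and d["id"] not in picked_ids]
        picked += rng.sample(random_pool, min(total_size - n_seed, len(random_pool)))
        return picked  # duplicate (content-hash) filtering omitted for brevity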

After your training set has been generated, access and begin reviewing the documents by clicking the blue numbers. If the “Generating” status on a newly created training set does not automatically change after the task is complete, refresh your page.

Training sets are created for each specific model and will only be displayed within the “Training sets” section of the associated predictive coding model page. Their review progress will only take into account whether documents have been reviewed according to the associated model’s “Reviewed” criteria.

Legacy Training Sets 

Please note that training sets created before the July 28, 2023 release, referred to as “legacy training sets” and indicated by a red asterisk, are shared across all predictive coding models in the same project. This means that any legacy training sets created within a model will also be displayed in the Training Sets section of any other models in the same project. Legacy training sets do not contain duplicate documents, but they may contain ineligible documents (i.e., those with insufficient text), holdout set documents, and documents excluded from the model. As a reminder, ineligible and excluded documents are not used to train the model or evaluate its performance and do not receive model prediction scores. Although reviewed holdout set documents receive prediction scores and are used to evaluate the model’s performance, they are not used to train the model.

Return to table of contents

Improving Document Coverage

Coverage is a measure of how well the features of a document (e.g., words, metadata) are represented in the documents that the model has been trained on. For example, let’s imagine a model that has been trained on many documents containing the words “secret,” “meeting,” and “payoff.” When the model sees a new document containing those words, it will be able to make a more accurate prediction about the document than a model that has never been trained on documents containing those words.

To improve your model’s predictions, you can review documents that are not covered well (i.e., documents with features the model has not encountered during training). There are two places on your model’s page where you can find poorly covered documents. First, documents that are not well covered (i.e., those with a coverage score of 20% or below) are listed under Action Items at the top of your model’s page. Click Review to open them in the results table.
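As a simplified illustration of what a coverage score captures (the platform's actual calculation is more sophisticated and is not shown here), consider the following Python sketch:

    def coverage_score(doc_terms, training_vocabulary):
        """Fraction of a document's distinct terms also seen in training (simplified)."""
        doc_terms = set(doc_terms)
        if not doc_terms:
            return 0.0
        return len(doc_terms & training_vocabulary) / len(doc_terms)

    # Toy example: vocabulary seen during training vs. two new documents
    training_vocabulary = {"secret", "meeting", "payoff", "schedule"}
    docs = {
        "doc_a": ["secret", "meeting", "payoff"],                         # fully covered
        "doc_b": ["quarterly", "forecast", "meeting", "budget", "memo"],  # 20% covered
    }
    needs_review = [name for name, terms in docs.items()
                    if coverage_score(terms, training_vocabulary) <= 0.20]
    # needs_review == ['doc_b']; reviewing it teaches the model the new terms.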

Second, you can see the coverage scores for all of your project’s documents at the bottom of your model’s page under Coverage (in the Training section), and select documents with low coverage scores (denoted by the y-axis) for review.

As you review more documents with low coverage scores, your model will make better predictions.  

Return to table of contents

Tracking Your Model’s Performance

Your model’s performance metrics are its recall, precision, and F1 scores:

  • Recall: estimates what percentage of the relevant documents in your project your model is finding;

  • Precision: estimates what percentage of the documents your model predicts to be relevant are actually relevant; and

  • F1: combines recall and precision into a single score (the harmonic mean of the two).
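For reference, here is a minimal Python sketch of the standard formulas behind these three scores; the numbers in the example are invented for illustration:

    def precision_recall_f1(true_positives, false_positives, false_negatives):
        """Standard formulas:
        precision = TP / (TP + FP): of the documents predicted relevant,
                                    the share that are actually relevant
        recall    = TP / (TP + FN): of the actually relevant documents,
                                    the share the model found
        F1        = harmonic mean of precision and recall
        """
        precision = true_positives / (true_positives + false_positives)
        recall = true_positives / (true_positives + false_negatives)
        f1 = 2 * precision * recall / (precision + recall)
        return precision, recall, f1

    # Invented example: 80 relevant documents found, 20 false positives, 40 missed
    print(precision_recall_f1(80, 20, 40))  # approximately (0.8, 0.667, 0.727)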

You can view your model’s performance metrics under Performance in the Results section.

As your model makes predictions, its performance metrics can give you a sense of the model’s confidence in its results. If you notice that the model’s performance is poor on one or multiple dimensions, you can take steps to help improve your model.

If the model’s precision score is low (i.e., it is returning many false positives), there are two things you can do to improve precision:

1) Review documents that are predicted relevant, but have not yet been reviewed. If the documents that have been predicted relevant are not actually relevant, you can train the model by reviewing them as irrelevant. Find these documents under Prioritize in the Action Items section of your model’s page.

2) QA documents that were originally reviewed as relevant. If the model is incorrectly predicting many documents to be relevant, it is possible that the original input (i.e., training) was incorrect. To see these documents, simply do a search for your model’s “reviewed” criteria.
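Purely as an illustration of these two review pools (the real workflow uses the platform's search interface, and the field names below are hypothetical), the documents involved can be thought of like this:

    def precision_review_pools(documents):
        """The two precision-focused pools (illustrative; field names are assumed).
        'reviewed_relevant' is True/False once reviewed, None if unreviewed."""
        # 1) Predicted relevant but not yet reviewed: review these first.
        predicted_but_unreviewed = [
            d for d in documents
            if d["predicted_relevant"] and d["reviewed_relevant"] is None
        ]
        # 2) Already reviewed as relevant: QA these for mistaken input.
        reviewed_relevant_to_qa = [
            d for d in documents if d["reviewed_relevant"] is True
        ]
        return predicted_but_unreviewed, reviewed_relevant_to_qa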

On the other hand, if the model’s recall score is low (i.e., it is missing many relevant documents in the project), there are also two things you can do:

1) Review documents that have not been reviewed, but were predicted irrelevant. If you find and review any documents that are relevant (despite being predicted otherwise), this will help correct the model and improve its predictions. You can find these documents by constructing a search that captures documents that (a) do NOT satisfy the model’s “reviewed” criteria and (b) have a prediction score at or just below the threshold where the model’s F1 score is maximized (and are thus categorized as irrelevant). A sketch of both of these searches appears after the next step.

2) QA documents that were originally reviewed as irrelevant. If the model is incorrectly predicting many documents to be irrelevant, it is possible that the original input (i.e., training) was incorrect. To QA these documents, run a search that captures documents that have been reviewed according to your model’s “reviewed” criteria, but were NOT reviewed as relevant (i.e., were reviewed irrelevant).
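Here is an illustrative sketch of these two recall-focused pools; the max-F1 threshold parameter and field names are hypothetical stand-ins for the criteria you would build in the search interface:

    def recall_review_pools(documents, max_f1_threshold):
        """The two recall-focused pools (illustrative; field names are assumed)."""
        # 1) Not yet reviewed and scored at or below the threshold where F1 is
        #    maximized, i.e. currently categorized as irrelevant.
        predicted_irrelevant_unreviewed = [
            d for d in documents
            if d["reviewed_relevant"] is None
            and d["prediction_score"] <= max_f1_threshold
        ]
        # 2) Already reviewed, but marked irrelevant: QA these for mistaken input.
        reviewed_irrelevant_to_qa = [
            d for d in documents if d["reviewed_relevant"] is False
        ]
        return predicted_irrelevant_unreviewed, reviewed_irrelevant_to_qa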

Please note that rigorous performance metrics will provide the most accurate measure of your model’s performance across all documents in your project. Therefore, even if your model’s basic performance metrics are good, it may be helpful to view the model’s rigorous performance metrics, as well.

Return to table of contents