There are several ways to improve the accuracy of your predictive coding models. This article walks through the following methods:
- Weighting Terms
- Creating Training Sets
- Improving Document Coverage
- Tracking Your Model’s Performance
Weighting Terms
As you review documents, your predictive coding model learns about which types of documents should be considered relevant or irrelevant. For example, default Rating models look at the documents that have been rated Hot or Warm in order to predict which other documents are likely to be relevant, as well. When a document is rated Hot or Warm, the Rating model looks at and learns from the entire document. However, some documents contain portions of text that are more relevant to the model than other portions. To teach the model which parts of your documents are especially relevant, you can add them as weighted terms to the model.
To weight terms and teach your model which parts of a document are especially relevant, navigate to the text view of the relevant document. Then, make sure the Predictive Coding tab is added to your review window sidebar.
Once the Predictive Coding tab is active, select the relevant portions of text and choose which model(s) they are relevant to. You can dissociate text from a model by clicking the X next to the model’s name.
You can scroll through each model’s weighted selections from the Predictive Coding tab.
When you weight text in a document and associate it with a model, all of the weighted text in the document is treated as a single relevant document for the purposes of training the model. Please note that the model will be trained on the individual terms (i.e., words) that are selected in the document, but the order of the selected terms (i.e., phrases) will not be considered during training.
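To illustrate why the order of selected terms does not affect training, here is a minimal Python sketch of a "bag of words" representation. It only illustrates the general idea: the tokenizer and the function name `bag_of_words` are assumptions made for this example, not the product's actual implementation.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count each lowercase word in the text, ignoring word order."""
    # Hypothetical tokenizer: runs of letters, digits, and apostrophes.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


# Two weighted selections with the same words in a different order.
selection_a = "secret meeting about the payoff"
selection_b = "payoff about the secret meeting"

# Both selections produce identical term counts, so a model trained on
# individual terms cannot tell them apart.
same_terms = bag_of_words(selection_a) == bag_of_words(selection_b)
print(same_terms)
```

In practice, this means that selecting a phrase teaches the model about each word in the phrase, not about the phrase as a unit.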
You can view all of the weighted terms for a model under Weighted Terms in the Training section of the model’s page.
You can click on the rows of weighted text to open the document, as well as dissociate the selection from the model by clicking the trash can icon. If the weighted terms are associated with any other models, they will remain associated with them.
Creating Training Sets
Any time you review documents in your project according to your model’s review criteria (e.g., rating, responsiveness), you are teaching your model what types of documents are and are not relevant. As an additional tool to help your model learn about different types of documents, training sets create binders of randomly sampled documents for your team to review. Reviewing a random set of documents helps the model learn whether various types of documents (e.g., emails from person X, documents containing the word “California,” chat messages to person Y) tend to be relevant to your matter. Otherwise, the model will only learn about the types of documents you frequently encounter in your typical review workflow, which might not be representative of all the documents in your project.
To create a training set, navigate to the Training section of your predictive coding model. At the bottom are your project’s training sets.
Then, click “Add a training set.” Enter a name for your training set, and then select whether you want the documents in your training set to be randomly sampled from your project, or if you want the documents to come from a specific search.
By using randomly selected documents for your training set, your model will be trained on documents that are more representative of your project as a whole. However, if there is a particular subset of documents you’d like to train your model on, you can seed your set from a specific search. This option may be helpful if there’s a search that you’re confident will return relevant documents. That way, you can make sure the model sees some relevant documents during its training. However, to prevent biased training of your model, a subset of your training set will still come from a randomly sampled subset of your project.
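The idea of seeding a training set from a search while still including randomly sampled documents can be sketched as follows. This is a simplified illustration, not the product's sampling logic: the function `build_training_set` and the `random_fraction` parameter are hypothetical, and the actual share of randomly sampled documents is not stated in this article.

```python
import random


def build_training_set(all_doc_ids, search_hit_ids, size, random_fraction=0.5, seed=None):
    """Mix documents from a seed search with a random sample of the project.

    random_fraction is a hypothetical parameter: the share of the set drawn
    at random from the whole project rather than from the search.
    """
    rng = random.Random(seed)
    n_random = round(size * random_fraction)
    n_seeded = size - n_random

    # Documents drawn from the seed search.
    hits = list(search_hit_ids)
    seeded = rng.sample(hits, min(n_seeded, len(hits)))

    # Fill the remainder with a random sample of documents not already chosen.
    chosen = set(seeded)
    remaining = [d for d in all_doc_ids if d not in chosen]
    n_fill = min(size - len(seeded), len(remaining))
    return seeded + rng.sample(remaining, n_fill)


# Example: a 100-document project where documents 0-9 match the seed search.
training_set = build_training_set(range(100), range(10), size=20, seed=1)
print(len(training_set))
```

Setting `random_fraction` to 1.0 in this sketch corresponds to a fully random training set.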
Once you’ve created your training set, its status will be “pending” until you refresh your page. From there, you can open up the training set by clicking the blue number. You can then begin reviewing the documents, or assign them out.
Please note that training sets are shared across all predictive coding models in a project; the Training Sets section of each model will therefore display all training sets in the project, even those you did not create from within that model. Additionally, note that training set review progress only takes into account whether the documents have received a rating.
Finally, as previously mentioned, reviewing documents via your normal review workflow (e.g., running searches, reviewing documents in assignments) also trains your model; training sets are simply another tool for organizing documents for review.
Improving Document Coverage
Coverage is a measure of how well the features of a document (e.g., words, metadata) are represented in the documents that the model has been trained on. For example, let’s imagine a model that has been trained on many documents containing the words “secret,” “meeting,” and “payoff.” When the model sees a new document containing those words, it will be able to make a more accurate prediction about the document than a model that has never been trained on documents containing those words.
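As a rough illustration of the concept, coverage can be thought of as the share of a document's features that the model has already seen. The sketch below uses a deliberately simplified, hypothetical definition (the fraction of a document's distinct words that appear in the training data); the product's actual coverage formula is not described here and likely differs.

```python
def coverage_score(document_terms, trained_terms):
    """Fraction of a document's distinct terms seen during training."""
    terms = set(document_terms)
    if not terms:
        return 0.0
    return len(terms & set(trained_terms)) / len(terms)


# Terms the hypothetical model has encountered in reviewed documents.
trained = {"secret", "meeting", "payoff"}

well_covered = coverage_score(["secret", "meeting", "payoff"], trained)
poorly_covered = coverage_score(["quarterly", "invoice", "meeting", "schedule"], trained)
print(well_covered, poorly_covered)
```

Under this simplified definition, the second document scores low because most of its words are new to the model, which is why reviewing it would teach the model more.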
To improve your model’s predictions, you can review documents that are not covered well (i.e., documents with features the model has not encountered during training). There are two places where you can find poorly covered documents on your model’s page. First, you can find documents that are not well covered (i.e., that have a coverage score below 80%) under Action Items at the top of your model’s page. Click Review to open them in the results table.
You can see the coverage scores for all of your project’s documents at the bottom of your model’s page under Coverage (in the Training section). You can select documents with low coverage scores (denoted by the y-axis) for review.
As you review more documents with low coverage scores, your model will make better predictions.
Tracking Your Model’s Performance
Your model’s performance statistics measure the recall, precision, and F1 scores of your model. These scores estimate: what percentage of the relevant documents in your project your model is finding (recall); what percentage of the documents your model predicts to be relevant are actually relevant (precision); and how well the model balances recall and precision, expressed as their harmonic mean (F1). You can view your model’s performance statistics under Performance in the Results section.
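For readers who want to see how these three statistics relate, here is a short Python sketch using the standard definitions of precision, recall, and F1. The function name and the example counts are made up for illustration; the product estimates these values for you.

```python
def performance_stats(true_positives, false_positives, false_negatives):
    """Return (precision, recall, f1) from document counts.

    true_positives:  predicted relevant and actually relevant
    false_positives: predicted relevant but actually irrelevant
    false_negatives: predicted irrelevant but actually relevant
    """
    predicted_relevant = true_positives + false_positives
    actually_relevant = true_positives + false_negatives

    precision = true_positives / predicted_relevant if predicted_relevant else 0.0
    recall = true_positives / actually_relevant if actually_relevant else 0.0

    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 80 correct hits, 20 false alarms, 40 missed documents.
precision, recall, f1 = performance_stats(80, 20, 40)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, so a model cannot score well by excelling on only one of the two.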
As your model makes predictions, its performance statistics can give you a sense of how reliable those predictions are. If you notice that the model’s performance is poor on one or more dimensions, you can take steps to help improve your model.
If the model’s precision score is low (i.e., it is returning many false positives), you can do the following things to improve precision:
1) Review documents that are predicted relevant, but have not yet been reviewed. If the documents that have been predicted relevant are not actually relevant, you can train the model by reviewing them as irrelevant. Find these documents under Prioritize in the Action Items section of your model’s page.
2) QA documents that were originally reviewed as relevant. If the model is incorrectly predicting many documents to be relevant, it is possible that the original input (i.e., training) was incorrect. To see these documents, simply do a search for your model’s “reviewed” criteria.
On the other hand, if the model’s recall score is low (i.e., it is missing many relevant documents in the project), there are also two things you can do:
1) Review documents that have not been reviewed, but were predicted irrelevant. If you find and review any documents that are relevant (despite being predicted otherwise), this will help correct the model so that it makes better predictions. You can find these documents by constructing a search that captures documents that 1) do NOT satisfy the model’s “reviewed” criteria and 2) have a prediction score near, but below, the cutoff at which the model’s F1 score peaks (and are thus categorized as irrelevant).
2) QA documents that were originally reviewed as irrelevant. If the model is incorrectly predicting many documents to be irrelevant, it is possible that the original input (i.e., training) was incorrect. To QA these documents, run a search that captures documents that have been reviewed according to your model’s “reviewed” criteria, but were NOT reviewed relevant (i.e., were reviewed irrelevant).
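The logic of these two searches can be sketched as simple filters over a list of documents. This is only a conceptual illustration in Python, not the platform's search syntax: the `Document` fields, the 0-100 score scale, and the `cutoff` value are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    doc_id: int
    reviewed_relevant: Optional[bool]  # None means not yet reviewed
    prediction_score: float            # hypothetical 0-100 scale


def unreviewed_predicted_irrelevant(docs, cutoff):
    """Search 1: not reviewed, and scored below the peak-F1 cutoff.

    Results are sorted so documents closest to the cutoff come first.
    """
    hits = [d for d in docs if d.reviewed_relevant is None and d.prediction_score < cutoff]
    return sorted(hits, key=lambda d: d.prediction_score, reverse=True)


def reviewed_irrelevant(docs):
    """Search 2: reviewed, but NOT reviewed as relevant."""
    return [d for d in docs if d.reviewed_relevant is False]


docs = [
    Document(1, None, 55.0),
    Document(2, None, 72.0),
    Document(3, False, 10.0),
    Document(4, True, 90.0),
    Document(5, None, 30.0),
]

# Assume, for illustration, that the model's F1 score peaks at a cutoff of 60.
to_review = [d.doc_id for d in unreviewed_predicted_irrelevant(docs, cutoff=60.0)]
to_qa = [d.doc_id for d in reviewed_irrelevant(docs)]
print(to_review, to_qa)
```

Sorting the first result set by descending score surfaces the borderline documents first, which are the ones most likely to be relevant despite the model's prediction.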
Please note that rigorous performance statistics will provide the most accurate measure of your model’s performance across all documents in your project. Therefore, even if your model’s basic performance statistics are good, it may be helpful to view the model’s rigorous performance statistics, as well.