Leveraging Predictive Coding for Prioritization and Quality Control – Knowledge Base

Creating a predictive coding model early will enable you to leverage its learning throughout the entire document review process. This article will outline how to leverage predictive coding for simple document prioritization and quality control. These internal workflows increase review efficiency by prioritizing the most relevant documents sooner and automating quality control workflows.

Building a predictive coding model to prioritize review is a step towards more advanced workflows to determine the relevancy of documents without subjecting the entire document set to manual review. By leveraging their model’s prediction scores and performance metrics, attorneys can use predictive coding to inform their assessment of the remaining review burden and/or decision to terminate document review, replacing the need for exhaustive manual document review.

To begin to prioritize documents or conduct quality control, you will need at least one predictive coding model in your project that has generated prediction scores. You can create a predictive model based on any document or review attributes in Everlaw. We recommend building a model whose relevance criteria relate to one of your ultimate document review goals, such as identifying privilege, responsiveness, or relevance. You can learn more about setting up your predictive coding model by reading our beginner’s guide, walking through how to create and edit a predictive coding model, or attending a live training session.

In Everlaw, a predictive coding model will start to generate prediction scores after 200 qualified documents have been reviewed with at least 50 relevant and 50 irrelevant pursuant to the model’s criteria. Learn more about kicking off your predictive coding model here.
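The activation rule above can be expressed as a simple check. This is a minimal sketch for illustration only, not Everlaw's implementation: the function name, parameter names, and constants are hypothetical, and the counts are assumed to have been tallied elsewhere.

```python
# Thresholds taken from the paragraph above; all names are illustrative.
MIN_REVIEWED = 200
MIN_RELEVANT = 50
MIN_IRRELEVANT = 50


def can_generate_predictions(reviewed: int, relevant: int, irrelevant: int) -> bool:
    """Return True once the review counts meet all three minimums."""
    return (
        reviewed >= MIN_REVIEWED
        and relevant >= MIN_RELEVANT
        and irrelevant >= MIN_IRRELEVANT
    )


# 200 reviewed with 50 relevant and 150 irrelevant meets every minimum.
print(can_generate_predictions(reviewed=200, relevant=50, irrelevant=150))  # True
# 300 reviewed is not enough if only 40 of them are relevant.
print(can_generate_predictions(reviewed=300, relevant=40, irrelevant=260))  # False
```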

Once your model has begun to generate predictions, you can begin to incorporate predictive coding into your document review with the simple workflows outlined below.

Note: To create predictive coding models in your project, you will need to have Admin or Create permission on prediction models. To view predictive coding prediction scores or access the model page, you will need to have at least Receive permission.

Document Prioritization

Building a predictive coding model will enable you to utilize its learnings to prioritize documents for document review. There are several ways in which you can access documents predicted to be relevant and prioritize accordingly.

From the Predictive Coding model

From your predictive coding model, you can find and select documents to prioritize with the action items panel and the distribution graph.

Using the Action Items Panel

At the top of your predictive coding model page you’ll find Action Items. These items will help you utilize predictive coding results to prioritize documents for document review.

Screen_Shot_2022-01-25_at_11.26.20_AM.png

💡 Tip: Your model will update every 24 to 48 hours. Each update includes all documents reviewed up to the moment of the update, even if you previously selected the “Update Now” button. When your model updates, your action items update as well. For this reason, we recommend checking your predictive coding model page regularly while managing document review.

Once your model is active, you will see an action item to Prioritize. Prioritized documents are documents that have not yet been reviewed by your team that your model predicts to be relevant based on how you have reviewed other documents.

Screen_Shot_2022-01-25_at_11.27.36_AM.png

Select Review to see a results table of your recommended prioritized documents. From the results table, you can sort documents by prediction score and prioritize accordingly.

From your distribution graph

The distribution graph on your predictive coding model displays the documents the model should evaluate along a scale of predicted relevance. This set of documents is considered “reviewed” for the purposes of the model and is determined when creating a predictive coding model.

On the distribution graph you will notice a green, movable flag. By moving this green flag, you can set your own prediction threshold for reviewing your documents. For example, should you want to review documents that are predicted to be highly relevant as defined by your model’s criteria, you should slide the green flag further to the right. In the example below, the prediction score threshold has been set to 41.

In the top right corner of the distribution graph are the Reviewed and Unreviewed toggles. Select only the Unreviewed toggle to display the prediction score distribution of documents that have not yet been reviewed. From there, you can prioritize documents for review by selecting the blue number to the right of the prediction threshold. This will bring you to a results table of the documents that fall above the prediction threshold that you’ve selected.

💡 Tip: Learn more about the distribution graph in this Knowledge Base article on interpreting your predictive coding model results.

Prediction scores correspond to likelihoods of relevance, not degrees of relevance. A document with a prediction score of 100 has a higher predicted likelihood of being relevant than a document with a prediction score of 85, but it is not predicted to be more relevant than the lower-scoring document. You can use your performance metrics to better understand how your model is performing and to decide where to place your prediction threshold.
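Conceptually, selecting the Unreviewed toggle and a prediction threshold amounts to a filter followed by a sort. The sketch below illustrates that idea on hypothetical in-memory data; the `Doc` class and the function name are not part of Everlaw, and treating the threshold as inclusive ("at or above") is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    score: int      # prediction score from the selected model
    reviewed: bool  # True if already reviewed under the model's criteria


def unreviewed_above_threshold(docs, threshold):
    """Unreviewed documents scoring at or above the threshold, highest first."""
    hits = [d for d in docs if not d.reviewed and d.score >= threshold]
    return sorted(hits, key=lambda d: d.score, reverse=True)


docs = [
    Doc("A", 95, False),
    Doc("B", 41, False),
    Doc("C", 40, False),  # below the threshold
    Doc("D", 88, True),   # already reviewed, so excluded
    Doc("E", 60, False),
]
for d in unreviewed_above_threshold(docs, threshold=41):
    print(d.doc_id, d.score)
```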

From the Results Table

From any results table, you can view a document’s prediction score for any model in your project that has begun to generate predictions. To do so, add a column for Prediction Score: from the Results Table toolbar, select View, then Add or Remove column. You’ll find all of your models listed under the Prediction Columns category. Select the model most relevant to the document review for this set of documents. Once this column has been added to your results table, you can sort by prediction score by modifying the table sort. This will enable you to view your documents by predicted relevance and prioritize document review.

Screen_Shot_2022-01-25_at_11.37.41_AM.png

💡 Tip: You can prioritize document review using predictive coding within your existing document review workflows. If you are a Project Administrator, you can leverage the above workflow by setting a Results Table view that incorporates predictive coding and instructing reviewers to sort their document sets by predicted relevance.

From the Search Builder

Predictive Coding can also be leveraged whenever you are building a search. From the Search Builder, you can use the “Predicted” search term to build a search with predictive coding in mind. When selecting the “Predicted” term, you’ll be prompted to select the relevant predictive coding model and set your preferred range of prediction scores on a scale of 1 to 100. For example, the search shown below is for documents that the selected model (“Responsiveness”) has determined to have a prediction score of 80 or higher. In other words, this is a search for documents the selected model has predicted to be relevant as defined by the model, in this case, responsive.

Screen_Shot_2022-01-25_at_11.39.01_AM.png

By using the “Predicted” search term in combination with any other search term, predictive coding results can be leveraged for prioritization whenever you are building a search.

Using Clustering

Should clustering be enabled on your project, you can leverage clustering and predictive coding together to strengthen predictions and save time in review. Understanding where highly relevant or irrelevant documents are clustered can help you build a better understanding of your document set and can act as another tool to help prioritize certain sets of documents. To do so, select “Color documents by” followed by a specific prediction model. Once you make your selection, a legend will appear assigning a color to each predicted relevance range. Zoom and pan across the page to see which clusters of documents are currently predicted to be most likely to be relevant. Use the document selector to select and prioritize these documents for review.

Screen_Shot_2022-01-25_at_11.49.44_AM.png

Updating Prediction Scores

All of the above workflows rely on the prediction scores that models generate. As you continue to review documents and your model updates every 24 to 48 hours, the prediction scores of all documents will continue to update based on your review decisions. Given this, when using one of the above methods to prioritize documents for assignment, we recommend assigning small batches of documents to reviewers. We also recommend refreshing any searches built on prediction scores as the model updates (about once every 48 hours).

For this reason, building out dynamic assignment groups may be a more efficient way to prioritize documents when using Predictive Coding to consistently build and manage your document review.

Building Dynamic Assignment Groups with Predictive Coding

Your predictive coding model can also be leveraged when managing document review. This is done by creating dynamic assignment groups optimized for review prioritization. By structuring a dynamic assignment group as explained below, you can find relevant documents sooner while limiting the review of irrelevant documents. To effectively use this workflow, you will need to have at least one model that has generated prediction scores.

💡 Tip: When creating dynamic assignment groups that leverage predictive coding, we recommend sorting documents by prediction score and assigning out documents in smaller batches. This will ensure that each new assignment pulled from the unassigned pool contains documents with the highest prediction scores based on the most recent review decisions/model update.

In the workflow outlined below, you will create an assignment from the Results Table instead of from the Assignments page. This is done to preserve the sort order of prediction scores in the newly created dynamic group. Learn more about creating assignment groups here.

Step 1: Build a search to determine your assignment group inclusion criteria

As you would when creating any other assignment group, you must first establish your inclusion criteria, which determine the documents that will be included in your assignment group. For this assignment group, build a search for unreviewed documents within a certain range of prediction scores from a selected model.

Build this search by selecting the “Predicted” search term and choosing the predictive coding model that you are using for document prioritization. Next, select the prediction score range for the documents you would like to review.

When creating this dynamic assignment in the earlier stages of document review, we recommend setting a broader prediction score range, such as 50 and above. Sorting documents by prediction score ensures that reviewers see the documents most likely to be relevant first, while the lower bound in the inclusion criteria ensures that reviewers see a broad range of documents early, improving your model’s coverage and performance.

However, if you are further along in your document review, we recommend setting a range that is based on your F1 score and/or is higher, such as 80 and above. You can learn more about interpreting your distribution graph by reading our Knowledge Base article on interpreting your predictive coding results.

Note: The inclusion criteria does not update to the new F1 score if your F1 score changes. For example, if your F1 score changes from 80 to 85 after a model update, the inclusion criteria will remain at 80.

💡 Tip: If you have a large group of reviewable documents, it is recommended that you set your prediction score range lower so as to be more inclusive. For example, if you set your prediction range as 80 and above but have not reviewed many documents, there may be some documents with a predicted relevance between 60-79 that could be relevant but will not be assigned out for review. Getting a broader range of documents in front of reviewers early on in document review is key to improving your document coverage and consequently improving your model’s performance. As more documents are reviewed, your model’s predictions will improve. Therefore, as your team reviews more documents or if you are creating the dynamic assignment mid document review, you can use your F1 score to set your prediction threshold and/or set a higher prediction score range.
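For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall at a given prediction threshold. The sketch below computes it from hypothetical reviewed documents and sweeps candidate thresholds to find where F1 is highest. How Everlaw computes or displays F1 is not described in this article, so the function names, the sample data, and the reading of "set a range based on your F1 score" as "use the threshold where F1 peaks" are all assumptions made for illustration.

```python
def precision_recall_f1(scored_docs, threshold):
    """scored_docs: list of (score, is_relevant) pairs for reviewed documents."""
    tp = sum(1 for s, rel in scored_docs if s >= threshold and rel)
    fp = sum(1 for s, rel in scored_docs if s >= threshold and not rel)
    fn = sum(1 for s, rel in scored_docs if s < threshold and rel)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def best_f1_threshold(scored_docs, candidates=range(0, 101)):
    """Candidate threshold with the highest F1 (the lowest one wins ties)."""
    return max(candidates, key=lambda t: precision_recall_f1(scored_docs, t)[2])


# Hypothetical reviewed documents: (prediction score, coded relevant?)
reviewed = [
    (95, True), (90, True), (85, False), (70, True),
    (60, False), (40, False), (30, True), (10, False),
]
print(precision_recall_f1(reviewed, threshold=80))
print(best_f1_threshold(reviewed))
```

A threshold chosen this way is a snapshot: as the Note above explains, a saved search range does not follow the metric when the model updates.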

Finally, add the “Coded” search term, click the search term to turn it into “Not Coded,” and select the relevant code category that you will be reviewing for. This will ensure that you do not assign out documents that have already been reviewed. Below is an example inclusion criteria using scores from a predictive coding model based on responsiveness:

Screen_Shot_2022-01-25_at_11.50.40_AM.png

This inclusion criteria captures all the documents that have a prediction score of 50 or higher from the “Responsiveness Model” and have not been coded under the code category of “Responsiveness.” Once you’ve built your search, select the “Search” button to be taken to the results table.
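The logic of this example search can be sketched as a predicate over documents: a prediction score in range for the chosen model, and no code applied in the chosen category. The `Document` class, its fields, and the function below are hypothetical stand-ins for illustration, not Everlaw's data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    scores: dict = field(default_factory=dict)  # model name -> prediction score
    codes: dict = field(default_factory=dict)   # code category -> applied code


def meets_inclusion_criteria(
    doc,
    model="Responsiveness Model",
    category="Responsiveness",
    min_score=50,
):
    """Predicted at or above min_score by the model AND not coded in the category."""
    score = doc.scores.get(model)
    if score is None:  # the model has not scored this document
        return False
    return score >= min_score and category not in doc.codes


docs = [
    Document("1", {"Responsiveness Model": 72}),
    Document("2", {"Responsiveness Model": 72}, {"Responsiveness": "Responsive"}),
    Document("3", {"Responsiveness Model": 49}),
]
print([d.doc_id for d in docs if meets_inclusion_criteria(d)])
```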

Step 2: Sort your inclusion criteria in the Results Table by Prediction Score

Next, sort the documents in the results table by prediction score for the selected model. Sorting your inclusion criteria by prediction score ensures that each time documents are assigned out for review, documents with high prediction scores (more likely to be relevant) will be assigned first.

From the Results Table toolbar, select View then Add or Remove column. You’ll find all of your models listed under the Prediction Columns category. Select the model most relevant to the document review for this set of documents.

In the example below, we select the same model identified in our inclusion criteria (“Prediction Model: Responsive Model”).

Screen_Shot_2022-01-25_at_11.37.41_AM.png

Next, sort the newly added Prediction Column so that prediction scores are listed in descending order. To do so, click the caret sort icon (triangles pointing up and down) for the Prediction Column and select “Single-column sort (desc)” from the drop-down menu.

Screen_Shot_2022-01-25_at_12.04.26_PM.png

Step 3: Create a new assignment group from the Results Table

From your sorted results table, you can create your dynamic assignment group. First, select the “Batch” icon from the results table toolbar and choose “Assign” from the drop-down menu.

Screen_Shot_2022-01-25_at_12.05.35_PM.png

Once you click “Assign” in the drop-down menu, the assignments wizard will open and you can enter a new assignment group name. Next, you will be prompted to set the review criteria or conditions that the documents must meet to be considered reviewed. In the example below, the review criteria has been set as coded under the category for “Responsiveness,” which contains the codes “Responsive” and “Unresponsive.”

Screen_Shot_2022-01-25_at_12.06.03_PM.png

After selecting your reviewers, you will be prompted to configure the assignment. For this workflow, you should split the assignment group based on documents and select “Dynamic” as the type of assignment group.

💡 Tip: When creating this dynamic assignment, you have the option to set a default review window layout for your document reviewers. This enables you to tailor review tools to the needs of your reviewers, making document review more efficient. When reviewers first open their assignment from their assignment card, they will see the review window that you have set.

When allocating document batches, the default selection evenly splits all of the documents between the assigned reviewers. However, for this workflow, you should decrease the number of documents assigned to each reviewer to smaller increments (100-200 documents) so that more documents will remain in the unassigned pool.

Assigning smaller sets of documents keeps the majority of documents in the unassigned pool. As reviewers code documents and the model updates, the prediction scores of the documents in the unassigned pool update as well. This ensures that each new batch assigned from the assignment group is drawn using prediction scores from the most recent model update. In other words, reviewers are assigned the documents currently predicted most likely to be relevant, which limits the review of documents predicted to be irrelevant.
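The effect of small batches can be illustrated with a toy simulation. In this sketch, each batch is taken from the unassigned pool using whatever scores are current at that moment, so a model update between pulls changes which documents go out next. The function, the pool, and the scores are hypothetical and do not reflect how Everlaw allocates documents internally.

```python
def pull_batch(pool, scores, batch_size):
    """Remove and return the highest-scoring document ids from the pool.

    pool: set of unassigned document ids (modified in place)
    scores: dict mapping document id -> latest prediction score
    """
    # Highest score first; ties broken by id so the order is deterministic.
    ranked = sorted(pool, key=lambda doc_id: (-scores[doc_id], doc_id))
    batch = ranked[:batch_size]
    pool.difference_update(batch)
    return batch


pool = {"A", "B", "C", "D", "E"}
scores = {"A": 90, "B": 80, "C": 70, "D": 60, "E": 50}

print(pull_batch(pool, scores, batch_size=2))  # ['A', 'B']

# Simulated model update: review decisions changed two scores.
scores.update({"E": 95, "C": 40})

print(pull_batch(pool, scores, batch_size=2))  # ['E', 'D']
```

Had all five documents been assigned up front, document E would have been reviewed last despite its updated score, which is the situation the smaller batch size avoids.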

Once you have completed your dynamic assignment group setup, select “Assign.” You will receive a notification once the document group has been created and/or the documents have been successfully assigned. The appropriate cards will be added to the home page and shared with your reviewers via the Message Center.

💡 Tip: Setting smaller batches of documents may mean that you will want to allow reviewers to self-assign documents from the unassigned pool. In order to do so, you should check the self-assign box and select a smaller batch size for self-assignment increments.

Using Predictive Coding to Quality Control your Document Review

Predictive coding learning can also be leveraged within your existing document review workflows to conduct quality control on your document review.

From your predictive coding model page

On your predictive coding model page, your Action Items panel will prompt you to Assess Conflict. This action item points to any conflicts the model has detected. Conflicts are documents that the model has predicted to be relevant but that your review team has coded as irrelevant under your model’s criteria, or vice versa.

Selecting Review will take you to a results table of the documents. Assessing these conflicts will allow you to conduct basic quality control on your document review and improve the model’s performance. You should check back in on this action periodically as you conduct document review. As your model continues to learn and documents are reviewed by your team, your conflicts will update.

Screen_Shot_2022-01-25_at_12.06.44_PM.png
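The notion of a conflict can be sketched as a comparison between a document's prediction and its coding. The cutoff of 50 used here to decide "predicted relevant" is an assumption made for illustration, as are the function name and the tuple layout; the article does not state how the model page draws that line.

```python
def find_conflicts(docs, threshold=50):
    """Split coded documents into the two kinds of conflict.

    docs: iterable of (doc_id, score, coded_relevant) tuples, where
    coded_relevant is True, False, or None if the document is not yet coded.
    Returns (predicted relevant but coded irrelevant,
             predicted irrelevant but coded relevant).
    """
    relevant_coded_irrelevant = []
    irrelevant_coded_relevant = []
    for doc_id, score, coded_relevant in docs:
        if coded_relevant is None:
            continue  # uncoded documents cannot conflict with the model
        predicted_relevant = score >= threshold
        if predicted_relevant and not coded_relevant:
            relevant_coded_irrelevant.append(doc_id)
        elif not predicted_relevant and coded_relevant:
            irrelevant_coded_relevant.append(doc_id)
    return relevant_coded_irrelevant, irrelevant_coded_relevant


docs = [
    ("A", 90, False),  # conflict: predicted relevant, coded irrelevant
    ("B", 10, True),   # conflict: predicted irrelevant, coded relevant
    ("C", 80, True),   # agreement
    ("D", 20, False),  # agreement
    ("E", 55, None),   # not yet coded
]
print(find_conflicts(docs))  # (['A'], ['B'])
```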

From the Clustering Page

Color coding individual documents by their predicted relevance from the clustering toolbar also allows you to conduct quality control on certain sets of documents given their prediction scores. To color code your documents, select “Color documents by” in the toolbar, followed by the prediction model you’d like to view results for. You can use the predictive coding overlay to identify potentially uncoded or incorrectly coded documents.

Predicted relevance inconsistencies within clusters may merit a closer look. You can select a subsection of a cluster, or select specific documents across multiple clusters, by using “document selection mode” in the toolbar. Click and drag to select the section of interest, and open a results table of those documents to see why the outliers are coded differently.
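The visual check described above has a rough programmatic analogue: within each cluster, flag documents whose prediction score sits far from the cluster's typical score. The sketch below uses the cluster median and a fixed margin, both of which are arbitrary choices made for illustration; the function name and the data layout are hypothetical, and nothing here reflects how Everlaw's clustering page works internally.

```python
from collections import defaultdict
from statistics import median


def cluster_outliers(docs, margin=40):
    """Flag documents whose score is more than `margin` from their cluster median.

    docs: iterable of (doc_id, cluster_id, score) tuples.
    Returns a dict mapping cluster id -> list of flagged document ids.
    """
    by_cluster = defaultdict(list)
    for doc_id, cluster_id, score in docs:
        by_cluster[cluster_id].append((doc_id, score))

    outliers = {}
    for cluster_id, members in by_cluster.items():
        typical = median(score for _, score in members)
        flagged = [doc_id for doc_id, score in members if abs(score - typical) > margin]
        if flagged:
            outliers[cluster_id] = flagged
    return outliers


docs = [
    ("A", "c1", 90), ("B", "c1", 85), ("C", "c1", 88), ("D", "c1", 20),
    ("E", "c2", 10), ("F", "c2", 15),
]
print(cluster_outliers(docs))  # {'c1': ['D']}
```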

Using Linked Assignments

If you would like to build quality control into the review decisions made on assigned documents, you can create and link another dynamic assignment group to act as a second-pass review. To learn more, read our Knowledge Base article on linking assignments.

For example, you may want to review documents predicted relevant but reviewed irrelevant, or predicted irrelevant but reviewed relevant. To do so, create a dynamic assignment group whose inclusion criteria capture those documents.

Below is an example of inclusion criteria created to review documents that were coded responsive but were predicted by our model to be non-responsive:

Screen_Shot_2022-01-25_at_12.07.39_PM.png