"More Options" in Search: Duplicates, Sampling, Grouping, and Removal



Introduction

The More Options link allows you to create powerful searches, particularly when searching by families. Within a logical container, you can click More Options to include duplicates, take a random sample of your search, group by families, email threads, exact duplicates, or versions, or filter grouped documents. You can explore example use cases at the end of this article.

Click More Options in the bottom right corner of a logical operator. A dialog box will appear, where you can choose to:

  • Include or exclude duplicates
  • Take a random sample of your search
  • Group your search by families, email threads, exact duplicates, or versions
  • Remove documents from a grouped search

 moreoptions.gif

Once you make your selections, click Save. Resulting selections will appear as chevrons below the search container to which they were applied. You can read your search options from right to left.

Every search follows the same order of operations:

Negation →  Duplicates → Sampling → Grouping → Filtering
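
The fixed order can be pictured as a small pipeline. The following is a minimal sketch, not Everlaw's implementation: the stage names come from the order above, while `run_container` and the toy stage functions are hypothetical names introduced only for illustration. It shows that the stages run in the fixed order no matter which order they were selected in.

```python
from typing import Callable, Dict, List, Tuple

# Stage order as stated in the article.
STAGE_ORDER = ["negation", "duplicates", "sampling", "grouping", "filtering"]

Stage = Callable[[List[str]], List[str]]


def run_container(docs: List[str], selected: Dict[str, Stage]) -> Tuple[List[str], List[str]]:
    """Apply the selected stages in the fixed order (hypothetical helper)."""
    applied = []
    for name in STAGE_ORDER:
        stage = selected.get(name)
        if stage is not None:
            docs = stage(docs)
            applied.append(name)
    return docs, applied


# Toy stand-ins: "sampling" keeps the first document, "grouping" pulls in an attachment.
selected = {
    "grouping": lambda docs: docs + ["att-1"],
    "sampling": lambda docs: docs[:1],
}
results, applied = run_container(["email-1", "email-2"], selected)
# applied == ["sampling", "grouping"]; results == ["email-1", "att-1"]
```

Although grouping was listed first in `selected`, sampling still runs before it, which mirrors how the chevrons describe a fixed sequence rather than the order in which you clicked.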

To understand how to use the NOT term and negate searches, see this article.

The More Options link will appear in the bottom right corner of each logical container. Any individual option can be applied to multiple logical containers within a search, with the exception of the Duplicates option, which can only be applied to the outermost container. (See Duplicates section for more detail.)  Here is an example below to illustrate:

2018-06-11_12-59-23.png

In this search, emails from “Steven J Kean” are grouped by attachments with the children removed (and parent emails remaining). The outermost container excludes 4,000 duplicates from the overall search. Because duplicate inclusion or exclusion can only be applied to the outermost container, duplicates are removed after the inner search terms and logic are applied.

Each selection in More Options can be applied only once to any given logical operator. Everlaw will display document grouping in your results table according to what is specified on the outermost logical container. For instance, in the example below, the documents have been grouped by email threads, and then by exact duplicates.

email_thread_then_dupes.png

All four documents in the screenshot are part of the same email thread, and appear consecutively in the list of search results. However, the documents are not actually grouped by thread. Instead, each document is grouped with its exact duplicates. Although grouping by email thread determined the order of the documents in the results table, grouping by duplicates determined which grouping structure the documents would display.

A good way to double-check the logic of your search is via the instant search preview. The grey bar will display the order of operations conducted in your search.

search_preview.png

The rest of this article will describe the various options that you can apply to your search. 


Duplicates

To understand the definition of duplicates on the Everlaw system, please visit this article.

You can display duplicate documents among search results by selecting Include Duplicates in the More Options dialog box. Depending on your project, search deduplication may be turned off by default. If this is the case, the option to exclude or include duplicates will not appear for you. Contact support@everlaw.com or your project administrator if you would like search deduplication to be turned on for your project.

Note that if you choose to exclude duplicates through this toggle, you will omit any document that is marked as a duplicate of any other document on your project. It will not deduplicate solely within the context of your search results. To learn how to deduplicate documents within the context of your search, see “Remove children” in the Removal section.

Note that the option to include or exclude duplicates, if permitted on your project, will only be available on the outermost container. In other words, you will not be able to exclude duplicates from a portion of your search and include them for another portion.

As an exception to the duplicate rule noted above, documents matching your search criteria that have been coded with any code, have a note applied to them, and/or have a hot or warm rating will be included in your search results, regardless of duplicate status. In other words, a duplicate document with any of the three characteristics described above will show up in the search results without needing to explicitly include duplicates in the More Options menu.  


Sampling

You can choose to see only a randomly sampled subset of your search results for any given search. Click the More Options link and in the dialog box, choose a percentage of documents to sample. This sampling will apply to all documents captured by the conditions set in the logical container.

search59.png

You can apply sampling to multiple logical containers.

double_sample.png

Document sampling operations will always be applied before grouping or filtering decisions, in keeping with the search order of operations. In other words, if you choose to sample your documents and also group them by email thread, your documents will be sampled before they are organized into email threads. This prevents partial email threads from appearing in your results table.
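
To see why sampling first keeps threads whole, here is a minimal sketch under stated assumptions. The toy `THREAD_OF` mapping and the `sample` and `group_by_thread` helpers are hypothetical, and the seeded random draw is only a stand-in; the article does not describe how Everlaw actually selects the sample.

```python
import random

# Hypothetical toy data: document id -> email thread id.
THREAD_OF = {
    "e1": "t1", "e2": "t1", "e3": "t1",
    "e4": "t2", "e5": "t2",
    "e6": "t3",
}


def sample(hits, percent, seed=0):
    """Draw a random subset of the hits (illustrative stand-in for sampling)."""
    rng = random.Random(seed)
    k = max(1, round(len(hits) * percent / 100))
    return sorted(rng.sample(sorted(hits), k))


def group_by_thread(docs):
    """Pull in every document that shares a thread with one of the given docs."""
    threads = {THREAD_OF[d] for d in docs}
    return sorted(d for d, t in THREAD_OF.items() if t in threads)


hits = ["e1", "e4", "e6"]          # documents matching the search terms
sampled = sample(hits, 50)         # sampling runs first, on the hits only
results = group_by_thread(sampled) # grouping then expands each sampled hit
```

Because grouping runs after sampling, every thread that appears in `results` appears in full. Sampling after grouping would instead risk dropping individual emails from the middle of a thread.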

Here are a couple of examples of situations in which sampling may be useful for your team:

  • Triaging Review: Let’s say you receive thousands of documents from a particular custodian. You can use the sample feature to take a randomly sampled subset of the documents from the custodian set, say 10%. If a review of the sampled documents shows that only 5% of the sample is relevant, you may then decide you’d like to allocate your resources elsewhere. Alternatively, the sampled documents could show that 60% of the sample is relevant, and you may want to further review that custodian.
  • Training the predictive coding engine: Training the prediction engine with randomly sampled subsets of documents may help improve the precision and recall of the generated predictions.
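
The triage arithmetic can be worked through in a few lines. This is an illustrative sketch only: the document counts are made up, `estimate_relevance` is a hypothetical helper, and the interval uses the standard normal approximation for a proportion, which ignores the finite size of the custodian set and is rough for very small samples.

```python
import math


def estimate_relevance(sample_size, relevant_in_sample, population_size, z=1.96):
    """Project a sample's relevance rate onto the full set (z=1.96 is about 95%)."""
    p = relevant_in_sample / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    low, high = max(0.0, p - margin), min(1.0, p + margin)
    return {
        "rate": p,
        "rate_interval": (low, high),
        "projected_relevant": round(p * population_size),
    }


# Assumed numbers: 20,000 custodian documents, a 10% sample of 2,000,
# of which reviewers marked 100 (5%) as relevant.
estimate = estimate_relevance(sample_size=2000, relevant_in_sample=100, population_size=20000)
```

With these assumed numbers the projected count is about 1,000 relevant documents in the full set, and the interval gives a sense of how much that estimate could move with a different random sample.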

 


Grouping

For an introduction to the concept of grouping, please see this article.

On Everlaw, document groups include attachment groups, email threads, exact duplicate families, and versions. Below is a definition of each grouping type:

  • Attachments: Documents in an attachment family. Includes the parent document, often an email, and its attachments.
  • Email Threads: Emails that comprise an email thread, including replies, reply all, and forwarded emails.  Grouping by email thread will also include attachments.
    • Note: you cannot group by email thread and then filter out the parent documents. This prevents attachments from being displayed without their associated email parents.
  • Exact Duplicates: Duplicate versions of the document. A complete definition of duplicates is in this article.
  • Versions: Versions of the same document (produced and pre-produced, translated and untranslated, etc.)

To group your search, click More Options. In the dialog box, select one of the above grouping options. When selected, options under Remove from Group will appear. To learn about removing hits from grouped searches, see the section below. When you are happy with your grouping selection, click Save. Your search results will now include related documents as a result of grouping.

The search below is grouped by attachments. It returns all emails containing the phrase “confidential information,” in addition to those emails’ attachments.

group_attachment.gif
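
Conceptually, grouping expands each search hit into its whole family. The sketch below models that with hypothetical toy data (`FAMILIES`) and a hypothetical `group_by_attachments` helper; it is not Everlaw's implementation.

```python
# Hypothetical toy data: attachment family id -> members, parent first.
FAMILIES = {
    "fam1": ["email-1", "att-1a", "att-1b"],
    "fam2": ["email-2"],             # an email with no attachments
    "fam3": ["email-3", "att-3a"],
}


def group_by_attachments(hits):
    """Return every member of each family that contains at least one hit."""
    hit_set = set(hits)
    results = []
    for members in FAMILIES.values():
        if hit_set & set(members):
            results.extend(members)
    return results


hits = ["email-1", "email-2"]        # emails matching the search phrase
grouped = group_by_attachments(hits)
# grouped == ["email-1", "att-1a", "att-1b", "email-2"]
```

Note that `email-2` stays in the results even though it has no attachments: grouping adds related documents but does not drop hits that have no group.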

 


Remove from Group

You can remove hits from your grouped search in the More Options dialog box. When you select a grouping option, additional filtering options will appear. The filter you select affects how the documents are represented in the results table. There are five options for grouping removal: None (Keep all), Remove Parent, Remove Children, Remove Search Hits, and Remove Grouped Non-Hits. A sixth option, Remove Non-Inclusive Emails, is available only on complete projects.

grouping.png

Below is a description of each type of filter.

None (Keep all)

This keeps all original search hits and grouped documents. Documents which were pulled in by grouping will have the same status as direct search hits in the results table. All documents will be represented in black non-italicized text and will be included when navigating to the next document.

Remove Parent

This filter removes the parent of every document group in the results. If you apply it to the outermost logical operator of your search, the results table will still include the parent documents. This preserves the family relations and prevents confusion about which documents belong to which family. However, the parent documents will be greyed out in the results table, meaning that when you navigate through your list of results via the review window, it will skip over the parents. The parents will also not be affected by any export, batch modify, or production actions.

parentless.png

Note: The Remove Parent filter will be disabled when grouping by Email Threads so that attachments are not displayed without their associated email parents.

Remove Children

This filter removes every child document from each document group in the result set. In the results table, there will be no grouped structure or parent documents. All documents will be represented in black non-italicized text and will be included when navigating to the next document.

group_attachment_parents_only.png

Removing children is an easy strategy you can use to manually deduplicate a search. This option is preferable to the toggle that allows you to "include or exclude duplicates". This is because the toggle will exclude any document that is a duplicate of any other document on the project, not just documents that are duplicates of other search results. To deduplicate within your search results, group your documents by exact duplicates, and select the option to remove children.

 exact_dupes_children__1_.png

This will yield a list of search results that contains one copy of every document recalled by your search criteria.
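
The difference between the two approaches can be sketched as follows. Everything here is an assumption made for illustration: the `Doc` fields, the project-level `marked_duplicate` flag, and the rule of keeping the first copy encountered are hypothetical, and the article does not say which copy Everlaw treats as the parent of an exact duplicate group.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    content_hash: str       # stand-in for whatever defines an exact duplicate
    custodian: str
    marked_duplicate: bool  # assumed project-level duplicate flag


PROJECT = [
    Doc("d1", "h1", "alice", False),
    Doc("d2", "h1", "bob", True),   # duplicate of d1
    Doc("d3", "h1", "bob", True),   # duplicate of d1
    Doc("d4", "h2", "bob", False),
]


def search(custodian):
    return [d for d in PROJECT if d.custodian == custodian]


def exclude_duplicates_toggle(hits):
    """Drop anything flagged as a duplicate anywhere on the project."""
    return [d for d in hits if not d.marked_duplicate]


def dedupe_within_search(hits):
    """Keep one copy per exact-duplicate group among the hits (first one seen)."""
    seen, kept = set(), []
    for d in hits:
        if d.content_hash not in seen:
            seen.add(d.content_hash)
            kept.append(d)
    return kept


hits = search("bob")
toggle_ids = [d.doc_id for d in exclude_duplicates_toggle(hits)]
within_ids = [d.doc_id for d in dedupe_within_search(hits)]
# toggle_ids == ["d4"]; within_ids == ["d2", "d4"]
```

In this toy project the toggle loses the `h1` content entirely for the custodian search, because the only unflagged copy belongs to a different custodian. Deduplicating within the search keeps one copy of it.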

Remove Search Hits

This filter removes any documents, after grouping, that match the search criteria of the corresponding logical container. In the example below, the search returns all documents that are in the same attachment group as a document containing “fraud” but that do not contain the word “fraud” themselves.

grouped_searchhits.png

One application of this functionality is to search for document groups coded inconsistently for responsiveness. The search below would find attachment families and remove any documents in the family already coded as responsive. The result would be the remaining documents in the family that are not coded responsive.

Remove Grouped Non-Hits

This filter removes any documents that were included as a result of grouping but that do not meet the rest of the search criteria specified in the logical container. In other words, it keeps only the documents that actually match your search criteria. The search below would group documents containing the word “oil,” keeping only the documents (parents or children) that contain the word “oil.”

grouped_nonhits.png
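
The five filters described above can be compared side by side with a small model. This is a hedged sketch rather than Everlaw's logic: `FAMILIES`, `REMOVE_IF`, and `apply_filter` are hypothetical names, and the model ignores display details such as removed parents still being shown greyed out in the results table.

```python
# Hypothetical toy data: each attachment family has one parent and its children.
FAMILIES = [
    {"parent": "email-1", "children": ["att-1a", "att-1b"]},
    {"parent": "email-2", "children": ["att-2a"]},
]

# For each filter: should a grouped document be removed?
REMOVE_IF = {
    "none": lambda is_parent, is_hit: False,
    "remove_parent": lambda is_parent, is_hit: is_parent,
    "remove_children": lambda is_parent, is_hit: not is_parent,
    "remove_search_hits": lambda is_parent, is_hit: is_hit,
    "remove_grouped_non_hits": lambda is_parent, is_hit: not is_hit,
}


def apply_filter(hits, mode):
    """Group hits by attachment family, then apply one removal filter."""
    hit_set = set(hits)
    results = []
    for fam in FAMILIES:
        members = [fam["parent"], *fam["children"]]
        if not hit_set & set(members):
            continue  # grouping only pulls in families that contain a hit
        for doc in members:
            if not REMOVE_IF[mode](doc == fam["parent"], doc in hit_set):
                results.append(doc)
    return results


hits = ["att-1a"]  # e.g. the one attachment that contains the search term
by_mode = {mode: apply_filter(hits, mode) for mode in REMOVE_IF}
```

With a single hit on `att-1a`, Remove Search Hits leaves `email-1` and `att-1b` (the family members that did not match), while Remove Grouped Non-Hits leaves only `att-1a`.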

Remove Non-Inclusive Emails [for complete projects only]

This filtering option is only visible for complete projects, and you can only select it when grouping by email threads. Using this filter will keep inclusive emails (and associated attachments) that match the original search criteria. It will also keep inclusive emails and associated attachments that were pulled in when grouping that logical term by email threads. Finally, the filter will keep all documents matching the original search criteria that are neither emails nor email attachments.

The example below searches for all emails that contain the words “smoking” and “gun” within 20 words of each other, removing the non-inclusive ones.

noninclusive_emails.png

An inclusive email is the most “complete” email within a branch of an email thread. It contains all content for the entire email thread. As a result, it is often the last email in the branch, and all previous emails should appear in the body of the document.


Example use cases

Pre-production QA

You are interested in doing QA on your responsive documents before producing them. You’d like to see which documents, including their family members, are not coded as privileged.

preprod_qa.png

The above search identifies documents marked for production that have not been coded for privilege. This lets you check for coding inconsistencies before running a production and assign the affected documents for review.

Ignoring a set of documents to be reviewed elsewhere

Let’s imagine, for example, that you’re conducting two stages of review of foreign language documents.

This search looks for documents not in Italian (also excluding their attachments). Among the remaining documents, it searches for documents containing the phrase “high priority.”

italian_docs.png

Running a search for documents with attachments

You might need to search for only the documents that have attachments, or that belong to some other type of group. Grouping your search (by attachments, for example) will also include documents that do not belong to a document group. To return only the documents with a group, remove the parents.

emails_w_attachments.png

The above search returns only emails that include attachments.

Note that in your results table, parent emails will appear in purple italicized text to illustrate that they are part of the family. However, when navigating from document to document in the review window, the italicized documents will be skipped over.

Grouping by multiple parameters 

You may wish to group first by one parameter and then another, say, email threading then exact duplicates. If you wanted to build a search that incorporated both of those grouping parameters, the resulting search would be formatted like this:

double_grouping.png

Exclude duplicates while keeping at least one copy of every document

To deduplicate within your search results, group your documents by exact duplicates, and select the option to remove children.

 exact_dupes_children__1_.png

This will yield a list of search results that contains one copy of every document recalled by your search criteria.

