Advanced Content Searches (Wildcard, Proximity, Fuzzy, Regular Expression)

 


Intro

You can conduct wildcard, fuzzy, and proximity searches on Everlaw. Content search terms also support regular expressions. This article covers the search syntax for each of these advanced content searches.

Wildcard Searches

Everlaw supports single- and multiple-character wildcard searches for content in documents or metadata (e.g., file path or custodian): “?” matches a single character, and “*” matches one or more characters. Below are some examples of how to construct wildcard searches.

To search for words starting with certain characters, append “?” or “*” to the end of the word, e.g.,

  • Rela?  will find words such as relax and relay
  • Rela*  will find words such as relax, relay, relaxing, relate, and related 

To search for words starting and ending with specified characters, use “?” or “*” in the middle of the word, e.g.,

  • re?t  will find words such as rent and rest
  • re*t  will find words such as rent, rest, receipt and relevant 

To search for words ending with specified characters, prepend “.?” or “.*” and wrap the term in forward slashes (“//”), e.g.,

  • /.?oat/  will find words such as boat and goat
  • /.*oat/  will find words such as boat, goat, throat and float 

To search for words with specified characters in the middle, prepend and append “.?” or “.*”, and wrap the term in forward slashes (“//”), e.g.,

  • /.?oa.?/  will find words such as load and loan
  • /.*oa.*/  will find words such as load, loan, coats and floating 
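As a rough local illustration of the patterns above, the Python sketch below translates “?” and “*” into regular expressions and checks terms against a small word list. The helper names (wildcard_to_regex, compile_term, matching_words) and the word lists are hypothetical. The sketch assumes that matching is case-insensitive, that a term must match a whole word, and that “*” stands for one or more characters; it is not Everlaw's search engine.

```python
import re


def wildcard_to_regex(term):
    """Translate a wildcard term into a compiled regular expression.

    Assumption: "?" is exactly one character and "*" is one or more.
    """
    parts = []
    for ch in term:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".+")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def compile_term(term):
    """Treat /.../ terms as regular expressions, everything else as wildcards."""
    if len(term) >= 2 and term.startswith("/") and term.endswith("/"):
        return re.compile(term[1:-1], re.IGNORECASE)
    return wildcard_to_regex(term)


def matching_words(term, words):
    """Return the words that the term matches in full."""
    pattern = compile_term(term)
    return [w for w in words if pattern.fullmatch(w)]


rel_words = ["relax", "relay", "relaxing", "relate", "related"]
oat_words = ["boat", "goat", "throat", "float"]

print(matching_words("Rela?", rel_words))    # ['relax', 'relay']
print(matching_words("/.?oat/", oat_words))  # ['boat', 'goat']
print(matching_words("/.*oat/", oat_words))  # all four words
```

Note that in regular expression syntax “.?” also matches zero characters, so under these assumptions /.?oat/ would match the bare word oat as well.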

Fuzzy Searches

Everlaw supports fuzzy searches, which find words similar in spelling to your search term. To run a fuzzy search, add the tilde symbol "~" to the end of a single-word term. Fuzzy searches are a good way to find documents with possible misspellings of words or names.

For example, to search for a term similar in spelling to "rise" use the fuzzy search: rise~. This search will find terms like "risk" and "rises".

An optional parameter specifies the required similarity threshold. Its value must be between 0 and 1, exclusive. A value closer to 1 requires higher similarity: rise~0.8

If no value is specified, the default is 0.5.
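To make the threshold idea concrete, here is a minimal Python sketch of one way a similarity score could be computed. This article does not say how Everlaw scores similarity, so the formula used here (1 minus the edit distance divided by the length of the shorter word) is purely an assumption for illustration, and the function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Assumed scoring: 1 - distance / length of the shorter word."""
    if not a or not b:
        return 1.0 if a == b else 0.0
    return 1 - edit_distance(a, b) / min(len(a), len(b))


def fuzzy_matches(term, words, threshold=0.5):
    """Return the words whose assumed similarity meets the threshold."""
    return [w for w in words if similarity(term, w) >= threshold]


words = ["rise", "risk", "rises", "sunrise"]
print(fuzzy_matches("rise", words))        # default threshold of 0.5
print(fuzzy_matches("rise", words, 0.8))   # stricter threshold
```

Under this assumed formula, "risk" and "rises" each score 0.75 against "rise", so they pass the default threshold of 0.5 but not a threshold of 0.8.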

Proximity Searches

Everlaw supports finding words that are within a specific distance of each other. To perform a proximity search between two words, first enclose the two words in quotation marks. Second, add the tilde symbol "~" immediately after the closing quotation mark. Third, specify a word distance.

For example, let's say we want to search for the word cumulative and the word assessment within 3 words of each other (i.e., with no more than 3 words between them) in a document. The search would be represented like this:

"cumulative assessment"~3

We have put the two words (cumulative and assessment) within quotation marks, added a tilde (~), then specified the word distance (3). This will search for all instances where cumulative and assessment have, at most, 3 words between them. The search matches the words in either order, meaning that assessment can come before cumulative in the document even though it is listed after it in the search.
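The distance rule described above can be sketched in Python as follows. The function name within_distance and the simple word-splitting rule are assumptions made for illustration; Everlaw's own tokenization may differ.

```python
import re


def within_distance(text, word_a, word_b, max_between):
    """True if word_a and word_b occur with at most max_between words
    between them, in either order."""
    tokens = re.findall(r"\w+", text.lower())
    pos_a = [i for i, t in enumerate(tokens) if t == word_a.lower()]
    pos_b = [i for i, t in enumerate(tokens) if t == word_b.lower()]
    # abs(i - j) - 1 is the number of words strictly between the two positions
    return any(abs(i - j) - 1 <= max_between
               for i in pos_a for j in pos_b if i != j)


near = "The assessment showed a large cumulative effect"
far = "The assessment showed a very large cumulative effect"

print(within_distance(near, "cumulative", "assessment", 3))  # True: 3 words between
print(within_distance(far, "cumulative", "assessment", 3))   # False: 4 words between
```

In both sample sentences, assessment comes before cumulative, which shows that the order of the words in the search does not matter.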

You can also do proximity searches with phrases. Please note that, in addition to being contained in quotation marks, phrases in proximity searches must be surrounded by parentheses. “cookie (“chocolate chip”)”~20 is a correctly formatted search, while “cookie “chocolate chip””~20 is not. Some additional examples are below:

  • “jelly (“peanut butter”)”~30
    • This search retrieves results for jelly within 30 words of “peanut butter.”
  • “(sandwich* cook*) (jelly “peanut butter”)”~30
    • This search can be read as: sandwich* OR cook* within 30 words of jelly OR “peanut butter”. It will retrieve results for any or all of the following:
      • sandwich* within 30 words of jelly (“sandwich* jelly”~30)
      • cook* within 30 words of jelly (“cook* jelly”~30)
      • sandwich* within 30 words of “peanut butter” (“sandwich* (“peanut butter”)”~30)
      • cook* within 30 words of “peanut butter” (“cook* (“peanut butter”)”~30)

In Everlaw's query builder, this search would look like this:

proximity_search_2.png

This will yield the same results as a search that looks like this:

proximity_search_1.png

  • “sandwich* cook* (jelly “peanut butter”)”~30
    • This search requires all three clauses (sandwich*, cook*, and (jelly OR “peanut butter”)) to appear together within 30 extra words at most. The search retrieves documents for which all of the following are true:
      • sandwich* is within 30 words of jelly OR “peanut butter”
      • cook* is within 30 words of jelly OR “peanut butter”
      • sandwich* is within 30 words of cook*
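The way the grouped search “(sandwich* cook*) (jelly “peanut butter”)”~30 breaks down into four pairwise searches can be sketched in Python. The function names are hypothetical, and the sketch only builds query strings following the formatting rule stated above (phrases take quotation marks and parentheses); it does not run a search. It uses straight quotation marks.

```python
from itertools import product


def format_clause(clause):
    """Wrap multi-word phrases in quotation marks and parentheses."""
    return f'("{clause}")' if " " in clause else clause


def expand_groups(group_a, group_b, distance):
    """List every pairwise proximity search implied by two OR-groups."""
    return [f'"{format_clause(a)} {format_clause(b)}"~{distance}'
            for a, b in product(group_a, group_b)]


for query in expand_groups(["sandwich*", "cook*"], ["jelly", "peanut butter"], 30):
    print(query)
# "sandwich* jelly"~30
# "sandwich* ("peanut butter")"~30
# "cook* jelly"~30
# "cook* ("peanut butter")"~30
```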

You can also perform nested proximity searches. For example, the search "sandwich ("ham cheese"~10)"~20 will look for the word "sandwich" within 20 words of every instance where "ham" and "cheese" occur within 10 words of each other. 

Regular Expressions

Regular expressions allow you to search for text strings that match a certain pattern of characters. You can use regular expressions in Everlaw search to retrieve common patterns of personally identifying information, such as Social Security or credit card numbers, without having to know the numbers themselves.

Below are some examples of regular expression searches you can use in Everlaw:

Social Security numbers:

"/[0-9]{3}/ /[0-9]{2}/ /[0-9]{4}/" OR "xxx xx /[0-9]{4}/"

Credit Card numbers:

/[0-9]{4}/ /[0-9]{4}/ /[0-9]{4}/ /[0-9]{2,4}/

Phone numbers:

/[0-9]{3}/ /[0-9]{3}/ /[0-9]{4}/

Email addresses:

/[-A-Za-z0-9._%+]+[@][-A-Za-z0-9.]+[.][A-Za-z]{2,4}/

The use of regular expressions isn't limited to numeric contexts. For example, you can use regular expressions to search for words whose spelling might vary by one letter:

  • /[bcg]oat/  will find boat, coat, or goat. Any one of b, c, or g could occupy the first position in the text string.
  • /gr[ae]y/ will find grey or gray. Either a or e could occupy the third position in the text string.
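Before running patterns like these on a large document set, it can help to try them on sample strings. The Python sketch below copies the expressions from this section, without the surrounding slashes, and checks them with the standard re module. It assumes that each slash-delimited expression must match one whole word, which the examples here suggest but this article does not state outright. The helper name words_match and the sample values are hypothetical.

```python
import re

EMAIL = re.compile(r"[-A-Za-z0-9._%+]+[@][-A-Za-z0-9.]+[.][A-Za-z]{2,4}")
SSN_PARTS = [r"[0-9]{3}", r"[0-9]{2}", r"[0-9]{4}"]
PHONE_PARTS = [r"[0-9]{3}", r"[0-9]{3}", r"[0-9]{4}"]


def words_match(text, parts):
    """True if each whitespace-separated word fully matches its pattern."""
    words = text.split()
    return len(words) == len(parts) and all(
        re.fullmatch(p, w) for p, w in zip(parts, words))


print(words_match("123 45 6789", SSN_PARTS))     # True
print(words_match("123 456 7890", SSN_PARTS))    # False: middle group has 3 digits
print(words_match("555 867 5309", PHONE_PARTS))  # True
print(bool(EMAIL.fullmatch("jane.doe@example.com")))  # True

candidates = ["boat", "coat", "goat", "moat"]
print([w for w in candidates if re.fullmatch(r"[bcg]oat", w)])  # moat is excluded
```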

Limiting your search based on context


You can also search for a certain word or phrase while excluding specific contexts in which the word or phrase may occur. To do this, use NOT within your content search. For example, if you want to do a wildcard search for “proto*” but don’t want your search to return any variant of “protocol”, you can build a search that looks like this:

proto_NOT.png

You can do the same for phrase searches. For example, if you want emails with the phrase “secret meeting” in them, but do not want emails returned where “secret meeting” only appears within the phrase “I have no interest in a secret meeting”, you can build a search that looks like this:

secret_meeting_NOT_no_secret_meeting.png

 

Limiting your search based on context also works for regular expression searches. For example, if you are looking for documents that contain the year 2017 in their contents, but you do not want your results to yield instances of “2017” that are part of phone numbers, you could construct a search that looks like this:


2017_NOT_phone_number.png
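The intent of the 2017 example can be sketched in Python: a document counts as a hit only if "2017" appears somewhere outside a phone-number pattern. This is a local illustration under assumed tokenization rules, with a hypothetical function name; it does not reproduce the query shown in the screenshot or Everlaw's matching logic.

```python
import re


def has_year_outside_phone_number(text, year="2017"):
    """True if the year appears at least once outside a 3-3-4 digit sequence."""
    tokens = re.findall(r"[0-9A-Za-z]+", text)
    in_phone = set()
    for i in range(len(tokens) - 2):
        a, b, c = tokens[i:i + 3]
        if (re.fullmatch(r"[0-9]{3}", a)
                and re.fullmatch(r"[0-9]{3}", b)
                and re.fullmatch(r"[0-9]{4}", c)):
            # Mark all three positions as belonging to a phone number
            in_phone.update({i, i + 1, i + 2})
    return any(t == year and i not in in_phone
               for i, t in enumerate(tokens))


print(has_year_outside_phone_number("Call 555-867-2017 tomorrow"))        # False
print(has_year_outside_phone_number("Filed in 2017, call 555-867-2017"))  # True
```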

 
