Linguistic Modeling: A Secret Superpower to Boost Your eDiscovery Search

September 17, 2024

By: Amanda Jones

Summary: Linguistic modeling complements AI as a powerful tool for extracting meaning from vast amounts of text. Using these models, linguists can develop supercharged search that is faster and more accurate than other methods.

AI gets a lot of hype, but it isn’t the only way to quickly find meaning in large amounts of text. Many legal teams rely on traditional search terms, visualizations, and hot doc coding during document review, with some venturing into the use of AI for this purpose.

Where scale and speed are imperative, linguistic modeling offers extremely nuanced and targeted document searches. Pairing linguists with a tool built for linguistic search unlocks their full potential in eDiscovery.

As a legal linguist myself, I’ve been helping legal teams take advantage of linguistic models for well over 15 years.

Here’s a breakdown of what it is, when to utilize it, and how it complements, rather than competes with, modern AI and other solutions you use.

Linguistic modeling = special tech + linguistic specialists

Think of linguistic modeling as a supercharged form of search. But it isn’t merely an upgraded version of the keywords and methods that are used with common analytics tools such as dtSearch. It’s a different type of search altogether.

Linguistic modeling enables highly targeted document searches that find relevant documents and information much more accurately than conventional search and more quickly than eyes-on review.

Linguistic-based search has two components:

  • Linguistic experts who design complex combinations of searches
  • Technology that is purpose-built to implement those complex designs

These “complex combinations of searches” can include thousands of conceptual keywords, with a dizzying array of rules and conditions specifying not only what should be captured but also what should be avoided. The rules and conditions incorporated in linguistic models encode nuanced requirements regarding syntactic relationships, conversational context, data type, and more. In fact, a single complex search can do the work of hundreds of thousands of conventional keywords.  

This is where linguistic search gets its power: The more nuance we can build into your search, the more accurate and valuable your results will be.

You might have experienced this yourself when typing keywords into a search engine or prompts into a generative AI platform. You will get results even if you use basic keyword searches or general questions, but your results will be more focused, relevant, and actionable if you provide context and specific direction. That’s exactly what legal linguists provide when building a linguistic model.
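To make the idea of rules that specify both what to capture and what to avoid more concrete, here is a minimal sketch in Python. It is not the purpose-built technology described above, and it is far simpler than a real linguistic model. The `Rule` structure, the `matches` and `run_model` helpers, the example patterns, and the sample documents are all invented for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """A hypothetical rule: what to capture, what to avoid, and where to look."""
    name: str
    include: tuple          # regex patterns; any one signals the concept
    exclude: tuple = ()     # regex patterns; any one vetoes the hit
    doc_types: tuple = ()   # data types the rule applies to; empty means all


def matches(rule, text, doc_type):
    """Return True if the rule fires on this document."""
    if rule.doc_types and doc_type not in rule.doc_types:
        return False
    if any(re.search(p, text, re.IGNORECASE) for p in rule.exclude):
        return False
    return any(re.search(p, text, re.IGNORECASE) for p in rule.include)


def run_model(rules, docs):
    """Map each document ID to the names of the rules that fired on it."""
    hits = {}
    for doc_id, doc_type, text in docs:
        fired = [r.name for r in rules if matches(r, text, doc_type)]
        if fired:
            hits[doc_id] = fired
    return hits


# Illustrative rule (invented for this sketch): a coordination verb followed,
# with at most four word tokens in between, by a pricing term, unless the
# text contains a known piece of marketing boilerplate.
PRICING = Rule(
    name="pricing_coordination",
    include=(
        r"\b(?:match|align|coordinate)\w*\W+(?:\w+\W+){0,4}?"
        r"(?:price|pricing|rates?)\b",
    ),
    exclude=(r"\bprice match guarantee\b",),
    doc_types=("email", "chat"),
)

# Made-up sample documents: (ID, data type, text).
DOCS = [
    ("D1", "email", "Let's align with them on pricing before the bid."),
    ("D2", "email", "We will match any competitor's price under our price match guarantee."),
    ("D3", "spreadsheet", "Align pricing by region."),
]

HITS = run_model([PRICING], DOCS)
print(HITS)  # {'D1': ['pricing_coordination']}
```

Only the first document is captured. The second contains the right words but is vetoed by the exclusion pattern, and the third is skipped because of its data type. A single conventional keyword such as "pricing" would have returned all three.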

The process is iterative and collaborative

Linguistic modeling is a learning process designed to build in as much context as possible. It requires testing and refining not just terms, but also our understanding of how people speak within a company, within an industry, or when engaged in certain types of activities.

First, linguistic experts are briefed on case goals and subject matter so that we can craft our initial searches. Then we test those searches, review the results with the client, and refine and expand the searches based on the feedback we receive.

This workflow creates an ideal partnership—linguists leverage their specialized skill set to build an effective model, while attorneys focus on case strategy, overseeing the modeling process to ensure that it meets their legal obligations and information needs.

This also means that as future needs arise, our team is already prepped and ready to jump back in.
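One way to picture the test, review, and refine loop is to tally, for each search rule, how many of its reviewed hits the client confirmed as relevant. The sketch below is only an illustration under assumed inputs. The rule names, the feedback format, and the 0.5 threshold are hypothetical, and real refinement depends on a linguist's judgment rather than a single number.

```python
from collections import defaultdict


def rule_precision(hits, feedback):
    """Share of each rule's reviewed hits that were confirmed relevant.

    hits:     doc ID -> list of rule names that fired on that document
    feedback: doc ID -> True/False from client review (unreviewed IDs absent)
    """
    reviewed = defaultdict(int)
    confirmed = defaultdict(int)
    for doc_id, rule_names in hits.items():
        if doc_id not in feedback:
            continue  # no client feedback on this document yet
        for name in rule_names:
            reviewed[name] += 1
            if feedback[doc_id]:
                confirmed[name] += 1
    return {name: confirmed[name] / reviewed[name] for name in reviewed}


def rules_to_refine(precision, threshold=0.5):
    """Names of rules whose confirmed share falls below a hypothetical threshold."""
    return sorted(name for name, p in precision.items() if p < threshold)


# Made-up results from one round of testing.
hits = {
    "D1": ["pricing_coordination"],
    "D2": ["pricing_coordination", "bid_timing"],
    "D3": ["bid_timing"],
    "D4": ["bid_timing"],
    "D5": ["bid_timing"],  # not reviewed yet, so it is ignored below
}
feedback = {"D1": True, "D2": True, "D3": False, "D4": False}

precision = rule_precision(hits, feedback)
flagged = rules_to_refine(precision)
print(flagged)  # ['bid_timing']
```

Here the made-up `bid_timing` rule is flagged because only one of its three reviewed hits was confirmed, which suggests its conditions need tightening in the next round.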

Use cases of linguistic search

At Lighthouse, we use linguistic modeling to improve the accuracy, speed, and insights gained from a range of tasks, including:  

  • Culling before review: Search terms are a standard method for culling documents before review. Why not take advantage of the most sophisticated search terms that you can get?
  • Case preparation: With linguistic search, you don’t have to wait until reviewers have looked through every document to know what’s in them. Linguistic modeling can surface key documents during review, or even before it begins.
  • Early case assessment: See if there’s a “there” there before you get too deep into a matter.
  • Narrative development: Quickly surface documents that support the narrative for the case you want to make.
  • Inbound production analysis: Understand the facts within the documents produced by opposing counsel.
  • Deposition preparation: Find all the documents likely to come up in interviews so that deponents can study them in advance.

Is linguistic modeling an alternative to AI?

Linguistic modeling and modern AI both offer superpowered language analysis, but they’re not rivals. They are complementary tools that sit alongside each other in your toolset. Choosing when and how to use each one depends on many factors, including scale, cost, and timelines. Your eDiscovery partner can explain the best option for each matter.

In many cases, linguistic modeling can be used in combination with modern AI. For example, we can leverage linguistic modeling to identify more effective training data for AI classifiers or to refine and QC the results of generative AI outputs. In these instances, the combination of solutions leads to outcomes that are superior to what could be achieved by either approach alone.

See linguistic modeling in action

Curious about real-world results?

See how linguistic search has helped legal teams save time and pinpoint key information.

And discover more ways organizations are enhancing their eDiscovery with Lighthouse AI.

About the Author

Amanda Jones

Amanda is a Director in Lighthouse's Research, Modeling, and Analysis group. She supervises the development of new processes and offerings for eDiscovery, designing and implementing innovative linguistic and statistical approaches to document classification. Amanda has over 15 years of experience applying advanced strategies and tactics to complex litigation-related information retrieval projects. She has collaborated extensively with corporate legal departments and outside counsel to formulate and validate defensible document review protocols. Before joining the company, she oversaw Technology Assisted Review and Search Consulting at Xerox Litigation Services. Her work has been published in Forbes, National Law Review, Metropolitan Corporate Counsel, and the proceedings of the fourth, fifth, and sixth Discovery of Electronically Stored Information workshops held in conjunction with the International Conference on Artificial Intelligence & Law. Amanda holds a B.A. in linguistics from the University of Texas at Austin and an M.A. in linguistics from the University of California, Los Angeles.