eDiscovery, Ethics, and the Case for AI

May 20, 2021



Sarah Moran

Ever since ABA Model Rule of Professional Conduct 1.1 [1] was modified in 2012 to include an ethical obligation for attorneys to “keep abreast of changes in the law and its practice, including the benefits and risks associated with relevant technology” [2] (emphasis added), attorneys in almost every state have had a duty to stay abreast of how technology can both help and harm clients. In other words, most attorneys practicing law in the United States have an ethical obligation not only to understand the risks created by the technology we use in our practice (think data breaches, data security, etc.), but also to stay informed about technology that may benefit our practice.

Nowhere is this obligation more implicated than within the eDiscovery realm. We live in a digital world and our communications and workplaces reflect that. Almost any discovery request today will involve preserving, collecting, reviewing, and producing electronically stored information (ESI) – emails, text messages, video footage, Word documents, Excel spreadsheets, PowerPoint presentations, social media posts, collaboration tool data – the list is endless. To respond to ESI discovery requests, attorneys need to use (or in many cases, hire someone who can use) technology for every step of the eDiscovery process – from preservation to production. Under Model Rule 1.1, that means that we must stay abreast of that technology, as well as any other technology that may help us complete those tasks more effectively for our clients (whether we are providing legal advice to an organization as in-house counsel or externally through a law firm).

In this post, I posit that in the very near future, this ethical obligation should include a duty to understand and evaluate the benefits of leveraging Artificial Intelligence (AI) during almost any eDiscovery matter, for a variety of different use cases.

AI in eDiscovery

First, let’s level-set by defining the type of technology I’m referring to when I use the term “AI,” and then take a brief look at how AI technology is currently being used within the eDiscovery space. Broadly speaking, AI refers to the capability of a machine to imitate intelligent human behavior. Within eDiscovery, the term is often used just as broadly to refer to any technology that can perform document review tasks that would normally require human analysis and/or review.

There is a wide range of AI technology that can help perform document review tasks. At one end are older forms of machine learning technology that analyze the text of a document, compare it to the decisions a human made about that document, and predict what the human decision would be on other documents. At the other end are newer generations of analytics technology that analyze the metadata and language used within documents to identify complicated concepts, like the sentiment and tone of the author. This broad spectrum of technology can be incredibly beneficial in a number of important document review use cases – the most common of which I have outlined below:

  • Culling Data - One of the most common use cases for AI technology within eDiscovery is leveraging it to identify documents that are relevant to the discovery request and need to be produced or, conversely, documents that are irrelevant to the matter at hand and do not need to be produced. AI technology is especially proficient at identifying documents that are highly unlikely to be responsive to the discovery request. In turn, this helps attorneys and legal technologists “cull” datasets, essentially eliminating the need to have a human review every document in the dataset. Newer AI technology is also better at identifying documents that would never be responsive to any document request (i.e., “junk” documents) so that these documents can be quickly removed from the review queue. More advanced AI technology can do this by aggregating previously collected data from within an organization, along with the attorney decisions made about that data, and then using advanced algorithms to analyze the language, text, metadata, and previous attorney decisions to identify objectively non-responsive junk documents that are pulled into discovery request collections time and time again.
  • Prioritizing and Categorizing Data - Apart from culling data, AI can also be used to simply make human review more efficient. Advanced AI technology can be used to identify specific concepts and issues that attorneys are looking for within a dataset and group them to expedite and prioritize attorney review. For example, if a litigation involves an employee accused of stealing company information, advanced AI technology can analyze all the employee’s communications and digital activities and identify any anomalies, such as activity that occurred outside normal working hours or communications with other employees with whom they normally would not have reason to interact. The machine can then group those documents so that attorneys can review them first. This identification and prioritization can be critical in evaluating the matter as a whole, as well as in helping attorneys make better strategic decisions about the matter. Review prioritization can also simply help meet court-imposed production deadlines by enabling human reviewers to focus on data that can go out the door quickly (i.e., documents that the machine identified as highly likely to be responsive but also highly unlikely to involve issues that would require more in-depth human review, like privilege or confidentiality).
  • Identifying Sensitive Information - On the same note, AI technology is now more adept at identifying issues that usually require more in-depth human review. Newer AI technology that uses advanced Natural Language Processing (NLP) and analyzes both the metadata and text of a document is much better at identifying documents that contain sensitive information, like attorney-client privileged communications, company trade secrets, or personally identifiable information (PII). This is because more advanced NLP can take context into account and, therefore, more accurately identify when an internal attorney is chatting with other employees over email about the company fantasy football rankings vs. when they are providing actual legal advice about a work-related matter. It can do this by analyzing not only the language being used within the data, but also how attorneys are using that language and with whom. In turn, this helps attorneys conducting eDiscovery reviews prioritize documents for review, expedite productions, and protect privileged information.
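
For readers who want to see the mechanics, the culling and prioritization use cases above can be sketched in a few lines of Python. This is a minimal illustration only, not how any particular eDiscovery platform works: it uses a simple naive Bayes text classifier as a stand-in for the "older" machine learning described earlier, and the sample documents, document IDs, function names, and the 0.2 cull threshold are all hypothetical assumptions made for the example. In a real matter, any cutoff would need to be validated and defensible.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a document and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(coded_docs):
    """Count word frequencies per attorney decision (True = responsive).

    Assumes the coded sample contains at least one document of each kind.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, responsive in coded_docs:
        doc_counts[responsive] += 1
        word_counts[responsive].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def responsiveness_score(text, model):
    """Estimate the probability (0 to 1) that a document is responsive."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    log_prob = {}
    for label in (True, False):
        total_words = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                # Add-one smoothing so an unseen word/label pair is not zero
                score += math.log(
                    (word_counts[label][word] + 1) / (total_words + len(vocab))
                )
        log_prob[label] = score
    return 1 / (1 + math.exp(log_prob[False] - log_prob[True]))


# Hypothetical sample already coded by attorneys
coded_sample = [
    ("pricing agreement with the distributor attached", True),
    ("draft contract terms for the distributor deal", True),
    ("revised pricing schedule for the contract", True),
    ("lunch menu for the office party on friday", False),
    ("fantasy football rankings for this week", False),
    ("newsletter: office party photos and parking reminder", False),
]

# Hypothetical documents no human has reviewed yet
unreviewed = {
    "DOC-001": "please review the distributor contract pricing",
    "DOC-002": "reminder: office party parking on friday",
    "DOC-003": "fantasy football rankings trade offer this week",
    "DOC-004": "notes for the friday contract meeting",
}

model = train(coded_sample)
scores = {doc_id: responsiveness_score(text, model)
          for doc_id, text in unreviewed.items()}

CULL_THRESHOLD = 0.2  # assumed cutoff, for illustration only

# Culling: set aside documents that are very unlikely to be responsive
culled = sorted(d for d in scores if scores[d] < CULL_THRESHOLD)
# Prioritizing: put the most likely responsive documents first
review_queue = sorted((d for d in scores if scores[d] >= CULL_THRESHOLD),
                      key=scores.get, reverse=True)

print("Review first:", review_queue)
print("Culled:", culled)
```

In this toy example, the two documents about the distributor contract land in the review queue, with the strongest match first, while the office party and fantasy football emails fall below the threshold and are set aside. The same scores drive both use cases: a low score supports culling, and a high score supports prioritization.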

Attorneys’ Ethical Obligation to Consider the Benefits of AI in eDiscovery

The benefits of AI in eDiscovery should now be clear. It is already infeasible to conduct a solely human linear review of terabytes of data without the help of AI technology to cull and/or prioritize data. A review of that amount of data (performed by humans reviewing one document at a time) can require months or even years, a virtual army of human reviewers (all being paid at an hourly rate), as well as the training, resources, and technology necessary for those reviewers to perform the work proficiently. Because of this, AI technology (via technology assisted review (TAR)) has been widely accepted by courts and used by counsel to cull and prioritize large datasets for almost a decade.

However, while big datasets involving terabytes of data were once the outliers in the eDiscovery world, they are now quickly becoming the norm for organizations and litigations of all sizes due to exploding data volumes. To put the growing size of organizational data in context, the total volume of data being generated and consumed worldwide is predicted to grow from 33 zettabytes in 2018 to 175 zettabytes by 2025 [3]. This means that soon, even the smallest litigation or investigation may involve terabytes of data to review. In turn, that means that AI technology will be critical for almost any litigation involving a discovery component.
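
To make the scale of that prediction concrete, here is a quick back-of-the-envelope calculation in Python. It uses only the two figures cited above; treating the increase as smooth compound annual growth is my simplifying assumption for illustration, not something the IDC paper states.

```python
# Figures cited above: 33 ZB in 2018, a predicted 175 ZB in 2025
start_zb, end_zb = 33, 175
years = 2025 - 2018

# Overall multiple between the two years
growth_factor = end_zb / start_zb

# Implied annual rate, assuming steady compound growth (a simplification)
annual_growth = growth_factor ** (1 / years) - 1

print(f"{growth_factor:.1f}x overall, about {annual_growth:.0%} per year")
```

Under that assumption, worldwide data volume grows more than fivefold over the period, or a little over a quarter more data every year.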

And that means that we as attorneys will have an ethical duty to keep abreast of AI technology to competently represent our clients in matters involving eDiscovery. As we have seen above, there is just no way to conduct massive document reviews without the help of AI technology. Moreover, the imperative task of protecting sensitive client data like attorney-client privileged communications, trade secret information, and PII (all of which can be hidden and hard to find amongst massive amounts of data) also benefits from leveraging AI technology. If there is technology readily available that can lower attorney costs and client risk, while ensuring a more consistent and accurate work product, we have a duty to our clients to stay aware of that technology and understand how and when to leverage it.

But this ethical obligation should not scare us as attorneys, and it doesn’t mean that every attorney will need to become a data scientist in order to ethically practice law in the future. Rather, it means that we, as attorneys, will need to develop a baseline knowledge of AI technology when conducting eDiscovery so that we can effectively evaluate when and how to leverage it for our clients, as well as when and how to partner with appropriate eDiscovery providers that can provide the requisite training and assist with leveraging the best technology for each eDiscovery task.


As attorneys, we have all adapted to new technology as our world and our clients have evolved. In the last decade or so, we have moved from Xerox and fax machines to e-filings and Zoom court hearings. The same ethic that drives us to evolve with our clients and competently represent them to the best of our ability will continue to drive us to stay abreast of the exciting changes happening around AI technology within the eDiscovery space.

To discuss this topic more, feel free to connect with me at smoran@lighthouseglobal.com.

[1] “Client-Lawyer Relationship: A lawyer shall provide competent representation to a client. Competent representation requires the legal knowledge, skill, thoroughness and preparation reasonably necessary for the representation.” ABA Model Rules of Professional Conduct, Rule 1.1.

[2] See Comment 8, ABA Model Rules of Professional Conduct, Rule 1.1 (Competence).

[3] Reinsel, David; Gantz, John; Rydning, John. “The Digitization of the World From Edge to Core.” November 2018. Retrieved from https://www.seagate.com/files/www-content/our-story/trends/files/idc-seagate-dataage-whitepaper.pdf. An IDC White Paper, Sponsored by SEAGATE.

About the Author

Sarah Moran

Sarah is a Director of Marketing at Lighthouse. Before coming to Lighthouse, she worked for a decade as a practicing attorney at a global law firm, specializing in eDiscovery counseling and case management, data privacy, and information governance. At Lighthouse, she happily utilizes her eDiscovery expertise to help our clients understand and leverage the ever-changing world of legal technology and data governance. She is a problem solver and a collaborator and welcomes any chance to discuss customer pain points in eDiscovery. Sarah earned her B.A. in English from Penn State University and her J.D. from Delaware Law School.