Are GenAI Prompts Discoverable? Three Misunderstandings About This Emerging Data Type
November 20, 2025
Summary: Generative AI tools are creating new data types, such as prompts, outputs, and logs, that are quickly becoming relevant in investigations and litigation. This piece breaks down the top misunderstandings about GenAI discoverability and offers practical guidance for legal teams and attorneys preparing for what’s next.
Generative AI is now woven into everyday business operations (and personal ones, too), from document drafting, decision support, and consumer interactions to data-driven strategy assistance and modeling. It makes sense, then, that when an investigation or possible litigation is on the horizon, GenAI data, which includes prompts, responses, generated drafts, and logs, can be both potentially relevant and collectible.
While tools like Microsoft Copilot, ChatGPT Enterprise, and Google Gemini offer sometimes exponential efficiency gains, the resulting data footprint introduces novel challenges to traditional eDiscovery processes. And as our understanding of the artifacts of generative and hybrid-model AI grows alongside adoption of the tools, a critical eDiscovery question emerges: What happens, in litigation, to the prompts and responses generated in the course of using a generative AI tool?
Common assumption: Generative AI data isn’t discoverable because it’s not communication with a person or persons.
Reality: Just because AI interactions aren’t human-to-human doesn’t mean they are automatically irrelevant. Courts are beginning to treat these interactions as analogous to other forms of internal communication, like emails or instant messages. The obligation to produce turns on relevance to the dispute.
Further, in chatbot-like generative AI scenarios, a series of exchanges between user and AI often reveals a chain of thought: the user builds on the response to one question to ask the next, and so on. If this sounds familiar, like instant messaging or email, the similarities don’t end there. Many mainstream apps, like Copilot and ChatGPT, store interactions for later export.
If it’s relevant and proportional to the needs of the case, assume it’s discoverable. Now is the perfect time for compliance teams to get ahead of preservation notices by partnering with IT to inventory GenAI tools, both enterprise-approved and shadow-implemented, and to confirm where logs live, what the default retention is, how data can be exported, and who owns administration.
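To make that inventory concrete, here’s a minimal sketch in Python of what a preservation-readiness check might look like. Everything in it is illustrative: the tool names, fields, and gap checks are assumptions standing in for whatever your organization actually tracks.

```python
from dataclasses import dataclass

@dataclass
class GenAITool:
    """One entry in a GenAI tool inventory (all fields are illustrative)."""
    name: str
    sanctioned: bool                      # enterprise-approved vs. shadow-implemented
    log_location: str | None              # where prompts, responses, and logs live
    default_retention_days: int | None    # default retention, if confirmed
    export_path: str | None               # how data gets out for preservation
    admin_owner: str | None               # who can place or confirm a hold

def preservation_gaps(tool: GenAITool) -> list[str]:
    """Flag the unanswered questions a legal hold would need resolved."""
    gaps = []
    if tool.log_location is None:
        gaps.append("log location unknown")
    if tool.default_retention_days is None:
        gaps.append("retention policy unconfirmed")
    if tool.export_path is None:
        gaps.append("no known export path")
    if tool.admin_owner is None:
        gaps.append("no admin owner identified")
    return gaps

# Example: a sanctioned tool alongside a shadow tool surfaced in a network audit.
inventory = [
    GenAITool("Copilot (enterprise)", True, "M365 audit log", 365,
              "eDiscovery export", "IT Ops"),
    GenAITool("Unknown chatbot plugin", False, None, None, None, None),
]

for tool in inventory:
    gaps = preservation_gaps(tool)
    print(f"{tool.name}: {'ready' if not gaps else '; '.join(gaps)}")
```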
Common assumption: Attorney work product is privileged and shields lawyer-generated prompts.
Reality: In a recent decision, Tremblay v. OpenAI, Inc., the court compelled comprehensive production of plaintiff work product within ChatGPT, not only the “cherry-picked” successful results that supported the plaintiffs’ claims that OpenAI used copyrighted works without permission.
Privilege depends on how and where the tools are used. Interacting with public or open systems heightens waiver risk, and while enterprise deployments with proper contracts offer a layer of privacy and protection, they are no guarantee that work product will be shielded from discovery.
Once again, proportionality becomes paramount. In the Tremblay ruling above, the plaintiffs clearly had access to the full set of their work-product tests in ChatGPT, so producing negative tests alongside positive ones was proportional to the scope of the matter. It behooves practitioners, then, to treat prompts and outputs like other ESI: keep uses segregated (business use is not the same as legal research, for example) and avoid entering privileged prompts into public tools.
Common assumption: It’s impossible to collect or review GenAI data at scale.
Reality: The notion that GenAI interactions are too novel, fragmented, voluminous, and different from traditional file types to be collected or reviewed proportionally is not quite accurate (and may also be wishful thinking).
In fact, preservation and collection workflows often look similar to those for other modern data types. Most enterprise-grade generative AI platforms, and those integrated via APIs, generate administrative and audit-trail artifacts: the practical equivalent of chat metadata.
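To illustrate, here’s a minimal sketch of normalizing an exported chat log into per-message, review-ready records. The JSON schema is hypothetical; real export formats vary by platform, so the field names would need to be mapped to whatever your tool actually produces.

```python
import json
from datetime import datetime, timezone

# Hypothetical export format: a list of conversations, each with an owner,
# a conversation id, and timestamped role-tagged messages.
raw = json.loads("""
[{"conversation_id": "c-001", "user": "jdoe",
  "messages": [
    {"role": "user", "ts": 1700000000, "text": "Summarize the Q3 contract."},
    {"role": "assistant", "ts": 1700000004, "text": "Here is a summary..."}
  ]}]
""")

def normalize(conversations):
    """Flatten nested chat exports into one record per message, preserving
    the chat-style metadata (who, when, which thread, which side spoke)."""
    for convo in conversations:
        for msg in convo["messages"]:
            yield {
                "conversation_id": convo["conversation_id"],
                "custodian": convo["user"],
                "role": msg["role"],  # user prompt vs. AI response
                "sent": datetime.fromtimestamp(
                    msg["ts"], tz=timezone.utc).isoformat(),
                "text": msg["text"],
            }

for record in normalize(raw):
    print(record)
```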
Service and technology providers like Lighthouse also have access to a wide breadth of tools, custom scripts, and tiger teams of solution experts who can bridge the gap between novel and ready for review. It’s imperative to stay informed about legal precedent and the expanding guidance on modern data from bodies such as the Sedona Conference, so that the solutions we devise address not only the data practitioners manage today but also what will fill the data horizon tomorrow.
AI as discoverable data: anywhere is a good place to start
Use this practical set of questions to get a handle on where GenAI data artifacts may exist and how you can account for them at every stage of the data lifecycle:
- Sources: Which GenAI tools are being used in your organization? Which do you know about? Which might you be unaware of, and how can you suss them out?
- Retention: What are your current policies for GenAI tools? Do they differ by tool? What are your legal hold procedures for prompts, responses, and logs from GenAI?
- Collection: What is your understanding of how individual apps store and log data?
- Privilege: How will your review team identify and protect attorney work product embedded in user prompts?
- Processing and Review: How will you standardize chat logs that mix personal and business inquiries, or that unfold over long periods of time? One possible approach is sketched after this list.
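Building on the normalized records from the earlier sketch, here’s one deliberately simplified approach: bucket messages into quarterly review sets and flag likely-personal prompts with keywords. The keyword list and records are assumptions for demonstration; real responsiveness calls would be far more nuanced and human-reviewed.

```python
from datetime import date

# Normalized per-message records (see the earlier sketch); values illustrative.
records = [
    {"sent": "2025-03-02", "text": "Draft a vendor termination letter."},
    {"sent": "2025-03-02", "text": "Best birthday gift for a 10-year-old?"},
    {"sent": "2025-06-15", "text": "Summarize our indemnification clause."},
]

PERSONAL_HINTS = ("birthday", "recipe", "vacation")  # crude, illustrative

def review_key(rec):
    """Bucket by calendar quarter and flag likely-personal prompts for a
    first-pass cut; human review still decides the close calls."""
    d = date.fromisoformat(rec["sent"])
    quarter = f"{d.year}-Q{(d.month - 1) // 3 + 1}"
    personal = any(hint in rec["text"].lower() for hint in PERSONAL_HINTS)
    return quarter, personal

for rec in records:
    quarter, personal = review_key(rec)
    flag = "likely personal" if personal else "business"
    print(f"[{quarter}] ({flag}) {rec['text']}")
```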
Want to go deeper on how Lighthouse approaches AI in eDiscovery and beyond? Visit our AI page to explore our expertise, methodologies, and the innovations shaping what’s next for legal teams.