The Not So Hidden Risks of Keeping Too Much Data
March 19, 2026
Summary: Data retention is no longer just a storage issue. Here’s what legal leaders are seeing as over-retention, AI, and data governance collide.
At Legalweek, Lighthouse hosted a panel discussion with legal, eDiscovery, and legal operations leaders on a challenge that is getting harder to ignore: organizations are keeping too much data for too long.
That is no longer just a records or governance problem. It is a legal, operational, and risk issue. As data volumes grow, over-retention drives up storage costs, expands discovery scope, increases regulatory and privacy exposure, and adds complexity when litigation or investigations arise.
Here are the biggest takeaways from the session.
1. Over-retention is widespread, and it’s expensive
The panel was aligned from the start: data over-retention is real, and for many organizations, it is chronic.
Sometimes the cause is an outdated retention schedule. Sometimes it is a fear of deleting data that might one day be useful. And sometimes new technologies are simply creating data faster than governance programs can keep up.
Whatever the cause, the impact is broad. The more data an organization keeps unnecessarily, the more data it needs to secure and govern (and potentially preserve, collect, review, and produce in the event of litigation). That drives up cost. It also increases privacy, regulatory, and cyber risk, especially in highly regulated environments.
2. AI is making retention decisions more urgent
One of the clearest themes from the discussion was that AI is adding pressure to an already difficult problem.
AI tools create new categories of data, including prompts, summaries, transcriptions, and generated content. In many cases, users do not realize that this information is being saved at all, much less how long it may remain available. Some tools still lack mature retention and legal hold functionality, which makes routine disposition harder.
At the same time, AI is exposing another issue: poor retention practices can weaken the quality of enterprise AI tools. Reliable AI outputs depend on reliable data. If an AI system is pulling from outdated or low-value content, its outputs are likely to be lower quality or inaccurate. That can create internal confusion, increase business risk, and undermine trust in the tool itself.
The panel advised that if an AI tool is going to be used broadly, retention settings and data governance rules should be addressed early.
3. Retention and preservation are not the same thing
Another important takeaway was how often employees conflate general retention with legal hold preservation.
Preservation is triggered by a legal hold or anticipated litigation. Retention and disposition govern what happens to data when no legal hold applies.
Confusing the two leads to disposition paralysis. Employees become reluctant to delete anything, even when there is no legal obligation to keep it. Over time, that mindset creates unnecessary volume and unnecessary risk.
The panel’s message was clear: reasonable, consistent retention practices are not only defensible, they are necessary.
4. Legacy data and paper are still major problems
For all the attention on AI, the panel made clear that older data sources remain a major challenge.
Paper records (often stored in poorly labeled boxes), legacy archives, old servers, microfiche, and backup tapes still create cost and risk for many organizations. In many cases, teams do not know enough about what is in those boxes or legacy systems to make confident disposition decisions.
That is part of what makes the problem so difficult. There are still few effective tools to solve it at scale. For many organizations, the work remains manual, slow, and resource-intensive.
5. Good retention programs are practical, consistent, and usable
The session closed on a pragmatic note. The goal is not perfection. The goal is a retention program that works.
That means creating schedules employees can understand, identifying clear business owners, understanding where data lives, and involving the right stakeholders in the process. It also means recognizing that many retention decisions are business decisions, not purely legal ones.
Several speakers emphasized the value of cross-functional governance, with IT, privacy, security, legal operations, and records stakeholders all at the table.
They also made clear that organizations do not need to wait for the perfect moment or a full-scale overhaul to make progress. Real improvement can start with updating schedules, tightening governance, creating cross-functional forums, and making retention a more visible part of how data decisions are made.
Learn how Lighthouse helps organizations reduce risk through smarter governance, defensible data decisions, and modern discovery strategy.


