The Problem with “Shadow AI”
AI is useful. That does not make every tool appropriate for client or case data.
One of the biggest mistakes firms make right now is treating all AI products as interchangeable. They are not. A consumer chatbot and a purpose-built legal AI platform may both accept document uploads. The similarity ends there. What happens to that data after upload — where it is stored, who can access it, whether it trains the model, whether it is encrypted, whether it is isolated from other customers — is the difference between a professional workflow and a liability.
This is not a theoretical risk. Firms are uploading medical records, financial documents, and privileged communications into consumer tools every day. Most do not know where that data goes after they close the browser tab.
For the current plain-language version of how to evaluate those questions, see the security page. Consumer tools tend to fall short in five ways:
- Data training exposure. Many consumer AI tools use uploaded content to improve their models. Your client’s medical records or settlement figures could influence outputs served to other users.
- No data isolation. Consumer tools generally do not commit to separating one customer’s data from another at the infrastructure level. There is no organizational boundary the firm can rely on.
- No audit trail. If a paralegal uploads case files into a personal AI account, the firm has no record of what was shared, when, or with which tool.
- No access controls. Consumer tools typically do not offer firm-level user management, role-based access, or case-level permissions.
- No traceability. Consumer AI outputs usually come without citations to source documents. If the tool produces an inaccurate summary, there is no way to trace the error back to a specific document.
The Attorney-Client Privilege Question
This is the risk that should keep managing partners up at night.
Attorney-client privilege depends on confidentiality. When privileged documents are uploaded into a consumer AI tool — one that may store data on shared infrastructure, use it for model training, or make it accessible to the vendor’s employees — there is a credible argument that the privilege has been waived.
Courts have not fully addressed this question yet. But the direction is clear. Uploading privileged material into a system where confidentiality cannot be guaranteed is difficult to defend. The safest position is simple: privileged and sensitive documents should only be processed inside systems where the firm controls access, the data is encrypted and isolated, and no third party uses it for any purpose beyond the firm’s own analysis.
Waiting for case law to settle this question is not a strategy. It is an exposure.
The Medical Records Problem
For firms handling personal injury, defense litigation, or medical malpractice, the stakes are even higher. Medical records contain protected health information. Uploading PHI into a consumer AI tool that lacks encryption standards, data isolation, and clear data handling policies creates compliance exposure that extends beyond privilege into regulatory territory.
A responsible platform for medical record analysis should meet minimum standards that most consumer tools do not:
- Encryption at rest and in transit. AES-256 at rest. TLS 1.3 in transit. Non-negotiable.
- Organization-level data isolation. Your firm’s data must be completely separated from every other customer’s data at the infrastructure level.
- No model training on client data. The AI provider must explicitly guarantee that uploaded documents are never used to train or improve models.
- Multi-factor authentication and SSO. Access to the platform should require MFA at minimum, with enterprise SSO support for firms managing multiple users.
- Audit trail on every analysis. Every upload, every extraction, every query should be logged with version history.
These are not premium features. They are baseline requirements for any tool that touches legal or medical data.
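To make the audit-trail requirement concrete, here is a minimal Python sketch of a tamper-evident log. It is an illustration only, not how any particular platform implements auditing. The function names, the field names, and the hash-chaining approach are all assumptions made for this example: each entry stores the SHA-256 hash of the entry before it, so a later edit to any record breaks the chain.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def append_audit_event(log, user, action, document_id):
    """Append one event to an in-memory audit log and return it.

    All field names here are hypothetical, chosen for illustration.
    """
    previous_hash = log[-1]["entry_hash"] if log else GENESIS_HASH
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,  # e.g. "upload", "extraction", "query"
        "document_id": document_id,
        "previous_hash": previous_hash,
    }
    # Hash the event body (including the previous hash) to chain entries.
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    event["entry_hash"] = hashlib.sha256(payload).hexdigest()
    log.append(event)
    return event


def verify_audit_log(log):
    """Return True if no entry has been altered, removed, or reordered."""
    previous_hash = GENESIS_HASH
    for event in log:
        body = {k: v for k, v in event.items() if k != "entry_hash"}
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        if event["previous_hash"] != previous_hash:
            return False
        if hashlib.sha256(payload).hexdigest() != event["entry_hash"]:
            return False
        previous_hash = event["entry_hash"]
    return True
```

A production system would also need durable storage, access controls on the log itself, and retention rules. The sketch only shows what "every upload, every extraction, every query" looks like as a record a firm can later verify.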
Why Data Governance Matters Now
Legal documents often contain the most sensitive information a client has — medical histories, financial records, privileged strategy discussions, and confidential communications. The obligation to protect that information does not pause because a new tool is convenient.
Before using any AI system for case work, a firm should be able to answer seven questions with confidence:
- Where is the data stored, and in which jurisdiction?
- Is the data encrypted at rest and in transit?
- Is the firm’s data isolated from other customers at the infrastructure level?
- Is the data used to train or improve AI models?
- Who at the vendor has access to uploaded documents?
- Is there a complete audit trail of all uploads, analyses, and outputs?
- Can every AI output be traced back to a specific page in a specific source document?
If the answer to any of these is “I don’t know,” the tool is not ready for legal work.
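The seven questions can also be kept as a simple intake checklist. The Python sketch below is one possible way to record a vendor's answers and flag any question that still has no documented answer. The function names and the dictionary layout are assumptions made for illustration. Note that it only checks whether an answer is known, not whether the answer is acceptable, which remains a judgment for the firm.

```python
VENDOR_QUESTIONS = [
    "Where is the data stored, and in which jurisdiction?",
    "Is the data encrypted at rest and in transit?",
    "Is the firm's data isolated from other customers at the infrastructure level?",
    "Is the data used to train or improve AI models?",
    "Who at the vendor has access to uploaded documents?",
    "Is there a complete audit trail of all uploads, analyses, and outputs?",
    "Can every AI output be traced back to a specific page in a specific source document?",
]


def unanswered_questions(answers):
    """Return the questions that have no documented answer.

    `answers` maps a question to the vendor's documented answer as text.
    A missing key, None, or a blank string all count as "I don't know".
    """
    return [q for q in VENDOR_QUESTIONS if not (answers.get(q) or "").strip()]


def ready_for_legal_work(answers):
    """A tool passes this first gate only if every question is answered."""
    return not unanswered_questions(answers)
```

For example, a vendor record that answers six questions and leaves the model-training question blank would fail `ready_for_legal_work`, and `unanswered_questions` would return that one question for follow-up.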
Discipline Over Convenience
Convenience creates bad habits. When a team is under time pressure (and legal teams always are), the temptation is to drag documents into the fastest tool available. A consumer chatbot is free, fast, and requires no approval. It is also uncontrolled, unaudited, and potentially in violation of the firm’s ethical obligations.
Professional AI use starts with discipline:
- Establish a firm-wide AI policy. Define which tools are approved for client data and which are not. “Use good judgment” is not a policy.
- Restrict sensitive data to purpose-built platforms. Medical records, privileged communications, and financial documents should only be processed inside tools that meet the security baseline. No exceptions for urgency.
- Require traceability on AI outputs. Any AI-generated work product used in case preparation should include citations back to source documents.
- Audit regularly. Know what tools your team is actually using. Shadow AI thrives in firms that never ask.
- Evaluate vendors on security, not just features. A tool that processes documents faster but cannot confirm where the data goes is not a better tool. It is a faster way to create exposure.
Speed without security is not a real advantage. It is just unmanaged risk.
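A written policy can be stated precisely enough to check. As a rough Python sketch, assuming hypothetical tool names and data categories that a firm would replace with its own, an approved-tools policy is an allowlist that denies anything not explicitly listed:

```python
# Hypothetical policy: which data categories each tool is approved for.
# Tool names and categories are placeholders, not recommendations.
APPROVED_TOOLS = {
    "purpose_built_platform": {
        "medical_records",
        "privileged_communications",
        "financial_documents",
        "public_filings",
    },
    "consumer_chatbot": {"public_filings"},
}


def is_upload_allowed(tool, data_category, policy=APPROVED_TOOLS):
    """Return True only if the policy explicitly approves this pairing.

    Unknown tools and unlisted categories are denied by default, so
    urgency never creates an exception.
    """
    return data_category in policy.get(tool, set())
```

The default-deny design is the point: a tool nobody has reviewed is treated as unapproved until someone adds it to the policy.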
Sources and References
These public sources are a useful baseline when a firm is deciding whether an AI tool is ready for real client data.
- HHS HIPAA Security Rule sets the baseline for how protected health information should be safeguarded.
- NIST AI Risk Management Framework provides a public, voluntary framework for governance, documentation, and trustworthy AI operations.
- ABA Formal Opinion 512 explains that lawyers still have duties of confidentiality, oversight, and competence when using generative AI.
The legal profession has always held itself to a higher standard on confidentiality. AI does not change that obligation. It raises it.
The firms that get this right will build a competitive advantage on trust: with clients, with courts, and with their own teams. That advantage will not be theoretical.
For a defense-specific view of how that trust model fits record-heavy matters, see the defense litigation workflow.
This is Post 2 in our series on The Future of Responsible Legal AI. Previously: Post 1 — Trust Must Be Earned. Next: Post 3 — Source-Grounded AI.