What Is Zero Data Retention in AI Coding Tools and Why It Matters for Enterprise Security

May 20, 2026

When a security team blocks an AI coding tool from production use, the rejection rarely comes down to a single issue. But zero data retention—or the absence of a credible claim to it—shows up on nearly every denial list. Engineers want to move faster. Security teams need to know that proprietary code, prompts, and model outputs aren't sitting somewhere outside the organization's control. Most AI coding tools respond with a policy statement. That is not the same as proof.

This article explains what zero data retention means for AI coding tools specifically, why the gap between a policy commitment and an infrastructure-level guarantee matters in enterprise security reviews, and what to look for when evaluating tools against SOC 2, HIPAA, FedRAMP, or internal AI governance requirements.

What Zero Data Retention Actually Means

Zero data retention (ZDR) means that data submitted to a system—prompts, code snippets, model inputs, outputs—is not stored, logged, or used after the session ends. In the context of AI coding tools, that data includes your source code, the context you pass to the model, and anything the model generates in response.

The term gets used loosely. Some vendors use it to mean they don't train shared foundation models on your data. Others mean they don't persist data beyond a short window. A strict ZDR guarantee means none of the following occur:

- Prompts or code are logged to a persistent store
- Model inputs are retained for debugging, telemetry, or retraining
- Session outputs are cached beyond the active request
- Any residual data remains after the session is torn down

For regulated industries, the relevant question isn't whether a vendor says this is true. It's whether the architecture makes it technically impossible to do otherwise.

Why AI Coding Tools Create a Data Retention Problem

Standard AI coding tools route your code through third-party model infrastructure. When you ask a model to complete a function, refactor a module, or explain a bug, your code travels to an external endpoint, gets processed, and a response comes back. What happens to that data in transit and at rest depends entirely on the vendor's infrastructure and contractual commitments.

For most organizations outside regulated industries, that's an acceptable tradeoff. For a healthcare company with patient data in its codebase, a defense contractor working on export-controlled software, or a financial institution subject to GLBA and audited against SOC 2, it isn't.

The specific risks ZDR is meant to address:

Training leakage. If a vendor uses customer prompts to improve shared models, proprietary logic, business rules, or sensitive identifiers can surface in completions for unrelated users. Most enterprise plans prohibit this contractually—but policy prohibition is not the same as architectural prevention.

Audit exposure. Regulators and auditors ask where data went and who had access to it. If you can't answer that with a documented, verifiable record, you have a gap.

Breach surface. Data that isn't retained can't be exfiltrated later. Every log, cache, or telemetry record that persists is a potential exposure point. ZDR removes retained session data from the breach surface, provided the claim is architecturally enforced, not just stated.

Policy-Based vs. Infrastructure-Level ZDR: A Critical Distinction

This is where most enterprise security reviews find the gap.

A policy-based ZDR commitment means the vendor has agreed—contractually or in their terms of service—not to retain your data. GitHub Copilot Enterprise offers a ZDR option where prompts and suggestions aren't stored. Cursor's privacy documentation makes similar commitments for business accounts. These are real commitments with legal weight.

But a policy can be violated. Infrastructure can be misconfigured. A vendor's subprocessor may have different retention defaults. A logging system can be inadvertently re-enabled. The policy describes what should happen. It doesn't prove what did happen.

Infrastructure-level ZDR means the system is built so that retention is technically impossible or immediately verifiable. Ephemeral execution environments that are torn down at session end, with no persistent storage attached, make retention structurally impossible rather than contractually prohibited. When the environment is destroyed, the data is gone. There's no log to misconfigure.

For high-assurance environments, the question to put to every vendor is: "Can you show me, with a verifiable record, that no data persisted after my session ended?" Most cannot. That is the gap.

What ZDR Alone Cannot Prove

Zero data retention addresses one part of the security picture. It doesn't answer several questions that regulated buyers still need to answer:

What ran during the session? ZDR tells you data wasn't kept after the fact. It doesn't tell you whether the model execution environment was isolated, whether another tenant's workload shared memory with yours, or whether an agent took actions outside its intended scope.

Was the execution environment trustworthy? A model can run in a ZDR-compliant environment that is still shared infrastructure. Memory isolation requires hardware-level guarantees, not software controls alone.

Can you prove any of this to an auditor? A vendor's attestation that ZDR was enforced is a claim. A cryptographic attestation record tied to a specific session, generated by hardware that cannot be spoofed, is evidence.

This is why ZDR, while necessary, isn't sufficient for regulated workloads. The complete picture requires ZDR plus hardware-level isolation plus verifiable proof that both were enforced during each specific session.

How Regulated Industries Should Evaluate ZDR Claims

When your security team reviews an AI coding tool, the ZDR evaluation should cover five questions:

1. Is ZDR enforced at the infrastructure level or by policy? Ask for architecture documentation. Look for ephemeral execution environments, absence of persistent logging infrastructure, and technical controls that prevent retention—not policies that prohibit it.

2. Does ZDR apply to all data types? Prompts, code context, model outputs, and any session metadata. Some vendors enforce ZDR on model inputs but retain telemetry or usage logs. Each of those is a potential exposure.

3. What happens to data at session end? Is the execution environment torn down? Is memory cleared? Is there a documented teardown process, and is it auditable?

4. Does the vendor use subprocessors, and do those subprocessors carry equivalent ZDR guarantees? A vendor's ZDR commitment is only as strong as its weakest subprocessor. Request the subprocessor list and their data handling terms.

5. Can ZDR enforcement be verified, not just asserted? This is the question most vendors cannot answer. Verification requires either independent audit records or cryptographic proof tied to specific sessions.
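
One way to keep a review consistent across vendors is to record the answers to the five questions in a fixed structure and classify the result. The Python sketch below is illustrative only: the class names, field names, and category labels are assumptions made for this example, not an industry standard or any vendor's schema.

```python
from dataclasses import dataclass, field


@dataclass
class Subprocessor:
    name: str
    retention_days: int  # 0 means the subprocessor claims no retention


@dataclass
class VendorAnswers:
    """Hypothetical record of a vendor's answers to the five questions."""
    infrastructure_enforced: bool      # Q1: enforced by architecture, not only policy
    covers_all_data_types: bool        # Q2: prompts, context, outputs, metadata
    auditable_teardown: bool           # Q3: documented, auditable session teardown
    subprocessors: list[Subprocessor] = field(default_factory=list)  # Q4
    session_level_proof: bool = False  # Q5: audit records or cryptographic proof


def effective_retention_days(answers: VendorAnswers) -> int:
    """Weakest link: the longest retention window among subprocessors."""
    return max((s.retention_days for s in answers.subprocessors), default=0)


def classify(answers: VendorAnswers) -> str:
    """Map the answers onto the categories discussed in this article."""
    if effective_retention_days(answers) > 0 or not answers.covers_all_data_types:
        return "partial"
    if not (answers.infrastructure_enforced and answers.auditable_teardown):
        return "policy-based"
    if not answers.session_level_proof:
        return "infrastructure-level, unverified"
    return "infrastructure-level, verifiable"
```

A vendor whose own controls are strong but whose subprocessor keeps logs for 30 days is classified as "partial" here, which reflects the weakest-link point in question 4.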

What a Verifiable ZDR Architecture Looks Like

A verifiable ZDR architecture combines three things: ephemeral execution environments, hardware-level memory isolation, and cryptographic attestation per session.

Ephemeral environments mean each coding session runs in an isolated container or enclave created for that session and destroyed when it ends. No data persists to shared storage. No residual state carries over.
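
The ephemeral pattern can be illustrated at a much smaller scale with a scratch directory that exists only for the lifetime of a session. This is a process-level analogy written for this article, not how a hardware-isolated sandbox is implemented: the function names are hypothetical, and deleting a directory does not securely wipe the underlying disk.

```python
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def ephemeral_session():
    """Yield a scratch directory that is deleted when the session ends."""
    with tempfile.TemporaryDirectory(prefix="session-") as tmp:
        yield Path(tmp)
    # Leaving the with-block removes the directory and everything in it.


def run_session(code: str) -> tuple[Path, int]:
    """Write session data to the scratch directory; return its path and size."""
    with ephemeral_session() as workdir:
        snippet = workdir / "context.py"
        snippet.write_text(code)
        size = snippet.stat().st_size
    # By this point the directory has already been torn down.
    return workdir, size
```

The design point is that teardown is tied to the end of the session itself rather than to a separate cleanup job that could be skipped or misconfigured.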

Hardware-level memory isolation means the execution environment's memory is encrypted and inaccessible to the host operating system, the cloud provider, and other tenants. Intel TDX (Trust Domain Extensions) is designed to provide this guarantee in hardware: memory belonging to a TDX-protected environment (a "trust domain" in Intel's terminology) is encrypted so that even a compromised host cannot read it.

Cryptographic attestation means the hardware generates a signed record proving that a specific workload ran inside a verified enclave with defined security properties. That record is exportable—it can be presented to an auditor, attached to a compliance report, or ingested into your SIEM. It is not a vendor's claim. It is hardware-signed evidence.
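
The verification step an auditor or SIEM pipeline performs on such a record can be sketched in a few lines. The example below is a simplified stand-in, not a TDX quote verifier: real attestation relies on asymmetric signatures rooted in the CPU vendor's certificate chain, whereas this sketch substitutes HMAC-SHA256 with a shared demo key so that it runs on the Python standard library alone. The record fields and function names are assumptions made for illustration.

```python
import hashlib
import hmac
import json

# Stand-in for a hardware-held signing key. Real attestation uses asymmetric
# signatures and a vendor certificate chain, not a shared secret like this.
DEMO_KEY = b"demo-key-not-for-production"


def sign_record(record: dict, key: bytes = DEMO_KEY) -> str:
    """Return a hex signature over a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_record(record: dict, signature: str, expected_measurement: str,
                  key: bytes = DEMO_KEY) -> bool:
    """Accept only if the signature is valid and the workload measurement matches."""
    if not hmac.compare_digest(sign_record(record, key), signature):
        return False  # record was altered or signed with a different key
    return record.get("measurement") == expected_measurement


# Example: a record binding a session to the hash of the workload that ran.
measurement = hashlib.sha256(b"workload-image-v1").hexdigest()
record = {"session_id": "s-123", "measurement": measurement}
signature = sign_record(record)
is_valid = verify_record(record, signature, measurement)
```

Two checks matter: the signature shows the record was not altered after it was produced, and the measurement comparison shows the workload that ran is the one the reviewer expected.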

Origin is built on this architecture. Each coding session runs inside an Intel TDX hardware-isolated sandbox. The OLLM Confidential AI Gateway routes requests to either standard zero data retention LLMs or to models running inside Trusted Execution Environments, depending on the sensitivity of the work. When a session ends, the enclave is torn down and no residual data remains. For every confidential session and every TEE model run, Origin generates a cryptographic attestation record you can export and use directly in compliance audits.

That is the difference between a ZDR policy and a ZDR proof.

For security teams evaluating AI coding tools against FedRAMP, HIPAA, or SOC 2 requirements, the architecture matters more than the contract. A policy describes what a vendor intends. An attestation record proves what the hardware did.

FAQs

What is zero data retention in the context of AI coding tools? Zero data retention means that prompts, code, model inputs, and outputs submitted during an AI coding session are not stored, logged, or reused after the session ends. In a strict ZDR implementation, no session data persists once the interaction is complete.

Is a zero data retention policy the same as a zero data retention guarantee? No. A policy is a contractual or terms-of-service commitment that data will not be retained. A guarantee requires the architecture to make retention technically impossible—through ephemeral execution environments destroyed at session end, with no persistent storage attached.

Why does zero data retention matter for HIPAA, SOC 2, and FedRAMP compliance? These frameworks require organizations to demonstrate control over where sensitive data goes and who can access it. If proprietary code or patient-adjacent data is processed by a third-party AI tool and retained in that vendor's infrastructure, your organization may not be able to satisfy data handling, audit, or breach notification requirements.

Can GitHub Copilot or Cursor satisfy enterprise ZDR requirements? Both offer ZDR options at the enterprise tier, but those commitments are policy-based rather than architecturally enforced. Neither tool produces cryptographic attestation records proving that ZDR was enforced during a specific session—which is the level of evidence high-assurance environments require.

What is the difference between ZDR and hardware-level memory isolation? ZDR addresses whether data is stored after a session ends. Hardware-level memory isolation—provided by technologies like Intel TDX—addresses whether data is accessible during a session to the host, cloud provider, or other tenants. Both are required for a complete security posture in regulated environments.

What is cryptographic attestation and how does it relate to ZDR? Cryptographic attestation is a hardware-signed record proving that a specific workload ran inside a verified, isolated execution environment with defined security properties. It provides auditable evidence that ZDR and isolation controls were active during a specific session, rather than relying on a vendor's assertion after the fact.

How should a security team verify ZDR claims during a vendor evaluation? Request architecture documentation showing ephemeral execution environments and the absence of persistent logging. Ask whether the vendor uses subprocessors and whether those subprocessors carry equivalent guarantees. Ask whether the vendor can produce session-level cryptographic attestation records. If the answer to that last question is no, the ZDR claim is policy-based, not proof-based.

Zero data retention is a necessary requirement for AI coding tools in regulated environments, but it isn't sufficient on its own. The complete security posture requires ZDR enforced at the infrastructure level, hardware-level memory isolation during execution, and cryptographic proof that both controls were active for each session. Most tools offer some form of the first, usually by policy. Few offer the other two.

If your team is evaluating AI coding tools against a compliance mandate or a failed security audit, learn more at orgn.com.