Zero Data Retention AI: Policy Promises vs. Hardware Proof

June 12, 2026

TL;DR

Real Zero Data Retention means that no prompts, completions, or telemetry are stored anywhere in the pipeline. It is fundamentally different from a simple promise not to train models on your data.
Policy-based protection relies on contractual agreements that cannot physically prevent misconfigurations or subprocessor leaks. Hardware-enforced protection uses isolated enclaves that make data storage structurally impossible.
True hardware isolation generates a separate cryptographic receipt for each inference request. Security auditors use this specific artifact to verify compliance instead of trusting a vendor's broad security statement.
Regulatory frameworks like HIPAA and FedRAMP increasingly demand structural evidence of data minimization over basic vendor assurances. An exportable hardware attestation directly satisfies these strict audit and access controls.
AI coding assistants ingest deep repository context that often contains unprotected secrets and core business logic. Securing this specific workflow requires ephemeral sandboxes that are destroyed when the session ends.
A primary vendor security guarantee fails immediately if its underlying compute providers use standard logging defaults. Procurement teams must verify data-handling rules throughout the entire execution chain to prevent exposure.

Why "We Don't Store Your Prompts" Stopped Being Enough

Enterprise security teams are blocking specific tools because the vendor's data handling story doesn't hold up under scrutiny. A policy statement and an architecture diagram are different documents, and auditors know it.

What Zero Data Retention Means at the Infrastructure Level

A strict ZDR guarantee covers four things:

prompts are not stored,
Model completions are not stored.
session telemetry tied to content is not stored, and
No residual data remains after the request completes.

None of those four things is the same as "we don't train on your data," which is a separate and weaker claim that many vendors conflate with ZDR.

The retention risk surface inside a typical AI inference pipeline has four distinct layers:

The model provider's logging layer: most providers log requests by default to monitor for abuse. OpenAI's standard API retains customer data for 30 days for this purpose. ZDR is an approved exception requiring separate eligibility, not the default behavior.
The gateway or proxy layer: any AI gateway sitting between your application and the model provider has its own logging and telemetry infrastructure that may capture request content independently of the model provider's policies.
Subprocessor infrastructure: a vendor's ZDR commitment is scoped to their services. If their infrastructure runs on subprocessors with different default retention postures, the top-level commitment may not propagate.
Abuse-monitoring caches: even with ZDR enabled, some providers retain data temporarily when classifiers flag policy violations.

Where the Definition Gets Stretched and Why It Matters

Policy-based ZDR means the vendor has a contractual commitment not to retain your data. Infrastructure-level ZDR means the system is built so that retention is technically impossible or immediately verifiable through hardware evidence, and both are present. Neither is inherently dishonest, but conflating them in a compliance review creates real gaps.

Consider the concrete failure modes that SOC 2 auditors and HIPAA reviewers explicitly probe for: a logging system re-enabled after a configuration deployment; a subprocessor with different default retention settings that wasn't updated when the top-level vendor added ZDR; a telemetry collector that captures request metadata, including partial content for performance monitoring. Each of these represents a real class of misconfiguration, not a theoretical attack. A policy that prohibits retention doesn't prevent any of them. Architecture that makes retention structurally impossible does.

The practical question for a security review is: "Can the vendor show me, per session, that no data persisted after the request completed?" Most vendors cannot. That gap is what hardware-enforced ZDR addresses.

The Compliance Frameworks That Are Pulling ZDR Into Procurement Checklists

ZDR has moved from a nice-to-have to a procurement gate across several regulatory frameworks, each driving the requirement from a different angle.

Gartner's 2025 AI TRiSM market guidance notes that by 2026, organizations applying AI trust, risk, and security management controls will consume at least 50% less inaccurate or illegitimate information. Separately, Gartner predicts that by 2028, 50% of organizations will implement zero-trust data governance postures specifically due to risks posed by unverified AI-generated data. The regulatory mapping looks like this:

GDPR Article 5(1)(c): Data must be adequate, relevant, and limited to what is necessary for the stated purpose. ZDR is the operational implementation of the data minimization principle for AI inference.
HIPAA 45 CFR §164.312: Technical safeguard requirements for access controls, audit controls, and transmission security. When AI processes PHI, the vendor's data handling architecture becomes part of the covered entity's technical safeguard documentation.
SOC 2 Type II (Confidentiality criteria): Auditors require documented evidence that sensitive data was handled in accordance with stated controls. Per-request attestation receipts satisfy this requirement in a way that a vendor assertion does not.
ISO 27001 (Clause 6.1.3 and Annex A.15): Supply chain risk assessment. AI model providers handling customer data are in scope as third-party suppliers requiring documented risk treatment.
ISO 42001: AI-specific management system standard that reinforces data flow documentation requirements for AI inference chains.
FedRAMP: Doesn't use the term ZDR, but data-handling or Controlled Unclassified Information (CUI) effectively requires an architecture. Hardware attestation with verifiable execution boundaries maps directly to FedRAMP's boundary and data protection requirements.

The Two Architectures Behind Zero Data Retention Claims

Vendors use "zero data retention" to describe architectures that behave very differently under the hood. The two implementation models differ significantly in what each can prove, which determines which workloads each is appropriate for.

Policy-Enforced ZDR: Contractual Guarantees Across the AI Inference Chain

Policy-enforced ZDR works through contractual and operational controls rather than hardware isolation. Under this model, AI providers, gateways, and infrastructure operators commit not to retain prompts, completions, or associated content beyond the scope of request processing. Several AI platforms and inference gateways now support policy-based ZDR options, allowing organizations to process sensitive workloads without storing customer content or using it for model training.

ORGN's platform supports this model through its ZDR model catalog, providing access to frontier models operating under documented zero-retention agreements. Depending on the provider, ZDR can be enforced at the organization level or enabled selectively for specific requests and workflows.

What policy-enforced ZDR provides:

Inference providers do not store or log prompts or outputs
No training on customer data
Broad frontier model catalog: Claude, GPT-series, Gemini, Llama, Mistral, and image/video/embedding models not available in TEE environments
Contractual, auditable provider agreements

What it doesn't provide:

No cryptographic proof of execution
No hardware-level memory isolation (data may pass through shared cloud infrastructure)
No independently verifiable attestation receipt: the guarantee is contractual, enforced through provider agreements, not hardware that makes violation technically impossible

According to ORGN's security documentation, the ZDR model execution path runs on Vercel infrastructure with the explicit acknowledgment: "No attestation receipt or cryptographic proof of execution." The privacy guarantee is policy-enforced. For many workloads, that is adequate. Run automated security scans on your repository within Origin.

Hardware-Enforced ZDR via Trusted Execution Environments (TEE)

Intel TDX (Trust Domain Extensions) confidential virtual machines combined with NVIDIA H100 GPU attestation provide a categorically different class of guarantee. Execution happens inside a hardware-isolated enclave where:

Prompts and responses are encrypted in memory during inference
The host OS, hypervisor, cloud provider, and ORGN’s Gateway itself cannot read the plaintext data
Memory decryption occurs only inside the trusted execution boundary
When the enclave is torn down after the request completes, no residual data remains

Per ORGN's architecture documentation, TEE models run on NEAR and Phala infrastructure using Intel TDX-based confidential virtual machines with NVIDIA H100 GPU attestation. Every inference request produces a cryptographic attestation receipt. The security guarantee isn't a policy about what the vendor will do. It's a physical constraint on the hardware.

The Dashboard surfaces three separate data sources, task state, active execution environments, and open pull requests, without requiring you to leave Origin or open GitHub. Everything visible here updates in real time as agents and teammates make changes to the project.

The threat model ORGN Gateway's TEE architecture explicitly addresses infrastructure-level compromise: "Host operating systems, hypervisors, and cloud administrators cannot inspect inference data due to hardware-enforced memory isolation." A misconfigured logging system outside the enclave captures nothing, because nothing outside the enclave has access to the plaintext.

Session Binding and the Unified TDX Plus GPU Attestation Model

Per-request proof requires more than each component attesting independently. In ORGN Gateway's TEE architecture, the Intel TDX quote and NVIDIA GPU evidence are cryptographically bound together using a shared session nonce. The nonce appears in both TDX REPORT_DATA[32:64] and in each GPU's SPDM evidence header. A match across all components proves every attestation was generated in the same session.

Without session binding, an attestation receipt from session A could theoretically be presented as proof for session B. The nonce binding prevents this: the CPU and GPU attestations are inseparably tied to a specific inference execution.

REPORT_DATA (64 bytes):

Bytes [0:32]: model_signing_address # identifies the model signing authority

Bytes [32:64]: GPU session nonce # binds this TDX quote to the GPU attestations

Each GPU's SPDM evidence header carries the same nonce. Verifying a match across the TDX quote and all GPU evidence headers confirms the attestation covers a single, unified execution session, not a patchwork of components from different sessions.

What an Attestation Receipt Contains and How to Use It

Most ZDR vendor documentation describes the architecture. An attestation receipt is an artifact: something you can export, inspect, verify, and attach to a compliance record. The distinction matters when an auditor asks for evidence rather than a description.

The Three Components of a Cryptographic Attestation Receipt

Per ORGN Gateway's attestation data reference, every TEE inference request produces a three-part receipt.

1. Intel TDX Quote

A DCAP (Data Center Attestation Primitives) Quote v4 binary, hex-encoded, signed by Intel's Quoting Enclave. The quote contains:

An ECDSA-P256 signature over the quote body
PCK certificate chain rooted in Intel's Root CA
TD measurements: MRTD (code identity), RTMRs (runtime measurements), MRCONFIG (configuration identity)
REPORT_DATA (64 bytes): model signing address in bytes [0:32], GPU session nonce in bytes [32:64]
TEE TCB SVN: firmware security version verifiable against Intel's Provisioning Certification Service

2. NVIDIA GPU Evidence

Per-GPU attestation for each H100 in the cluster:

X.509 certificate chain rooted in NVIDIA's Root CA (revocation status verifiable via ocsp.ndis.nvidia.com)
SPDM measurement report signed by the GPU's device attestation key
Firmware measurements verified against NVIDIA's signed Reference Integrity Manifests (RIMs), covering driver firmware and VBIOS firmware
OpaqueData fields containing driver version, VBIOS version, chip SKU (used to identify the correct RIM for verification)

3. Message Signature

An ECDSA signature binding model identity to the exact request and response content:

EIP-191(text: "{model}:{sha256(request_body)}:{sha256(response_body)}")

The signature covers:

model: the model identifier, e.g., zai-org/GLM-5-FP8
sha256(request_body): SHA-256 hex hash of the inference request
sha256(response_body): SHA-256 hex hash of the inference response
model_signing_address: model identity hash matching TDX REPORT_DATA[0:32]

None of these three components requires trusting ORGN. Verification runs entirely against Intel's and NVIDIA's public PKI infrastructure.

How to Verify the Receipt Without Trusting the Platform

The independent verification path, as specified in ORGN Gateway attestation documentation:

# Step 1: Verify the Intel TDX Quote

# The ECDSA signature and PCK certificate chain verify against Intel's Root CA

# TCB status checks against Intel's PCS APIintel_pcs_url = "https://api.trustedservices.intel.com"

# Step 2: Verify NVIDIA GPU certificate chains# Each GPU cert chain verifies against NVIDIA's Root CA

# Revocation status checked via NVIDIA's OCSP servicenvidia_ocsp_url = "https://ocsp.ndis.nvidia.com"

# Step 3: Verify firmware measurements

# Compare GPU firmware measurements against NVIDIA's signed Reference Integrity Manifestsnvidia_rim_url = "https://rim.attestation.nvidia.com"

# Step 4: Verify session binding# Confirm the GPU session nonce in TDX REPORT_DATA[32:64] matches

# the nonce in each GPU's SPDM evidence headerassert tdx_quote.report_data[32:64] == gpu_evidence[0].spdm_nonceassert tdx_quote.report_data[32:64] == gpu_evidence[1].spdm_nonce

# ... for all 8 H100s

# Step 5: Verify the message signature

# Recover the Ethereum address from the EIP-191 signature

# Confirm it matches the message_signer field

recovered_address = recover_eip191_signer(

message=f"{model}:{sha256(request_body)}:{sha256(response_body)}",

signature=ecdsa_signature

)

assert recovered_address == message_signer

Every verification step runs against external trust anchors. ORGN's Gatewayinfrastructure is not in the verification path at any point.

From Receipt to Compliance Evidence: What Auditors Can Use

A cryptographic attestation receipt tied to a specific session is exportable, tamper-evident, and independently verifiable. Presented to a SOC 2 Type II auditor, it demonstrates that the confidentiality control (hardware-isolated execution, no persistent storage) applies to a specific request, not to the platform in aggregate. Presented to a HIPAA reviewer, it demonstrates that PHI processed through AI was handled inside a hardware-isolated environment where no storage was possible. Presented to a FedRAMP assessor, it provides hardware-rooted evidence that the execution boundary was maintained for the specific workload under review.

The difference from a vendor assertion is structural. An assertion says: "Our platform enforces ZDR." An attestation receipt says: "For request ID chatcmpl-893c78e06a795cea, the following hardware-signed evidence proves the model ran inside a verified TEE with these specific measurements, and the request and response content were not accessible outside the enclave." ORGN's Scanner product manages attestation artifacts and audit trails across the full ORGN stack, providing a centralized view of verification status per request.

ZDR in AI Coding Tools: A Different Risk Surface Than Model APIs

Production AI APIs and AI coding tools transmit different kinds of data. The risk profile isn't the same, and ZDR requirements for coding tools deserve separate treatment in any enterprise security review.

What Data AI Coding Tools Transmit During a Session

When a developer uses an AI coding tool, the context window sent to the model typically includes open files, recent edits, imports, project structure, and the specific prompt. For tools using repository-wide RAG or large context windows, the transmitted content may also include:

Environment files containing API keys and credentials
Internal business logic and trade-secret algorithms
Infrastructure configuration files (Kubernetes manifests, Terraform configs, CI/CD pipelines)
Authentication patterns and access control implementations
Export-controlled technical specifications for defense contractors
PHI present in test data, fixture files, or embedded configuration

A chat completion API call with a sanitized prompt is a much narrower exposure than a coding tool session that ingests an entire repository context. Both deserve ZDR coverage, but the data classification of what's transmitted differs significantly.

GitHub Copilot Enterprise offers a ZDR option where prompts and suggestions are not stored. Cursor's business tier makes equivalent commitments for business accounts. Both are policy-based commitments. For a defense contractor with ITAR-controlled software, or a healthcare organization with PHI in configuration files, the relevant question is whether those commitments are architecturally enforced or contractually stated.

How ORGN's CDE Combines Ephemeral Sandboxes with ZDR and TEE Execution

ORGN's Confidential Development Environment (CDE) mitigates the coding tool risk surface through hardware-level session isolation. Per ORGN's Gateway sandbox documentation, each coding session runs in an isolated TDX-backed Linux compute environment with:

Dedicated CPU, memory, and disk allocation per session
Auto-expiry after a configurable retention period
No session state carries over after teardown
Dedicated storage resources are isolated to the sandbox environment.

The sandbox infrastructure runs on the sandbox-c3-44 node pool: c3-standard-44 machines with 44 vCPUs and 176 GiB of RAM per node, scaling from 1 to 110 nodes, on TDX-backed infrastructure in GCP US Central. All three node pools in ORGN's infrastructure are tagged TDX (tdx-us-central1, control-tdx, and sandbox-c3-44), meaning they operate as confidential virtual machines throughout.

Each worktree entry represents one isolated execution environment. Once a session ends, the worktree disappears from this list and the underlying TDX-backed sandbox is destroyed, not cleaned up, taking everything in memory with it.

Within the CDE, inference requests are sent through the ORGN Gateway Confidential AI Gateway, where developers can choose between policy-enforced ZDR models and hardware-attested TEE models. The distinction is visible and intentional: ZDR models prioritize broad model availability with zero-retention guarantees, while TEE models prioritize hardware-enforced isolation and verifiable execution. ORGN does not automatically route requests between these options; model selection remains entirely under user control.

A coding workflow using the CDE looks like this at the infrastructure level:

Developer Session

│

▼

TDX-backed sandbox (c3-standard-44, encrypted memory)

│ ├─── Code context stays inside the encrypted sandbox

│ ├─── Inference request → ORGN Gateway

│ │ │ ├── ZDR model (Vercel) ── No persistence, broad catalog

│ │ │ └── TEE model (NEAR/Phala) ── HW-isolated + attestation receipt

│ ├─── Response returned inside sandbox

│ └─── Session teardown → enclave destroyed, no residual state

Nothing persists unless explicitly saved by the developer. The enclave teardown isn't a cleanup process. It's the destruction of the execution environment itself.

The Subprocessor Problem and Why Multi-Vendor AI Stacks Make It Worse

Enterprise AI deployments are rarely single-vendor. IBM's 2025 Cost of a Data Breach Report found that shadow AI (unapproved AI tools) added an average of $670,000 to breach costs and was present in 20% of breaches. Third-party vendor compromises were the second most frequent attack vector, accounting for 15% of all breaches, averaging $4.91 million per incident and a 267-day mean time to contain.

In a multi-model workflow where one agent writes the code, another reviews it, and a third generates the PR description, each step is a separate inference request with its own ZDR posture. The PR panel makes those parallel execution paths visible, but it doesn't tell you which model handled which request or what retention policy applied.

When different AI tools in a developer's workflow have different ZDR postures, the weakest one defines the organization's actual exposure. A gateway operator's ZDR commitment is only as strong as its weakest subprocessor's architecture. Security reviews should ask:

Does ZDR cover all data types (prompts, completions, session telemetry, and usage logs), or only model inputs and outputs?
Does the vendor publish a subprocessor list with documented data handling for each subprocessor?
Do those subprocessors carry equivalent ZDR guarantees with verifiable enforcement, or only with contractual commitments?
Can ZDR enforcement be verified per session, not just attested as a platform-level policy?

Evaluating ZDR Claims: A Framework for Security and Procurement Teams

Security teams reviewing AI tooling face a consistent problem: vendor documentation is written to pass a surface-level review, not to answer the questions that matter for regulated workloads. A structured evaluation approach helps separate credible implementations from well-worded policies.

Five Questions That Separate Architecture from Assertions

Put these questions to every AI vendor in a security review. The answers will reveal whether ZDR is built into the architecture or added on top of it.

1. Is ZDR enforced at the infrastructure level or by policy?

Ask for architectural documentation that shows the enforcement mechanism. Ephemeral execution environments, hardware memory encryption, and the absence of persistent logging infrastructure are architectural controls. Terms of service and data processing agreements are policy controls. Both are useful. Only one prevents misconfiguration.

2. Does ZDR apply to all data types?

Prompts and model outputs are the obvious scope. But telemetry, usage logs, session metadata, and abuse monitoring caches are equally important. Some vendors enforce ZDR on model inputs but retain operational telemetry that may include partial content. Each retained data type is a separate exposure point.

3. What is the documented teardown process at session end?

Is the execution environment destroyed, or is data deleted from persistent storage? Deletion is reversible (forensic recovery, backup systems). Destruction of the execution environment is not. Ask whether the teardown process is auditable and whether evidence of teardown is available.

4. What are your subprocessors, and do they carry equivalent ZDR guarantees?

Request the subprocessor list and the specific data-handling commitments for each subprocessor. A primary vendor's ZDR posture may not extend to the compute infrastructure, CDN, observability platform, or other subprocessors handling request data.

5. Can you provide per-session cryptographic evidence of ZDR enforcement?

A platform-level assurance says ZDR applies to the system in general. A per-session cryptographic receipt proves ZDR applied to a specific request at a specific time. For compliance documentation, the specific request matters, not just the platform policy.

The Cost of Getting It Wrong in Regulated Environments

The financial consequences of ZDR failures in regulated industries are well-documented.

HIPAA: A breach involving AI-processed PHI retained in a vendor log carries civil monetary penalties ranging from $100 to $50,000 per violation, capped at $1.9 million per violation category per year. Criminal penalties apply for knowing violations. Beyond fines, HIPAA breach notification requirements trigger public disclosure and OCR investigation.

GDPR: Article 83(5) fines for violations of data processing principles, including data minimization, can reach up to 4% of annual worldwide turnover. U.S. state privacy fines totaled $3.425 billion in 2025 per Gartner, a figure Gartner expects to grow through 2028 as enforcement shifts to full-scale action.

FedRAMP: Authorization can be revoked if a tool in scope retains CUI in violation of its documented authorization boundary. Re-authorization is an extended process that blocks the use of the tool across all covered systems.

Beyond regulatory penalties, a breach traceable to an AI vendor's log retention becomes a procurement disqualifier. Government and enterprise procurement reviews will ask directly whether the organization has experienced any AI-related data-handling incidents. A yes answer changes the risk profile for every subsequent contract.

Mapping Vendor ZDR Commitments to Specific Compliance Frameworks

The practical compliance mapping for procurement teams:

HIPAA (45 CFR §164.312): The key requirement is demonstrating that PHI processed through AI was handled under appropriate technical safeguards. A Business Associate Agreement covers the contractual obligation. Hardware attestation covers the technical proof. ZDR with per-request attestation satisfies both: PHI is processed without persistence, and the hardware receipt proves the isolated execution.

SOC 2 Type II (Confidentiality): Auditors require evidence that confidentiality commitments applied to specific data handling events, not just to the platform in aggregate. Per-request attestation receipts are exportable, tamper-evident evidence that can be presented per transaction rather than as a platform-level policy statement.

FedRAMP: Data handling controls for CUI require demonstrating that the processing environment was trusted and that no unauthorized retention occurred. Intel TDX attestation, rooted in Intel's public PKI, combined with NVIDIA GPU attestation, rooted in NVIDIA's public PKI, provides hardware-backed evidence of trusted execution boundaries, the kind of verifiable proof FedRAMP assessors can evaluate independently.

GDPR (Article 5(1)(c), Data Minimization): Data must be limited to what is necessary for the processing purpose. ZDR, enforced architecturally so that no data persists after the request completes, is the strictest possible implementation of data minimization for AI inference. Per-request receipts provide documentary evidence that minimization was applied.

Conclusion

Enterprise AI procurement reviews are no longer asking whether a vendor has a ZDR policy. Security teams are asking whether the architecture makes the policy verifiable. Policy-based ZDR, as implemented through Vercel's AI Gateway, provides real and legally enforceable data-handling guarantees backed by negotiated provider agreements, covers the broadest model catalog, and satisfies compliance requirements when contractual evidence is sufficient. Hardware-enforced ZDR, built on Intel TDX confidential VMs and NVIDIA H100 GPU attestation, goes further: retention becomes technically impossible during execution, a cryptographic receipt per request is generated that neither ORGN nor any party in the infrastructure chain can forge or revoke, and the result is independently verifiable evidence that FedRAMP assessors, HIPAA reviewers, and SOC 2 auditors can examine directly.

The choice between them isn't binary; ORGN's gateway supports both models through the same OpenAI-compatible API. The model identifier in the request determines which execution environment handles the inference. Teams handling source code in ITAR-controlled programs, PHI in production healthcare systems, or financial data under an active SOC 2 audit scope should route those specific workloads through the hardware-attested path and export receipts as part of their compliance documentation. For everything else, policy-enforced ZDR with a broad model catalog is a practical, compliant choice. What's not acceptable for regulated workloads is treating the two architectures as equivalent just because they share a three-letter acronym.

FAQs

1. What is the difference between zero data retention and training opt-out for AI models?

Training opt-out means the vendor won't use your prompts to improve shared foundation models. ZDR means the data is never stored at all after the request completes. A vendor can offer training opt-out while still retaining your data in logs for abuse monitoring, billing, or debugging for up to 30 days, as OpenAI does by default.

2. Can zero data retention be verified independently, or do organizations have to take the vendor's word for it?

Policy-based ZDR cannot be independently verified per session. You're relying on the vendor's adherence to contractual commitments and any audit evidence they provide. Hardware-enforced ZDR via TEE attestation produces a cryptographic receipt per request, verifiable against Intel's and NVIDIA's public PKI infrastructure without trusting the platform.

3. Does zero data retention satisfy HIPAA's technical safeguard requirements when AI processes protected health information?

ZDR addresses the data minimization and storage aspects of HIPAA's technical safeguard requirements, but it's one component of a complete HIPAA-compliant AI architecture. You also need a signed Business Associate Agreement with the AI vendor, access controls specifying who can submit PHI to the model, transmission security for data in transit, and audit controls documenting that PHI was handled in accordance with stated policies.

4. What does "zero data retention" cover in practice, and does it include telemetry, usage logs, and session metadata?

Coverage varies significantly by vendor and requires explicit confirmation. Strict ZDR covers all four categories: prompts, completions, session telemetry tied to content, and abuse-monitoring caches. Many vendor implementations cover only model inputs and outputs, while retaining operational metadata including request IDs, token counts, latency, model name, and provider, all of which ORGN Gateway retains for observability while explicitly excluding prompt and response content.