
Chutes (SN64) End-to-End Encryption Is Live | Bittensor
Chutes (SN64) just shipped end-to-end encryption for AI inference. The feature encrypts prompts directly on the developer’s machine and sends them to a specific GPU instance running inside a Trusted Execution Environment (TEE). The encrypted data passes through the Chutes API and load balancers, but neither layer can read it. No one in the chain can. Not the network, not Chutes itself, and not the miners operating the hardware.
How Chutes (SN64) End-to-End Encryption Works
The encryption stack uses ML-KEM-768, a NIST-standardized post-quantum key encapsulation mechanism (FIPS 203). It combines this with HKDF-SHA256 for key derivation and ChaCha20-Poly1305 for authenticated encryption.
Each TEE instance publishes an ML-KEM public key, and every request encapsulates against it with a fresh ephemeral client keypair. The ephemeral keypairs provide forward secrecy, so compromising one session key exposes no other session. And because ML-KEM-768 is designed to resist quantum attacks, an adversary who captured every packet today still could not decrypt them with a future quantum computer.
The post-quantum design addresses a growing concern in cybersecurity. So-called “harvest now, decrypt later” attacks involve adversaries collecting encrypted traffic now and waiting for quantum hardware to break it later. ML-KEM-768 is built to prevent exactly that scenario.
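The key-derivation step of the stack can be sketched in plain Python. The function below implements HKDF-SHA256 as specified in RFC 5869 using only the standard library; the salt, info label, and output length are illustrative placeholders, not Chutes' actual parameters, and os.urandom stands in for the per-request shared secret that ML-KEM encapsulation would produce.

```python
import hashlib
import hmac
import os

def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF per RFC 5869 with SHA-256: extract, then expand."""
    # Extract: PRK = HMAC-SHA256(salt, input keying material)
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until `length` bytes of output keying material
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

# Each request derives its symmetric key from a fresh ephemeral secret,
# so two requests never share a ChaCha20-Poly1305 key.
secret_a = os.urandom(32)  # stand-in for an ML-KEM shared secret
secret_b = os.urandom(32)
key_a = hkdf_sha256(secret_a, salt=b"\x00" * 32, info=b"e2ee-demo")
key_b = hkdf_sha256(secret_b, salt=b"\x00" * 32, info=b"e2ee-demo")
assert key_a != key_b  # fresh secret per request -> independent session keys
```

Because each session key is derived from a secret that exists only for the lifetime of one request, recording traffic today yields nothing decryptable later, which is the forward-secrecy property described above.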
Two Ways to Use Chutes (SN64) End-to-End Encryption
The feature supports two integration paths depending on the developer’s setup.
The first option targets developers already using the OpenAI Python SDK. They install the chutes-e2ee package and pass a custom transport into their client. The base URL stays the same. Encryption happens transparently at the HTTP layer with minimal client wiring.
The second option works with any other client or platform. Developers run the e2ee-proxy Docker container locally and point their client at it. The proxy supports both OpenAI-compatible APIs, including the newer Responses API spec used by tools like Codex, and Anthropic’s Messages API spec for Claude-style clients. The proxy handles format translation, key exchange, encryption, and streaming decryption automatically.
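The proxy path might look roughly like the following. This is a sketch only: the port, build steps, and endpoint path are assumptions for illustration, not documented defaults, so check the repository's README for the actual invocation.

```shell
# Build and run the proxy from the open-source repo
# (port 8000 and the flags below are illustrative assumptions).
git clone https://github.com/chutesai/e2ee-proxy
cd e2ee-proxy
docker build -t e2ee-proxy .
docker run -p 8000:8000 e2ee-proxy

# Then point any OpenAI-compatible client at the local proxy instead of
# the remote API; the proxy performs the key exchange, encrypts the
# request, and decrypts the streamed response transparently.
curl http://localhost:8000/v1/chat/completions \
  -H "Authorization: Bearer $CHUTES_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "<model-id>", "messages": [{"role": "user", "content": "hello"}]}'
```

The same local endpoint would serve Anthropic-style clients, since the proxy translates between the Messages API and OpenAI-compatible formats.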
Streaming works with both options. Normal token-based billing semantics are preserved regardless of which path a developer chooses. The code is open source under the MIT license and available on GitHub through the Chutes Global Corp organization.
What This Means for Bittensor Subnet 64
Chutes (SN64) end-to-end encryption works across all models on the platform, with the strongest privacy guarantees on TEE-enabled models. The feature removes the infrastructure provider from the trust chain entirely. In practice, this means that the privacy guarantee is not contractual. It is mathematically enforced.
This positions Chutes differently from most AI providers, both centralized and decentralized. Traditional providers ask developers to trust that their data is handled responsibly. Chutes argues that with end-to-end encryption and TEEs, trust is no longer required because the system makes reading the data technically impossible.
Whether this level of security becomes a meaningful differentiator for enterprise adoption remains an open question. However, the technical implementation itself is serious. Post-quantum cryptography at the inference layer is not something most AI platforms offer today, centralized or otherwise.
Check it:
https://github.com/chutesai/e2ee-proxy
https://github.com/chutesai/chutes-e2ee-transport
FAQ:
What is Chutes (SN64) end-to-end encryption?
Chutes (SN64) end-to-end encryption is a feature that encrypts AI inference requests on the developer’s machine before they leave it. The encrypted data travels through the Chutes network to a GPU running inside a Trusted Execution Environment. No intermediary, including Chutes itself, can decrypt the content.
What cryptography does it use?
The system uses ML-KEM-768 for post-quantum key encapsulation, HKDF-SHA256 for key derivation, and ChaCha20-Poly1305 for authenticated encryption. Every request generates a fresh ephemeral keypair for forward secrecy.
How do developers enable it?
There are two paths. OpenAI Python SDK users install the chutes-e2ee package and pass a custom transport to their client. Everyone else runs the e2ee-proxy Docker container locally. Streaming works with both options, and the code is open source under the MIT license.
Why does the post-quantum design matter?
ML-KEM-768 is a NIST-standardized algorithm specifically designed to resist attacks from future quantum computers. The ephemeral keypair per request adds forward secrecy, which protects past communications even if long-term keys are compromised in the future.


