Run AI models with enterprise-grade security without sacrificing performance. Phala Cloud Confidential AI protects your models and data using GPU TEEs, hardware-isolated environments that keep your AI workloads private and verifiable.

Why Confidential AI?

Traditional cloud AI deployments expose your models and data to the cloud provider. Confidential AI solves this by running everything inside a hardware-protected TEE (Trusted Execution Environment). Your models stay private, your data stays secure, and you get cryptographic proof that execution happened in a trusted environment.

Your Options

Phala Cloud offers two ways to deploy confidential AI:

API and Models

Get started quickly with pre-deployed models running in GPU TEEs. This option works best if you want to:
  • Use existing code: Drop-in replacement for OpenAI APIs
  • Access popular models: DeepSeek, Llama, GPT-OSS, and Qwen models ready to use
  • Verify execution: Get attestation reports proving your code ran in a TEE
  • Pay as you go: Only pay for what you use
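To illustrate the drop-in OpenAI compatibility described above, the sketch below builds a standard chat-completions request using only the Python standard library. The base URL, API key, and model name are placeholders, not confirmed values; in practice you would take these from your Phala Cloud dashboard, and the only things that change versus a stock OpenAI integration are the base URL and the model name.

```python
import json
import urllib.request

# Placeholder values -- substitute the endpoint, key, and model
# from your Phala Cloud dashboard (these are not real endpoints).
BASE_URL = "https://example-phala-endpoint.invalid/v1"
API_KEY = "sk-..."  # your API key

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style /chat/completions request (constructed, not sent)."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("deepseek-chat", "Hello from a TEE!")
# The request is wire-compatible with the OpenAI Chat Completions API;
# to actually send it: urllib.request.urlopen(req)
```

Because the wire format is identical, existing OpenAI client code typically needs only a base-URL change to target a TEE-hosted model.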

Confidential GPU

Deploy your own models on dedicated GPU TEE servers. Choose this when you need:
  • Custom models: Run your own fine-tuned or proprietary models
  • High performance: H200, H100, and B200 GPUs available
  • Full control: Configure CPU, RAM, storage, and location
  • Flexible scaling: 1 or 8 GPUs per server, various commitment options

Open Source Foundation

The underlying technology is open source. Check out the private-ml-sdk repository to see how LLMs run securely in GPU TEEs. This project was built by Phala Network with support from NEARAI.
