Host LLM in GPU TEE

Overview

Private AI, also called confidential AI, addresses critical concerns such as data privacy, secure execution, and computation verifiability, which makes it indispensable for sensitive applications. As illustrated in the diagram below, users currently cannot fully trust the responses returned by LLMs from services like OpenAI or Meta, because those responses carry no cryptographic verification. Running the LLM inside a TEE lets the service return a verification primitive, known as a Remote Attestation (RA) report, alongside each response. Users can then verify the generated results locally without relying on any third party.
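To make the local verification step more concrete, here is a minimal sketch of one part of it: checking that a response is bound to an attestation report by hash. The `AttestationReport` layout, the `report_data` binding scheme, and all function names are assumptions made for illustration, not the format used by any particular TEE or SDK. A real verifier must also validate the hardware vendor's signature over the report itself, which this sketch leaves out.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class AttestationReport:
    """Hypothetical, simplified stand-in for a Remote Attestation report."""
    measurement: str  # hex digest identifying the code loaded in the TEE
    report_data: str  # hex digest the TEE bound into the report


def response_digest(prompt: str, response: str) -> str:
    """SHA-256 over the length-prefixed prompt and response.

    Length prefixes keep ("ab", "c") and ("a", "bc") from colliding.
    """
    h = hashlib.sha256()
    for part in (prompt, response):
        data = part.encode("utf-8")
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return h.hexdigest()


def is_bound_to_report(
    prompt: str,
    response: str,
    report: AttestationReport,
    expected_measurement: str,
) -> bool:
    """Check the code measurement and the prompt/response binding.

    Assumes the report's signature chain was already verified elsewhere.
    """
    same_code = hmac.compare_digest(report.measurement, expected_measurement)
    same_output = hmac.compare_digest(
        report.report_data, response_digest(prompt, response)
    )
    return same_code and same_output
```

Under these assumptions, a client would recompute the digest from the prompt it sent and the response it received, then compare it with the value in the report. Any change to the response after it left the TEE makes the check fail.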

Implementation

An implementation for running LLMs in a GPU TEE is available in the private-ml-sdk GitHub repository. The project is built by Phala Network and was made possible through a grant from NEARAI. The SDK provides the tools and infrastructure needed to deploy and run LLMs securely inside a GPU TEE.
