Phala Network Docs
  • Home
    • 👾Phala Network Docs
  • Overview
    • ⚖️Phala Network
      • 💎Phala Cloud
      • 🥷Dstack
      • 🔐GPU TEE
    • 💎PHA Token
      • 🪙Introduction
      • 👐Delegation
        • Delegate to StakePool
        • What is Vault
        • What is Share
        • WrappedBalances & W-PHA
        • Examples of Delegation
        • Use Phala App to Delegate
        • Estimate Your Reward
      • 🗳️Governance
        • Governance Mechanism
        • Join the Council
        • Voting for Councillors
        • Apply for Project Funding
        • Phala Treasury
        • Phala Governance
        • Setting Up an Account Identity
  • Phala Cloud
    • 🚀Getting Started
      • Create Your Phala Cloud Account
      • Your First CVM Deployment
      • Explore Templates
        • Launch an Eliza Agent
        • Start from Template
    • 🪨TEEs, Attestation & Zero Trust Security
      • Attestation
      • Security Architecture
    • 🥷Phala Cloud User Guides
      • Deploy and Manage CVMs
        • Deploy CVM with Docker Compose
        • Set Secure Environment Variables
        • Deploy Private Docker Image to CVM
        • Debugging and Analyzing Logs
          • Check Logs
          • Private Log Viewer
          • Debug Your Application
        • Application Scaling & Resource Management
        • Upgrade Application
        • Deployment Cheat Sheet
      • Building with TEE
        • Access Your Applications
        • Expose Service Port
        • Setting Up Custom Domain
        • Secure Access Database
        • Create Crypto Wallet
        • Generate Remote Attestation
      • Advanced Deployment Options
        • Deploy CVM with Phala Cloud CLI
        • Deploy CVM with Phala Cloud API
        • Setup a CI/CD Pipeline
    • 🚢Be Production Ready
      • CI/CD Automation
        • Setup a CI/CD Pipeline
      • Production Checklist
      • Troubleshooting Guide
      • Glossary
    • 🔒Use Cases
      • TEE with AI
      • TEE with FHE and MPC
      • TEE with ZK and ZKrollup
    • 📋References
      • Phala Cloud CLI Reference
        • phala
          • auth
          • cvms
          • docker
          • simulator
      • Phala Cloud API & SDKs
        • API Endpoints & Examples
        • SDKs and Integrations
      • Phala Cloud Pricing
    • ❓FAQs
  • Dstack
    • Overview
    • Getting Started
    • Hardware Requirements
    • Design Documents
      • Decentralized Root-of-Trust
      • Key Management Service
      • Zero Trust HTTPs (TLS)
    • Acknowledgement
  • LLM in GPU TEE
    • 👩‍💻Host LLM in GPU TEE
    • 🔐GPU TEE Inference API
    • 🏎️GPU TEE Benchmark
  • Tech Specs
    • ⛓️Blockchain
      • Blockchain Entities
      • Cluster of Workers
      • Secret Key Hierarchy
  • References
    • 🔐Setting Up a Wallet on Phala
      • Acquiring PHA
    • 🌉SubBridge
      • Cross-chain Transfer
      • Supported Assets
      • Asset Integration Guide
      • Technical Details
    • 👷Community Builders
    • 🤹Hackathon Guides
      • ETHGlobal Singapore
      • ETHGlobal San Francisco
      • ETHGlobal Bangkok
    • 🤯Advanced Topics
      • Cross Chain Solutions
      • System Contract and Drivers
      • Run Local Testnet
      • SideVM
    • 🆘Support
      • Available Phala Chains
      • Resource Limits
      • Transaction Costs
      • Compatibility Matrix
      • Block Explorers
      • Faucet
    • ⁉️FAQ
  • Compute Providers
    • 🙃Basic Info
      • Introduction
      • Gemini Tokenomics (Worker Rewards)
      • Budget balancer
      • Staking Mechanism
      • Requirements in Phala
      • Confidence Level & SGX Function
      • Rent Hardware
      • Error Summary
    • 🦿Run Workers on Phala
      • Solo Worker Deployment
      • PRBv3 Deployment
      • Using PRBv3 UI
      • PRB Worker Deployment
      • Switch Workers from Solo to PRB Mode
      • Headers-cache deployment
      • Archive node deployment
    • 🛡️Gatekeeper
      • Collator
      • Gatekeeper
  • Web Directory
    • Discord
    • GitHub
    • Twitter
    • YouTube
    • Forum
    • Medium
    • Telegram
Powered by GitBook
LogoLogo

Participate

  • Compute Providers
  • Node
  • Community
  • About Us

Resources

  • Technical Whitepaper
  • Token Economics
  • Docs
  • GitHub

More

  • Testnet
  • Explorer
  • Careers
  • Responsible Disclosure

COPYRIGHT © 2024 PHALA.LTD ALL RIGHTS RESERVED. May Phala be with you!

On this page
  • Overview
  • Implementation
  • References

Was this helpful?

Edit on GitHub
  1. LLM in GPU TEE

Host LLM in GPU TEE

PreviousAcknowledgementNextGPU TEE Inference API

Last updated 1 month ago

Was this helpful?

Overview

Private AI or called confidential AI addresses critical concerns such as data privacy, secure execution, and computation verifiability, making it indispensable for sensitive applications. As illustrated in the diagram below, people currently cannot fully trust the responses returned by LLMs from services like OpenAI or Meta, due to the lack of cryptographic verification. By running the LLM inside a TEE, we can add verification primitives alongside the returned response, known as a Remote Attestation (RA) Report. This allows users to verify the AI generation results locally without relying on any third parties.

Implementation

References

The implementation for running LLMs in GPU TEE is available in the GitHub repository. This project is built by Phala Network and was made possible through a grant from NEARAI. The SDK provides the necessary tools and infrastructure to deploy and run LLMs securely within GPU TEE.

👩‍💻
private-ml-sdk
HCC-Whitepaper
Intel SGX DCAP Orientation
Phala's dcap-qvl
Automata's Solidity Implementation
Phala Nvidia H200 TEE Benchmark Paper
Phala DeRoT Post on FlashBots forum
Phala Key Management Protocol Post on Flashbots forum
Private ML SDK