Phala Network Docs
COPYRIGHT © 2024 PHALA.LTD ALL RIGHTS RESERVED. May Phala be with you!

LLM in GPU TEE

FAQs

What is the relationship between dstack and the Private ML SDK, and what does vllm-proxy do?

The Private ML SDK leverages the TEE features of Intel CPUs and NVIDIA GPUs to run any GPU workload as a Docker container inside a GPU TEE. Within the Private ML SDK, dstack is the framework that runs Docker containers, and vllm-proxy is a specific container managed by it. The vllm-proxy container acts as a server that forwards requests to a vllm container (hosting a large language model) and generates attestation quotes to attach to the responses. This separation allows vllm to remain unmodified and compatible with its latest releases.
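Conceptually, the proxy leaves the model server untouched and only post-processes its responses. The following is a minimal sketch of that idea in Python; the field names and the quote source are illustrative assumptions, not the actual vllm-proxy API:

```python
import hashlib
import json

def attach_quote(vllm_response: dict, quote: bytes) -> dict:
    """Sketch: bind a TEE attestation quote to an unmodified vllm response.

    `quote` stands in for a quote fetched from the TEE's quoting
    interface (hypothetical here); the "attestation" field name is an
    illustration, not the real vllm-proxy response schema.
    """
    body = json.dumps(vllm_response, sort_keys=True).encode()
    return {
        **vllm_response,
        "attestation": {
            "quote": quote.hex(),
            # Hash of the exact response body, so a verifier can tie
            # this particular reply to the attested environment.
            "response_sha256": hashlib.sha256(body).hexdigest(),
        },
    }
```

A real verifier would additionally check the quote against the vendor's attestation service and compare the measurements embedded in it.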

Do I need CUDA or NVIDIA drivers on the host for Private ML SDK with GPU support?

No. CUDA and the NVIDIA driver are not required on the host for the Private ML SDK with GPU support. In fact, the host should not have the NVIDIA driver installed, to avoid conflicts with TDX and VFIO passthrough. The driver (e.g., 550.54.15) and CUDA (e.g., 12.4) are loaded inside the guest CVM via Docker containers (e.g., kvin/cuda-notebook). The vllm and vllm-proxy services likewise run entirely within the TEE guest, not on the host.
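On the host, the practical consequence is that the GPU should be bound to the `vfio-pci` kernel driver rather than `nvidia`. A small sketch of how one could check this via Linux sysfs (an illustrative helper, not part of the SDK):

```python
import os

def bound_driver(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Return the kernel driver bound to a PCI device, or None if unbound.

    For VFIO passthrough into a CVM, the GPU on the host should report
    'vfio-pci' here, not 'nvidia'. Error handling is minimal; this is a
    sketch only.
    """
    link = os.path.join(sysfs_root, pci_addr, "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))
```

For example, `bound_driver("0000:01:00.0")` on a correctly prepared host would return `"vfio-pci"`.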

Is CUDA directly accessible in Phala GPU TEE?

Yes, CUDA works in the GPU TEE environment. By default you do not need to install it on the host when using private-ml-sdk to create a CVM. If you want to use CUDA in your CVM, you can install it manually inside the CVM. You can refer to our benchmark analysis with Succinct, where the SP1 zkVM also uses CUDA.
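One way to see the host/guest difference is a best-effort probe for the CUDA driver library: inside a CVM it succeeds once the driver has been installed in the guest, while on a correctly configured host (no NVIDIA driver) it fails. An illustrative helper, not part of the SDK:

```python
import ctypes

def cuda_available():
    """Best-effort probe: can the CUDA driver library be loaded?

    Returns True only where the NVIDIA driver stack is present
    (e.g., inside the guest CVM after installing CUDA), False otherwise.
    """
    for name in ("libcuda.so.1", "libcuda.so"):
        try:
            ctypes.CDLL(name)
            return True
        except OSError:
            continue
    return False
```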

Can I run my app in a Docker container with access to a GPU TEE under Intel TDX? Is this similar to Google Cloud's Confidential Space?

Yes. You can build your application into a Docker image and prepare a docker-compose file; the SDK will create a TDX-based virtual machine to launch your application. However, unlike Google Cloud's Confidential Space, which runs on virtual machines, dstack and private-ml-sdk require a bare-metal environment.
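A minimal compose file for such a workload could look like the sketch below. The image name is a placeholder; the `deploy.resources` block is the standard Compose way to request GPU devices:

```yaml
# Hypothetical docker-compose.yml for a GPU workload inside the CVM.
services:
  my-app:
    image: yourrepo/your-gpu-app:latest   # placeholder image name
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```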

Is a dstack-based CVM (including private-ml-sdk's) running on bare metal or on a hypervisor?

dstack itself is launched on a bare-metal environment. The CVMs created by dstack, however, are launched by a hypervisor (QEMU) with TDX enabled.
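Under the hood, a TDX guest is an ordinary QEMU VM started with TDX-specific options. A rough sketch, following upstream QEMU's TDX support; the exact flags vary by QEMU and kernel version, and the paths, sizes, and PCI address are placeholders, not what private-ml-sdk actually assembles:

```sh
# Illustrative only, not a copy-paste recipe.
qemu-system-x86_64 \
  -accel kvm \
  -machine q35,kernel-irqchip=split,confidential-guest-support=tdx0 \
  -object tdx-guest,id=tdx0 \
  -cpu host \
  -m 16G -smp 8 \
  -bios /path/to/OVMF.fd \
  -device vfio-pci,host=01:00.0 \
  -nographic
```

The `-device vfio-pci` line is where the GPU is passed through from the bare-metal host into the guest.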
