🔐GPU TEE Inference API
Phala Cloud provides LLM inference API by connecting to Redpill. By navigating to Dashboard->GPU TEE API page, you will see details.
Enable API Service
In order to use the API, user needs to deposit at least $5 in advance to enable a Redpill account if your balance is not sufficient. You can head to Dashboard->Billing page then click Deposit button to top-up with your bank card or crypto wallet.

Generate API Key
At the bottom of the page, click the Enable button to connect your Cloud account with a Redpill account, and then click the button Create New API Key. Copy the key to use when you interact with Redpill API.

Redpill is a models marketplace that supports private AI inference. It currently supports two models that are running in GPU TEE, you can view them in the models page by clicking the GPU TEE
checkbox:

Chat With Private AI
We provide OpenAI-compatible API for you to send chat requests to the LLM running inside TEE, where you just need to use the API endpoint https://api.red-pill.ai/v1/chat/completions
. A simple request could be like:
curl -X 'POST' \
'https://api.red-pill.ai/v1/chat/completions' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <The API Key you generated previously>' \
-d '{
"messages": [
{
"content": "You are a helpful assistant.",
"role": "system"
},
{
"content": "What is your model name?",
"role": "user"
}
],
"stream": true,
"model": "phala/deepseek-r1-70b"
}'
Sample Response
...
data: {"id":"chatcmpl-0cdf7629fcfa4135bbdb9936e737e95c","object":"chat.completion.chunk","created":1740415146,"model":"/mnt/models/deepseek-r1-70b/deepseek-r1-70b.guff","choices":[{"index":0,"delta":{"content":""},"logprobs":null,"finish_reason":"stop","stop_reason":128001}]}
data: [DONE]
Get TEE Attestation Report
You can verify if the LLM is running in GPU TEE. This can be done by verifying its attestation report. To get the attestation report of the LLM inference, you can do this by sending a POST request to the Redpill API endpoint like below:
curl 'https://api.red-pill.ai/v1/attestation/report?model=phala/deepseek-r1-70b' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <The API Key you generated previously>'
The response will be like:
{
"signing_address": "...",
"nvidia_payload": "...",
"intel_quote": "...",
"all_attestations": [
{
"signing_address": "...",
"nvidia_payload": "...",
"intel_quote": "..."
}
]
}
The signing_address
is the account address generated inside TEE that will be used to sign the chat message later.
The all_attestations
is the list of all the attestations of all GPU nodes since we add more TEE nodes to serve the inference requests. You can utilize the signing_address
from the all_attestations
to select the appropriate TEE node for verifying its integrity.
Verify Attestation Report
Verify GPU Attestation Report
You can copy the value of nvidia_payload as the whole payload as followed to verify:
curl -X POST https://nras.attestation.nvidia.com/v3/attest/gpu \
-H "accept: application/json" \
-H "content-type: application/json" \
-d "<NVIDIA_PAYLOAD_FROM_ABOVE>"
Verify TDX Attestation Report
You can verify the Intel TDX Attestation Report, aka quote with the value of intel_quote
at TEE Attestation Explorer.
The signing_address
is the account address generated inside TEE that will be used to sign the chat response. You can go to https://etherscan.io/verifiedSignatures, click Verify Signature, and paste the signing_address
and message response to verify it.
nvidia_payload
and intel_quote
are the attestation report from NVIDIA TEE and Intel TEE respectively. You can use them to verify the integrity of the TEE. See Verify the Attestation for more details.
Note: The trust chain works as follows: when you verify the attestation report, you trust the model provider (Redpill) and the TEE providers (NVIDIA and Intel). You then trust the open-source, reproducible code by verifying the source code here. Finally, you trust the cryptographic key derived inside the TEE. This is why we only need to verify the signature of the message during chat.
Verify Chat Signature
If you chat with the LLM, the response will contain an id
which you can use to get the chat Signature later.
Sample Request
curl -X 'POST' \
'https://api.red-pill.ai/v1/chat/completions' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer <REDPILL_API_KEY>' \
-d '{
"messages": [
{
"content": "You are a helpful assistant.",
"role": "system"
},
{
"content": "What is your model name?",
"role": "user"
}
],
"stream": true,
"model": "phala/deepseek-r1-70b"
}'
That sha256 of the request body is e5542b0757e0b9d05bfa4a15da7bac97a03bd35d21b648ec492152708e795ff9
(note: in this example, there is no new line in the end of request)
Simple Response
...
data: {"id":"chatcmpl-0cdf7629fcfa4135bbdb9936e737e95c","object":"chat.completion.chunk","created":1740415146,"model":"/mnt/models/deepseek-r1-70b/deepseek-r1-70b.guff","choices":[{"index":0,"delta":{"content":""},"logprobs":null,"finish_reason":"stop","stop_reason":128001}]}
data: [DONE]
The sha256sum of response body is 7a97926adb2044fd598b392eee98ad8f7c39ea3a47747ca968ef755bbf57c211
(note: in this example, there are two new line in the end of response)
The id
is calculated by sha256sum(sha256sum(request_body) + sha256sum(response_body)).
Request Chat Signature
By default, you can query another API with the value of id in the response in 30 minutes.
Request
GET https://api.red-pill.ai/v1/signature/{request_id}?model={model_id}&signing_algo=ecdsa
For example, the response in the previous section, the id is chatcmpl-0cdf7629fcfa4135bbdb9936e737e95c
:
Response
{
"text": "e5542b0757e0b9d05bfa4a15da7bac97a03bd35d21b648ec492152708e795ff9:7a97926adb2044fd598b392eee98ad8f7c39ea3a47747ca968ef755bbf57c211",
"signature": "faf0316a4860fd3d412cb5851b55687edc31f5600b4667502cf32112e1ad533b5d6420beb1fd7002334a46d897e11347837675bc01982485e00549091b06f8a81b",
"signing_algo": "ecdsa"
}
text: the message you may want to verify. It is joined by the sha256 of the HTTP request body, and of the HTTP response body, separated by a colon :.
signature: the signature data.
signing_algo: The cryptographic scheme that the signer private key generated.
Exactly match the value we calculated in the sample in previous section.
Limitation
Since the resource limitation, the signature will be kept in the memory for 5 minutes since the response is generated.
Verify Signature on etherscan
Go to https://etherscan.io/verifiedSignatures, click Verify Signature:
Address: You can get the address from the attestation API. The address should be same if the service did not restart.
Message: see the Response of the Signature section. You can also calculate the sha256 by yourself.
Signature Hash: See the Signature section.

Last updated
Was this helpful?