💎Confidential AI API
Overview
When the service start, it will generate a signing key in TEE.
You can get the CPU and GPU attestation to verify the service is running in Confidential VM with NVIDIA H100 in TEE mode.
The attestation includes the public key of the signing key to prove the key is generated in TEE.
All the inference results contain signature with the signing key.
You can use the public key to verify all the inference results is generated in TEE.
Endpoint
https://inference-api.phala.network/
Attestation & Public Key
Request
GET https://inference-api.phala.network/v1/attestation/report
Sample Response
Verify the Attestation
Verify GPU Attestation
You can copy the value of nvidia_payload
as the whole payload as followed to verify:
Verify TDX Quote
Theoretically, you can verify the Intel TDX quote with the value of intel_quote
at anywhere that provide TDX quote verification service. The screenshot below is an example of how to verify the Intel TDX quote with the Automata's on-chain attestation smart contract. For Automata example, just need to convert the returned base64 encoded quote to hex format (take Node for example).
Chat API
OpenAI-compatible API. See: https://platform.openai.com/docs/api-reference/chat
Request
Endpoint: POST https://inference-api.phala.network/v1/chat/completions
model
in the request body: currently we only support:
meta-llama/meta-llama-3.1-8b-instruct
google/gemma-2-9b-it
microsoft/phi-3-mini-4k-instruct
Sample Request
That sha256 of the request body is bcf152411970b14faab35a76d559b4188b78c24ced0048d0edcd320bf47bff0a
(note: in this example, there is no new line in the end of request)
Sample Response
The sha256sum of response body is 2e704942816901eaf435945fd01d76346e5dd283d5f8a2391e525e8f9a9ef36e
(note: in this example, there are two new line in the end of response)
Signature
By default, you can query another API with the value of id
in the response in 5 minutes. With this way, you can have maximum compatible with OpenAPI with your existing code.
Request
GET https://inference-api.phala.network/v1/signature/{request_id}
For example, the response in the previous section, the id
is chat-7ee2a39468ce48d7b2284783f21782b0
:
GET https://inference-api.phala.network/v1/signature/chat-7ee2a39468ce48d7b2284783f21782b0
Response
Text: the message you may want to verify. It is joined by the sha256 of the HTTP request body, and of the HTTP response body, separated by a colon
:
.Signature.
Sample Response
We can see that the text
is bcf152411970b14faab35a76d559b4188b78c24ced0048d0edcd320bf47bff0a:2e704942816901eaf435945fd01d76346e5dd283d5f8a2391e525e8f9a9ef36e
Exactly match the value we calculated in the sample in previous section.
Limitation
Since the resource limitation, the signature will be kept in the memory for 5 minutes since the response is generated.
Verify Signature
Go to https://etherscan.io/verifiedSignatures, click Verify Signature
:
Address: You can get the address from the attestation API. The address should be same if the service did not restarted.
Message: see the Response of the Signature section. You can also calculate the sha256 by yourselves.
Signature Hash: See the Signature section.
Last updated