
Get generation

GET https://zenmux.ai/api/v1/generation?id=<generation_id>

The Get generation endpoint retrieves the details of a single generation, such as token usage and cost.

TIP

This endpoint supports retrieving generation details for all API protocols, including OpenAI Chat Completions, OpenAI Responses, Anthropic, and Vertex AI.

⚠️ Subscription Plan Limitations

This endpoint supports billing queries only for Pay As You Go API keys. If you call it with a subscription-plan API key (prefixed with sk-ss-v1-), billing-related fields such as usage and ratingResponses are not returned.

To retrieve billing information, please use a Pay As You Go API key. See:

Metering and Billing Information

Metering (Token Usage)

Metering data (e.g., token usage, which this endpoint reports in the nativeTokens field) is returned synchronously in the response to the original request, in the protocol’s native format:

  • OpenAI Chat Completions protocol: returned in the response usage field
  • OpenAI Responses protocol: returned in the response usage field
  • Anthropic protocol: returned in the response usage field
  • Vertex AI protocol: returned in the response usageMetadata field
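The mapping above can be sketched as a small lookup. This is a minimal illustration, not part of the ZenMux API: the function name `extract_metering` and the sample response fragments are hypothetical, and the protocol labels simply reuse the `api` values this endpoint reports.

```python
from typing import Any, Mapping

# Protocol label -> response field that carries token usage, as listed above.
# The labels reuse the `api` values returned by the Get generation endpoint.
USAGE_FIELD_BY_PROTOCOL = {
    "chat.completions": "usage",
    "responses": "usage",
    "messages": "usage",
    "generateContent": "usageMetadata",
}


def extract_metering(protocol: str, response: Mapping[str, Any]) -> Any:
    """Return the protocol-native usage object from a response body, or None."""
    field = USAGE_FIELD_BY_PROTOCOL.get(protocol)
    if field is None:
        raise ValueError(f"unknown protocol: {protocol!r}")
    return response.get(field)


# Made-up response fragments, for illustration only.
openai_like = {"usage": {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}}
vertex_like = {"usageMetadata": {"totalTokenCount": 42}}

chat_usage = extract_metering("chat.completions", openai_like)
vertex_usage = extract_metering("generateContent", vertex_like)
```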

Billing (Billing & Costs)

Billing data (cost-related fields such as usage and ratingResponses) is not currently returned synchronously in the response. After the request completes, wait 3–5 minutes and then query the billing data through this endpoint.
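Because billing data lags the request by a few minutes, a client may want to wait and then poll. The sketch below is one possible approach under stated assumptions: `wait_for_billing` and its parameters are hypothetical names, the caller supplies a `fetch` callable that queries this endpoint, and the code assumes billing fields are absent or null until they are ready (the page does not say which).

```python
import time
from typing import Any, Callable, Mapping, Optional


def wait_for_billing(
    fetch: Callable[[], Mapping[str, Any]],
    initial_delay_s: float = 180.0,   # 3 minutes, the lower bound quoted above
    retry_interval_s: float = 30.0,   # assumed polling interval
    max_attempts: int = 5,            # 3 min + 4 x 30 s covers the 5-minute bound
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Mapping[str, Any]]:
    """Poll `fetch` until billing fields appear; return None if they never do."""
    sleep(initial_delay_s)
    for attempt in range(max_attempts):
        details = fetch()
        # Assumption: billing is ready once both fields are present and non-null.
        if details.get("usage") is not None and details.get("ratingResponses") is not None:
            return details
        if attempt < max_attempts - 1:
            sleep(retry_interval_s)
    return None
```

Passing `sleep` as a parameter keeps the function testable without real delays; in production the default `time.sleep` applies.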

💡 Billing upgrade in progress

We’re improving and upgrading our billing architecture to enable synchronous billing data in responses as soon as possible. Stay tuned!

Request params

Authorization Header

Header parameters:

Authorization: Bearer <ZENMUX_API_KEY>
  • Name: Authorization
  • Format: Bearer <API_KEY>
  • Description: Your ZenMux API key
    • Pay As You Go API key: supports querying full metering and billing information
    • Subscription API key (prefixed with sk-ss-v1-): supports metering only; billing information is not supported
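As a rough sketch of how the header and key types fit together, the following builds the request with Python's standard library. The helper names (`is_subscription_key`, `build_generation_request`, `fetch_generation`) are hypothetical; only the URL, the `id` query parameter, the Bearer header, and the sk-ss-v1- prefix come from this page. Treating every key without that prefix as Pay As You Go is an assumption.

```python
import json
import urllib.parse
import urllib.request

GENERATION_URL = "https://zenmux.ai/api/v1/generation"
SUBSCRIPTION_KEY_PREFIX = "sk-ss-v1-"  # prefix documented above


def is_subscription_key(api_key: str) -> bool:
    """True if the key carries the subscription-plan prefix (metering only)."""
    return api_key.startswith(SUBSCRIPTION_KEY_PREFIX)


def build_generation_request(generation_id: str, api_key: str) -> urllib.request.Request:
    """Build the GET request; the generation id is URL-encoded into ?id=..."""
    query = urllib.parse.urlencode({"id": generation_id})
    return urllib.request.Request(
        f"{GENERATION_URL}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_generation(generation_id: str, api_key: str, timeout_s: float = 10.0) -> dict:
    """Send the request and decode the JSON body (performs a network call)."""
    request = build_generation_request(generation_id, api_key)
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)
```

A caller could check `is_subscription_key(key)` before relying on billing fields, since subscription keys return metering only.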

💡 Get an API key

Query parameters:

generate_id string

The generation ID returned by ZenMux API endpoints, passed in the id query parameter (?id=<generation_id>). You can obtain it from:

Returns

api string

API type. Values vary by protocol:

  • chat.completions - OpenAI Chat Completions protocol
  • responses - OpenAI Responses protocol
  • messages - Anthropic protocol
  • generateContent - Vertex AI protocol

generationId string

The current generation id.

model string

Model ID.

createAt string

The time when the server received the inference request.

generationTime integer

Total duration of this inference from first token to completion, in milliseconds.

latency integer

Time to first token, in milliseconds.
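Taken together, the two timing fields split a request into time before the first token and time after it. The sketch below adds them to approximate end-to-end duration; this relies on reading generationTime as starting at the first token, as described above. The function name and the sample values are made up.

```python
from typing import Any, Mapping


def end_to_end_ms(generation: Mapping[str, Any]) -> int:
    """Approximate total request time: time to first token plus generation time."""
    return generation["latency"] + generation["generationTime"]


# Made-up values for illustration: 400 ms to first token, 1600 ms of generation.
sample = {"latency": 400, "generationTime": 1600}
total_ms = end_to_end_ms(sample)
```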

nativeTokens object

Token usage for this inference, including:

  • completion_tokens integer - Tokens used for the completion
  • prompt_tokens integer - Tokens used for the prompt
  • total_tokens integer - Total tokens
  • completion_tokens_details object - Completion token details
    • reasoning_tokens integer - Tokens used for reasoning
  • prompt_tokens_details object - Prompt token details
    • cached_tokens integer - Cached tokens
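The nested detail objects may be missing for some models, so a reader might flatten nativeTokens defensively. This is a minimal sketch with a hypothetical helper name; it assumes cached tokens are counted inside prompt_tokens (the usual OpenAI-style convention, not confirmed on this page), and the sample numbers are invented.

```python
from typing import Any, Mapping


def summarize_tokens(native_tokens: Mapping[str, Any]) -> dict:
    """Flatten nativeTokens into a simple dict, tolerating missing detail objects."""
    prompt_details = native_tokens.get("prompt_tokens_details") or {}
    completion_details = native_tokens.get("completion_tokens_details") or {}
    prompt = native_tokens.get("prompt_tokens", 0)
    completion = native_tokens.get("completion_tokens", 0)
    cached = prompt_details.get("cached_tokens", 0)
    return {
        "prompt": prompt,
        "cached_prompt": cached,
        # Assumption: cached tokens are a subset of prompt_tokens.
        "uncached_prompt": prompt - cached,
        "completion": completion,
        "reasoning": completion_details.get("reasoning_tokens", 0),
        "total": native_tokens.get("total_tokens", prompt + completion),
    }


# Invented sample payload for illustration.
sample_tokens = {
    "prompt_tokens": 100,
    "completion_tokens": 50,
    "total_tokens": 150,
    "prompt_tokens_details": {"cached_tokens": 40},
    "completion_tokens_details": {"reasoning_tokens": 20},
}
summary = summarize_tokens(sample_tokens)
```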

streamed boolean

Whether the response is streamed.

finishReason string

The reason the model stopped generating.

usage number

Credits consumed by this inference.

ratingResponses object

Billing response details, including:

  • billAmount number - Billed amount
  • discountAmount number - Discount amount
  • originAmount number - Original amount
  • priceVersion string - Price version
  • ratingDetails array - Billing detail items, each containing:
    • billAmount number - Billed amount
    • discountAmount number - Discount amount
    • feeItemCode string - Fee item code (e.g., completion, prompt)
    • originAmount number - Original amount
    • rate number - Rate
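One way to read ratingResponses is to group the detail items by fee item code. The sketch below does that and also checks whether the items add up to the top-level billAmount; that equality is an assumption for illustration, not something this page states. Function names and sample amounts are hypothetical.

```python
import math
from collections import defaultdict
from typing import Any, Mapping


def bill_by_fee_item(rating_responses: Mapping[str, Any]) -> dict:
    """Sum billAmount per feeItemCode across ratingDetails."""
    totals = defaultdict(float)
    for item in rating_responses.get("ratingDetails", []):
        totals[item["feeItemCode"]] += item["billAmount"]
    return dict(totals)


def details_match_total(rating_responses: Mapping[str, Any], tolerance: float = 1e-9) -> bool:
    """Check the (assumed) invariant that detail items sum to the billed total."""
    detail_sum = sum(item["billAmount"] for item in rating_responses.get("ratingDetails", []))
    return math.isclose(detail_sum, rating_responses["billAmount"], abs_tol=tolerance)


# Invented sample payload for illustration.
sample_rating = {
    "billAmount": 0.75,
    "ratingDetails": [
        {"feeItemCode": "prompt", "billAmount": 0.25},
        {"feeItemCode": "completion", "billAmount": 0.5},
    ],
}
per_item = bill_by_fee_item(sample_rating)
```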

requestRetryTimes integer

Number of request retries.

finalRetry boolean

Whether this is the final retry.

cURL
curl "https://zenmux.ai/api/v1/generation?id=<generation_id>" \
  -H "Authorization: Bearer $ZENMUX_API_KEY"