List Models

GET https://zenmux.ai/api/vertex-ai/v1beta/models

This endpoint retrieves information about the available models on the platform that are compatible with the Google Vertex AI API protocol.

Request params

This endpoint does not require any request parameters.

Returns

Returns a JSON object containing information about all available models.

models `array`

An array of model objects, one for each available model.

models object

name `string`

The model’s unique identifier, in the format `<provider>/<model_name>`, for example `google/gemini-2.5-flash-lite`.

displayName `string`

The model’s display name, used for UI display.

description `string`

A description of the model.

inputTokenLimit `integer`

The maximum number of input tokens allowed for this model.

outputTokenLimit `integer`

The maximum number of output tokens the model can generate.

thinking `boolean`

Whether the model supports reasoning: `true` if supported, `false` otherwise.

inputModalities `array`

The input modalities supported by the model. Possible values include:

"text" - Text input
"image" - Image input
"video" - Video input
"audio" - Audio input
"file" - File input

outputModalities `array`

The output modalities supported by the model. Possible values include:

"text" - Text output
"image" - Image output
"video" - Video output
"audio" - Audio output
"file" - File output
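The capability fields above (`thinking`, `inputModalities`, `outputModalities`) lend themselves to client-side filtering. The following is a minimal sketch, assuming the `models` array has already been parsed into Python dictionaries. The helper `filter_models` and the second sample entry (`example/text-only`) are hypothetical and not part of the API.

```python
from typing import Any, Dict, List, Optional

Model = Dict[str, Any]


def filter_models(
    models: List[Model],
    *,
    input_modality: Optional[str] = None,
    output_modality: Optional[str] = None,
    thinking: Optional[bool] = None,
) -> List[Model]:
    """Return the models that satisfy every criterion that is not None."""
    selected = []
    for model in models:
        if input_modality is not None and input_modality not in model.get("inputModalities", []):
            continue
        if output_modality is not None and output_modality not in model.get("outputModalities", []):
            continue
        if thinking is not None and bool(model.get("thinking")) != thinking:
            continue
        selected.append(model)
    return selected


# Hypothetical sample data shaped like the documented model object.
SAMPLE_MODELS = [
    {
        "name": "google/gemini-2.5-flash-lite",
        "thinking": True,
        "inputModalities": ["file", "image", "text", "audio"],
        "outputModalities": ["text"],
    },
    {
        "name": "example/text-only",  # made-up entry for illustration
        "thinking": False,
        "inputModalities": ["text"],
        "outputModalities": ["text"],
    },
]

image_reasoners = filter_models(SAMPLE_MODELS, input_modality="image", thinking=True)
print([m["name"] for m in image_reasoners])  # ['google/gemini-2.5-flash-lite']
```

Missing modality arrays are treated as empty, so a model that omits a field simply fails any filter on that field rather than raising an error.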

pricings `object`

An object containing the price configurations for model usage, keyed by billing category.

pricings.prompt `array`

An array of price configurations for processing input text.

pricings.completion `array`

An array of price configurations for generated output text.

pricings.input_cache_read `array`

An array of price configurations for reading input data from cache.

pricings.input_cache_write_5_min `array`

An array of price configurations for writing to cache with a 5-minute retention period.

pricings.input_cache_write_1_h `array`

An array of price configurations for writing to cache with a 1-hour retention period.

pricings.input_cache_write `array`

An array of price configurations for writing to cache.

pricings.web_search `array`

An array of price configurations for invoking web search (optional; supported by some models).

pricings.internal_reasoning `array`

An array of price configurations for the model’s internal reasoning process (optional; supported by some advanced reasoning models). When the model enables an internal chain-of-thought or detailed reasoning process, additional charges apply.

pricings.video `array`

An array of price configurations for processing video output (optional; for models that support video understanding). Billed by video duration, resolution, or frame count.

pricings.image `array`

An array of price configurations for processing image output (optional; for models that support image understanding). Typically billed by image count, resolution, or pixel count.

pricings.audio `array`

An array of price configurations for processing audio output (optional; for models that support audio understanding). Billed by audio duration or processing volume.

pricings.audio_and_video `array`

An array of price configurations for generating video content with audio (optional; for models that support audio-video multimodal understanding). Applicable to scenarios that require analyzing both video frames and audio content. Note: there are two video generation scenarios—silent video uses pricings.video, while video with audio uses pricings.audio_and_video.
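Because several of the `pricings` keys are optional, client code should not assume that any particular category is present. The sketch below shows one defensive way to read them. The helper names `pricing_categories` and `first_price` are hypothetical, and the sample model is trimmed from the example response in this document.

```python
from typing import Any, Dict, List, Optional

Model = Dict[str, Any]


def pricing_categories(model: Model) -> List[str]:
    """List the pricing categories that have at least one price configuration."""
    pricings = model.get("pricings") or {}
    return sorted(key for key, entries in pricings.items() if entries)


def first_price(model: Model, category: str) -> Optional[Dict[str, Any]]:
    """Return the first price configuration for a category, or None if absent."""
    entries = (model.get("pricings") or {}).get(category) or []
    return entries[0] if entries else None


# Trimmed copy of the documented example model (conditions omitted).
SAMPLE_MODEL = {
    "name": "google/gemini-2.5-flash-lite",
    "pricings": {
        "completion": [{"value": 1, "unit": "perMTokens", "currency": "USD"}],
        "prompt": [{"value": 1, "unit": "perMTokens", "currency": "USD"}],
    },
}

print(pricing_categories(SAMPLE_MODEL))         # ['completion', 'prompt']
print(first_price(SAMPLE_MODEL, "web_search"))  # None
```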

Pricing item structure

Each pricing array in the `pricings` object (such as `prompt` or `completion`) contains one or more pricing configuration objects. Each pricing configuration object has the following fields:

value `number`

The effective price for the model after any discount. Free services show a value of 0.

unit `string`

The pricing unit. Possible values include:

"perMTokens" - Per million tokens
"perCount" - Per call
"perSecond" - Per second (for time-based billing scenarios such as audio/video)

currency `string`

The currency of the price. Always "USD" (US dollars).

conditions `object`

The conditions under which this price applies (optional). Commonly used for tiered pricing.

conditions.prompt_tokens `object`

A token-count condition for the input content provided by the user.

conditions.completion_tokens `object`

A token-count condition for tokens consumed by the model when generating the response.
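To illustrate how `value` and `unit` combine, here is a rough cost estimate for a single pricing configuration. It is a sketch under stated assumptions: the function `estimate_cost` is hypothetical, the linear formulas are inferred from the unit names rather than confirmed by the API, and `conditions` are ignored here.

```python
from typing import Any, Dict


def estimate_cost(pricing_item: Dict[str, Any], quantity: float) -> float:
    """Estimate the cost of `quantity` units of usage for one pricing item.

    `quantity` is a token count for "perMTokens", a call count for
    "perCount", and a duration in seconds for "perSecond".
    Assumption: cost scales linearly with usage in every unit.
    """
    unit = pricing_item["unit"]
    value = pricing_item["value"]
    if unit == "perMTokens":
        # The price is quoted per one million tokens.
        return value * quantity / 1_000_000
    if unit in ("perCount", "perSecond"):
        return value * quantity
    raise ValueError(f"unsupported pricing unit: {unit!r}")


# Pricing item taken from the example response (conditions omitted).
prompt_price = {"value": 1, "unit": "perMTokens", "currency": "USD"}

cost = estimate_cost(prompt_price, 250_000)
print(f"{cost} {prompt_price['currency']}")  # 0.25 USD
```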

Pricing condition structure

When a pricing configuration includes the conditions field, it defines the specific conditions under which the price takes effect. The condition objects for prompt_tokens and completion_tokens include the following fields:

unit `string`

The token measurement unit. Always "kTokens", meaning thousands of tokens (1 kToken = 1,000 tokens).

gte `number`

Minimum token count (inclusive). The actual token count must be ≥ this value.

lte `number`

Maximum token count (inclusive). The actual token count must be ≤ this value.

gt `number`

Minimum token count (exclusive). The actual token count must be > this value.

lt `number`

Maximum token count (exclusive). The actual token count must be < this value; null means no upper limit.
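The bounds above can be combined to pick the pricing tier that applies to a given request. The following sketch makes one assumption that the text does not state explicitly: bounds are expressed in the condition's own `unit`, so with "kTokens" a bound of 200 means 200,000 tokens. The helpers `condition_matches` and `select_pricing`, and the two-tier sample data, are hypothetical.

```python
import operator
from typing import Any, Dict, List, Optional

# Assumption: bounds are given in the condition's unit (1 kToken = 1,000 tokens).
TOKENS_PER_UNIT = {"kTokens": 1000}

# Map each documented bound to the comparison it describes.
COMPARATORS = {
    "gte": operator.ge,
    "lte": operator.le,
    "gt": operator.gt,
    "lt": operator.lt,
}


def condition_matches(condition: Dict[str, Any], token_count: int) -> bool:
    """Check a token count against one condition; missing or null bounds are skipped."""
    amount = token_count / TOKENS_PER_UNIT[condition.get("unit", "kTokens")]
    for key, compare in COMPARATORS.items():
        bound = condition.get(key)
        if bound is not None and not compare(amount, bound):
            return False
    return True


def select_pricing(
    entries: List[Dict[str, Any]], prompt_tokens: int, completion_tokens: int = 0
) -> Optional[Dict[str, Any]]:
    """Return the first pricing entry whose conditions all hold, or None."""
    usage = {"prompt_tokens": prompt_tokens, "completion_tokens": completion_tokens}
    for entry in entries:
        conditions = entry.get("conditions") or {}
        if all(
            condition_matches(conditions[name], count)
            for name, count in usage.items()
            if conditions.get(name)
        ):
            return entry
    return None


# Hypothetical two-tier price list: one price below 200k prompt tokens, another at or above.
TIERS = [
    {"value": 1, "unit": "perMTokens", "currency": "USD",
     "conditions": {"prompt_tokens": {"unit": "kTokens", "gte": 0, "lt": 200}}},
    {"value": 2, "unit": "perMTokens", "currency": "USD",
     "conditions": {"prompt_tokens": {"unit": "kTokens", "gte": 200, "lt": None}}},
]

print(select_pricing(TIERS, 150_000)["value"])  # 1
print(select_pricing(TIERS, 200_000)["value"])  # 2
```

An entry without a `conditions` field matches any usage, which fits the description of `conditions` as optional.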

Example response (JSON)

{
  "models": [
    {
      "name": "google/gemini-2.5-flash-lite",
      "displayName": "Google: Gemini 2.5 Flash Lite",
      "description": "Gemini 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance across common benchmarks compared to earlier Flash models. By default, \"thinking\" (i.e. multi-pass reasoning) is disabled to prioritize speed, but developers can enable it via the [Reasoning API parameter](https://openrouter.ai/docs/use-cases/reasoning-tokens) to selectively trade off cost for intelligence. ",
      "inputTokenLimit": 1048576,
      "outputTokenLimit": 65535,
      "thinking": true,
      "inputModalities": ["file", "image", "text", "audio"],
      "outputModalities": ["text"],
      "pricings": {
        "completion": [
          {
            "value": 1,
            "unit": "perMTokens",
            "currency": "USD",
            "conditions": {
              "prompt_tokens": {
                "unit": "kTokens",
                "gte": 0
              }
            }
          }
        ],
        "prompt": [
          {
            "value": 1,
            "unit": "perMTokens",
            "currency": "USD",
            "conditions": {
              "prompt_tokens": {
                "unit": "kTokens",
                "gte": 0
              }
            }
          }
        ]
      }
    }
  ]
}

cURL

curl https://zenmux.ai/api/vertex-ai/v1beta/models
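For readers who prefer Python, the same request can be sketched with the standard library alone. This assumes, as the cURL example suggests, that no authentication header is needed. The function names `build_request`, `parse_models`, and `list_models` are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, List

MODELS_URL = "https://zenmux.ai/api/vertex-ai/v1beta/models"


def build_request(url: str = MODELS_URL) -> urllib.request.Request:
    """Build the GET request; the endpoint takes no request parameters."""
    return urllib.request.Request(url, method="GET", headers={"Accept": "application/json"})


def parse_models(body: bytes) -> List[Dict[str, Any]]:
    """Extract the `models` array from a raw JSON response body."""
    return json.loads(body)["models"]


def list_models(url: str = MODELS_URL, timeout: float = 10.0) -> List[Dict[str, Any]]:
    """Send the request and return the parsed list of model objects."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return parse_models(response.read())
```

Calling `list_models()` performs the network request. Each returned dictionary then exposes the documented fields, such as `model["name"]` and `model["inputTokenLimit"]`.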
