Skip to content

Image Generation - Google Gemini Protocol

ZenMux currently supports two image generation protocols: Google Gemini and OpenAI Images.

All image generation models on ZenMux can be called through the Google Gemini protocol. OpenAI image generation models additionally support both the OpenAI Images protocol and the Google Gemini protocol.

This guide focuses on the Google Gemini protocol. If you want to call OpenAI image models through the OpenAI Images protocol, see OpenAI Images protocol image generation.

ZenMux supports invoking image generation models through the Vertex AI protocol. This guide explains how to use ZenMux to generate images and save them locally.

💡 About Banana Models

Banana is a series of image generation models from Google that can produce high-quality images from text prompts. You can use these models in ZenMux through the Vertex AI protocol.

Supported Models

ZenMux continuously updates its image generation models. Visit the ZenMux model catalog to view all currently supported image generation models that support the Google Gemini protocol.

📚 OpenAI Images Protocol Models

If you only want to view OpenAI image models that support the OpenAI Images protocol, visit the OpenAI image model list.

Reference Documentation

This guide only covers basic usage. For detailed configuration and advanced usage, refer to the official documentation below:

Usage

Python
from google import genai
from google.genai import types

client = genai.Client(
    api_key="$ZENMUX_API_KEY",  # Replace with your API Key
    vertexai=True,
    http_options=types.HttpOptions(
        api_version='v1',
        base_url='https://zenmux.ai/api/vertex-ai'
    ),
)

# Streaming: generate_content_stream
# Non-streaming: generate_content
prompt = "Create a picture of a nano banana dish in a fancy restaurant with a Gemini theme"

response = client.models.generate_content(
    model="google/gemini-3-pro-image-preview",
    contents=[prompt],
    config=types.GenerateContentConfig(
        response_modalities=["TEXT", "IMAGE"]
    )
)

# Process text and image responses
for part in response.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        # Save the generated image
        image = part.as_image()
        image.save("generated_image.png")
        print("Image saved as generated_image.png")
ts
const genai = require("@google/genai");

const client = new genai.GoogleGenAI({
  apiKey: "$ZENMUX_API_KEY", // Replace with your API Key
  vertexai: true,
  httpOptions: {
    baseUrl: "https://zenmux.ai/api/vertex-ai",
    apiVersion: "v1",
  },
});

// Streaming: generateContentStream
// Non-streaming: generateContent
const response = await client.models.generateContent({
  model: "google/gemini-3-pro-image-preview",
  contents:
    "Generate an image of the Eiffel tower with fireworks in the background",
  config: {
    responseModalities: ["TEXT", "IMAGE"], // Response modalities must be specified
    // For more configuration options, see the Vertex AI documentation
  },
});

console.log(response);

Non-Google Model Usage

For non-Google models such as openai/gpt-image-1.5, openai/gpt-image-2, and qwen/qwen-image-2.0, use the generate_images and edit_image APIs.

ZenMux internally translates Vertex AI protocol parameters into the OpenAI image generation API format. The mapping tables below help you understand how to use each feature.

Supported Parameters

generate_images Parameters

The following table maps the official OpenAI image generation parameters to the ZenMux Vertex AI protocol:

OpenAI ParameterVertex AI EquivalentTypeDescriptionSupported
promptprompt (passed directly via SDK)stringText description (required)
modelmodel (passed directly via SDK)stringModel name
nconfig.number_of_imagesnumberNumber of images to generate (1-10)
sizeconfig.http_options.extra_body.imageSizestringImage size (passthrough). Common values: 1024x1024, 1536x1024 (landscape), 1024x1536 (portrait), auto. Other sizes accepted by the underlying model are also supported — refer to the OpenAI docs and the passthrough note below.
qualityconfig.http_options.extra_body.qualitystringImage quality (passthrough). Common values: low / medium / high / auto. Any value supported by the underlying model can be used — refer to the OpenAI docs.
output_formatconfig.output_mime_typestringOutput format: image/png, image/jpeg, image/webp
output_compressionconfig.output_compression_qualitynumberCompression quality (0-100), only valid for webp/jpeg
backgroundstringBackground transparency setting
moderationstringContent moderation level
stylestringDALL-E 3 style parameter
response_formatstringNot applicable (always returns base64)

edit_image Parameters

The following table maps the official OpenAI image editing parameters to the ZenMux Vertex AI protocol:

OpenAI ParameterVertex AI EquivalentTypeDescriptionSupported
promptprompt (passed directly via SDK)stringEdit description (required)
modelmodel (passed directly via SDK)stringModel name
imagereference_images (with referenceType other than MASK)file/base64Reference images, multiple supported
maskreference_images (with referenceType = REFERENCE_TYPE_MASK)file/base64Mask image; transparent areas are the editable region
nconfig.number_of_imagesnumberNumber of images (1-10)
sizeconfig.http_options.extra_body.imageSizestringImage size (passthrough). Common values: 1024x1024, 1536x1024, 1024x1536, auto. Other sizes accepted by the underlying model are also supported — refer to the OpenAI docs.
qualityconfig.http_options.extra_body.qualitystringImage quality (passthrough). Common values: low / medium / high / auto. Any value supported by the underlying model can be used — refer to the OpenAI docs.
output_formatconfig.output_mime_typestringOutput format
output_compressionconfig.output_compression_qualitynumberCompression quality
backgroundNot supported

💡 About Parameter Passing

imageSize and quality are passthrough parameters — ZenMux forwards them as-is to the underlying OpenAI-compatible image generation API without validation or transformation. The values shown in the tables above are the most common options; for the complete list of supported values and behavioral details, refer to the official OpenAI documentation for image generation and image editing, and configure them based on your needs.

Pass these OpenAI-specific parameters through httpOptions.extraBody to ensure consistent behavior across the Python and TypeScript SDKs. Standard Vertex AI fields such as numberOfImages, outputMimeType, and outputCompressionQuality can be set directly at the top level of config.

Generating Images

Python
from google import genai
from google.genai import types

client = genai.Client(
    api_key="$ZENMUX_API_KEY",  # Replace with your API Key
    vertexai=True,
    http_options=types.HttpOptions(
        api_version='v1',
        base_url='https://zenmux.ai/api/vertex-ai'
    ),
)

response = client.models.generate_images(
    model="openai/gpt-image-2",  # or qwen/qwen-image-2.0
    prompt="A cat and a dog"
)

# Save the generated images
for i, img in enumerate(response.generated_images):
    img.image.save(f"generated_{i}.png")
    print(f"Image saved as generated_{i}.png")
ts
const { GoogleGenAI } = require("@google/genai");

const client = new GoogleGenAI({
  apiKey: "$ZENMUX_API_KEY", // Replace with your API Key
  vertexai: true,
  httpOptions: {
    baseUrl: "https://zenmux.ai/api/vertex-ai",
    apiVersion: "v1",
  },
});

const response = await client.models.generateImages({
  model: "openai/gpt-image-2", // or qwen/qwen-image-2.0
  prompt: "A cat and a dog",
});

console.log(response);

Advanced Parameter Examples

Generating High-Resolution Images

Pass imageSize and quality through httpOptions.extraBody to generate high-quality, large-size images:

Python
from google import genai
from google.genai import types

client = genai.Client(
    api_key="$ZENMUX_API_KEY",
    vertexai=True,
    http_options=types.HttpOptions(
        api_version='v1',
        base_url='https://zenmux.ai/api/vertex-ai'
    ),
)

response = client.models.generate_images(
    model="openai/gpt-image-2",
    prompt="A futuristic cityscape at sunset, ultra detailed",
    config=types.GenerateImagesConfig(
        number_of_images=1,                         # Generate 1 image
        http_options=types.HttpOptions(
            extra_body={
                "imageSize": "1536x1024",           # Landscape high resolution
                "quality": "high",                  # High quality
            }
        ),
    )
)

for i, img in enumerate(response.generated_images):
    img.image.save(f"hd_{i}.png")
    print(f"HD image saved as hd_{i}.png")
ts
const { GoogleGenAI } = require("@google/genai");

const client = new GoogleGenAI({
  apiKey: "$ZENMUX_API_KEY",
  vertexai: true,
  httpOptions: {
    baseUrl: "https://zenmux.ai/api/vertex-ai",
    apiVersion: "v1",
  },
});

const response = await client.models.generateImages({
  model: "openai/gpt-image-2",
  prompt: "A futuristic cityscape at sunset, ultra detailed",
  config: {
    numberOfImages: 1,                  // Generate 1 image
    httpOptions: {
      extraBody: {
        imageSize: "1536x1024",         // Landscape high resolution
        quality: "high",                // High quality
      },
    },
  },
});

for (const img of response.generatedImages) {
  console.log("Image generated:", img.image.imageBytes.length, "bytes");
}

Generating 4K Images

openai/gpt-image-2 supports custom sizes, including 4K resolution:

Python
from google import genai
from google.genai import types

client = genai.Client(
    api_key="$ZENMUX_API_KEY",
    vertexai=True,
    http_options=types.HttpOptions(
        api_version='v1',
        base_url='https://zenmux.ai/api/vertex-ai'
    ),
)

# Generate a 4K landscape image (3840x2160)
response = client.models.generate_images(
    model="openai/gpt-image-2",
    prompt="A breathtaking mountain landscape at golden hour, photorealistic",
    config=types.GenerateImagesConfig(
        number_of_images=1,
        http_options=types.HttpOptions(
            extra_body={
                "imageSize": "3840x2160",   # 4K UHD resolution
                "quality": "high",          # High quality
            }
        ),
    )
)

for i, img in enumerate(response.generated_images):
    img.image.save(f"4k_{i}.png")
    print(f"4K image saved as 4k_{i}.png")
ts
const { GoogleGenAI } = require("@google/genai");

const client = new GoogleGenAI({
  apiKey: "$ZENMUX_API_KEY",
  vertexai: true,
  httpOptions: {
    baseUrl: "https://zenmux.ai/api/vertex-ai",
    apiVersion: "v1",
  },
});

// Generate a 4K landscape image (3840x2160)
const response = await client.models.generateImages({
  model: "openai/gpt-image-2",
  prompt: "A breathtaking mountain landscape at golden hour, photorealistic",
  config: {
    numberOfImages: 1,
    httpOptions: {
      extraBody: {
        imageSize: "3840x2160",      // 4K UHD resolution
        quality: "high",             // High quality
      },
    },
  },
});

for (const img of response.generatedImages) {
  console.log("4K image generated:", img.image.imageBytes.length, "bytes");
}

💡 Size Options

Preset sizes: 1024x1024 (square), 1536x1024 (landscape), 1024x1536 (portrait), auto

Custom sizes (only openai/gpt-image-2): any custom size is supported, subject to the following constraints:

  • Width and height must each be a multiple of 16
  • Maximum side length is 3840px
  • Aspect ratio must not exceed 3:1
  • Total pixels must fall between 655,360 and 8,294,400

Common custom sizes for reference:

SizeResolutionUse Case
1920x10801080pBlog covers, web banners
1080x19201080p portraitPhone wallpapers, social media stories
2560x14402K QHDDesktop wallpapers
3840x21604K UHDHigh-resolution posters, large displays

Different sizes affect generation time and token cost — larger sizes are more expensive. Sizes above 2560x1440 are experimental and may produce inconsistent results.

Specifying Output Format and Compression Quality

Use output_mime_type and output_compression_quality to control the output format:

Python
response = client.models.generate_images(
    model="openai/gpt-image-2",
    prompt="A minimalist logo design",
    config=types.GenerateImagesConfig(
        number_of_images=2,                 # Generate 2 images
        output_mime_type="image/webp",      # WebP format, smaller file size
        output_compression_quality=80,      # 80% compression quality
    )
)

for i, img in enumerate(response.generated_images):
    img.image.save(f"logo_{i}.webp")
ts
const response = await client.models.generateImages({
  model: "openai/gpt-image-2",
  prompt: "A minimalist logo design",
  config: {
    numberOfImages: 2,                    // Generate 2 images
    outputMimeType: "image/webp",         // WebP format, smaller file size
    outputCompressionQuality: 80,         // 80% compression quality
  },
});

💡 Output Formats

  • image/png: lossless format, ideal when high fidelity is needed (default)
  • image/webp: smaller file size, ideal for web display
  • image/jpeg: general-purpose lossy format; pair with output_compression_quality to control quality

Editing Images

To modify an existing image, use the edit_image API. The example below uses the ZenMux logo and transforms it into Chinese paper-cut style.

💡 Prompt Tips

The edit_image model is sensitive to the prompt. If the prompt is too vague (for example, "Transform this logo into..."), the model may freely generate from the style description and ignore the input image entirely. We recommend explicitly instructing the model to preserve the subject/composition of the original image and only change the style.

Python
import urllib.request

from google import genai
from google.genai import types

client = genai.Client(
    api_key="$ZENMUX_API_KEY",  # Replace with your API Key
    vertexai=True,
    http_options=types.HttpOptions(
        api_version='v1',
        base_url='https://zenmux.ai/api/vertex-ai'
    ),
)

# Load the image to edit (using the ZenMux logo as an example)
LOGO_URL = "https://cdn.marmot-cloud.com/storage/zenmux/2026/04/28/74mUf4t/Log-Light.png"
logo_bytes = urllib.request.urlopen(LOGO_URL).read()
original_image = types.Image(image_bytes=logo_bytes, mime_type="image/png")

# Edit the original image — restyle as paper-cut
edit_response = client.models.edit_image(
    model="openai/gpt-image-2",
    prompt=(
        "Keep the exact same subject, silhouette and pose from the input image. "
        "Re-render it in Chinese paper-cut art style: traditional red color, "
        "intricate hollow patterns, plain white background. "
        "Do not change the subject or composition; only restyle it as paper-cut."
    ),
    reference_images=[
        types.RawReferenceImage(
            reference_id=1,
            reference_image=original_image
        )
    ]
)

# Save the edited image
for i, img in enumerate(edit_response.generated_images):
    img.image.save(f"edited_{i}.png")
    print(f"Edited image saved as edited_{i}.png")
ts
const { GoogleGenAI, RawReferenceImage } = require("@google/genai");

const client = new GoogleGenAI({
  apiKey: "$ZENMUX_API_KEY", // Replace with your API Key
  vertexai: true,
  httpOptions: {
    baseUrl: "https://zenmux.ai/api/vertex-ai",
    apiVersion: "v1",
  },
});

// Load the image to edit (using the ZenMux logo as an example)
const LOGO_URL =
  "https://cdn.marmot-cloud.com/storage/zenmux/2026/04/28/74mUf4t/Log-Light.png";
const logoBytes = Buffer.from(await (await fetch(LOGO_URL)).arrayBuffer());
const originalImage = {
  imageBytes: logoBytes,
  mimeType: "image/png",
};

// Edit the original image — restyle as paper-cut
const editResponse = await client.models.editImage({
  model: "openai/gpt-image-2",
  prompt:
    "Keep the exact same subject, silhouette and pose from the input image. " +
    "Re-render it in Chinese paper-cut art style: traditional red color, " +
    "intricate hollow patterns, plain white background. " +
    "Do not change the subject or composition; only restyle it as paper-cut.",
  referenceImages: [
    new RawReferenceImage({
      referenceId: 1,
      referenceImage: originalImage,
    }),
  ],
});

console.log(editResponse);

Editing with a Mask

Use a mask to specify which region of the image should be edited — the transparent areas of the mask define the editable region:

Python
import urllib.request

from google import genai
from google.genai import types

client = genai.Client(
    api_key="$ZENMUX_API_KEY",
    vertexai=True,
    http_options=types.HttpOptions(
        api_version='v1',
        base_url='https://zenmux.ai/api/vertex-ai'
    ),
)

# Load the original image and mask (using the ZenMux Logo as an example)
LOGO_URL = "https://cdn.marmot-cloud.com/storage/zenmux/2026/04/28/74mUf4t/Log-Light.png"
logo_bytes = urllib.request.urlopen(LOGO_URL).read()

original_image = types.Image(image_bytes=logo_bytes, mime_type="image/png")
mask_image = types.Image(image_bytes=logo_bytes, mime_type="image/png")  # Transparent areas are the editable region

edit_response = client.models.edit_image(
    model="openai/gpt-image-2",
    prompt="Replace the background with a beach scene",
    reference_images=[
        types.RawReferenceImage(
            reference_id=1,
            reference_image=original_image
        ),
        types.MaskReferenceImage(
            reference_id=2,
            reference_image=mask_image,
            config=types.MaskReferenceConfig(
                mask_mode="MASK_MODE_USER_PROVIDED",
            )
        )
    ],
    config=types.EditImageConfig(
        number_of_images=1,
        output_mime_type="image/png",
        http_options=types.HttpOptions(
            extra_body={
                "imageSize": "1024x1024",
                "quality": "high",
            }
        ),
    )
)

for i, img in enumerate(edit_response.generated_images):
    img.image.save(f"masked_edit_{i}.png")
ts
const { GoogleGenAI, RawReferenceImage, MaskReferenceImage } = require("@google/genai");

const client = new GoogleGenAI({
  apiKey: "$ZENMUX_API_KEY",
  vertexai: true,
  httpOptions: {
    baseUrl: "https://zenmux.ai/api/vertex-ai",
    apiVersion: "v1",
  },
});

// Load the original image and mask (using the ZenMux Logo as an example)
const LOGO_URL =
  "https://cdn.marmot-cloud.com/storage/zenmux/2026/04/28/74mUf4t/Log-Light.png";
const logoBytes = Buffer.from(await (await fetch(LOGO_URL)).arrayBuffer());

const originalImage = {
  imageBytes: logoBytes,
  mimeType: "image/png",
};
const maskImage = {
  imageBytes: logoBytes, // Transparent areas are the editable region
  mimeType: "image/png",
};

const editResponse = await client.models.editImage({
  model: "openai/gpt-image-2",
  prompt: "Replace the background with a beach scene",
  referenceImages: [
    new RawReferenceImage({
      referenceId: 1,
      referenceImage: originalImage,
    }),
    new MaskReferenceImage({
      referenceId: 2,
      referenceImage: maskImage,
      config: {
        maskMode: "MASK_MODE_USER_PROVIDED",
      },
    }),
  ],
  config: {
    numberOfImages: 1,
    outputMimeType: "image/png",
    httpOptions: {
      extraBody: {
        imageSize: "1024x1024",
        quality: "high",
      },
    },
  },
});

console.log(editResponse);

💡 API Differences

  • Google Gemini models use the generate_content API with response_modalities: ["TEXT", "IMAGE"]; responses contain both text and images.
  • Non-Google models use the generate_images / edit_image APIs, which return image objects directly and support image editing.

Configuration

Required Parameters

  • api_key: Your ZenMux API key
  • vertexai: Must be set to true to enable the Vertex AI protocol
  • base_url: ZenMux Vertex AI endpoint https://zenmux.ai/api/vertex-ai
  • responseModalities: Response modalities; image generation must include ["TEXT", "IMAGE"]

Invocation Modes

ZenMux supports two invocation modes:

  • Streaming (generate_content_stream / generateContentStream): suitable for scenarios that need real-time feedback
  • Non-streaming (generate_content / generateContent): waits for the full response and returns it at once

⚠️ Response Handling

Image generation models may return both text and images in the same response. Iterate over response.parts to handle every content part.

Best Practices

  1. Prompt optimization: use clear, specific descriptions for better generation results
  2. Error handling: add exception handling to gracefully handle API failures
  3. Saving images: the Python SDK provides a convenient as_image() method to convert the response into a PIL Image object
  4. Model selection: choose the right model for your needs — free models are good for testing, paid models offer higher quality