WorldRouter


Generate Images with Gemini

The Gemini image models are reachable through Google’s native generateContent route. WorldRouter forwards this surface 1:1, so existing Google AI Studio code keeps working unchanged.

Tip:

Prefer the OpenAI /v1/images/generations shape (gpt-image-2, Grok Imagine image)? See OpenAI-compatible image generation.

What you can do

  • Generate images with Gemini image models (gemini-2.5-flash-image, gemini-3-pro-image-preview, gemini-3.1-flash-image-preview) through POST /v1beta/models/{model}:generateContent.

Before you start

The same API key and base URL work for every WorldRouter image surface. Only the request body and response shape differ.

Warning:

This is not the OpenAI /v1/images/generations shape. Send a Gemini contents body and read images out of candidates[].content.parts[].inlineData.

Tip:

Start from Quickstart if you still need an API key, the base URL, or a connection test.

Quick start flow

  1. Send a generateContent request to the Gemini image model you want to use.
  2. Save the JSON response.
  3. Extract the Base64 inlineData field and decode it into a real image file.
  4. Verify where the output file was saved, then open it locally.

Endpoint

POST https://inference-api-pre-d80ca3.worldrouter.ai/v1beta/models/{model}:generateContent
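The {model} segment of the path is replaced with the model name you want to call. A minimal sketch of building that URL in Python follows; the helper name build_endpoint is hypothetical, and BASE_URL is simply the host shown in the endpoint above.

```python
from urllib.parse import quote

# Host copied from the endpoint above.
BASE_URL = "https://inference-api-pre-d80ca3.worldrouter.ai"


def build_endpoint(model: str, base_url: str = BASE_URL) -> str:
    """Return the generateContent URL for a model name (hypothetical helper)."""
    # Escape the model name, but keep the literal ':' before the method name.
    safe_model = quote(model, safe="")
    return f"{base_url.rstrip('/')}/v1beta/models/{safe_model}:generateContent"


print(build_endpoint("gemini-2.5-flash-image"))
```

Keeping the URL construction in one place makes it harder to accidentally send a percent-encoded placeholder instead of a real model name.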

Per-model size config

Pass image-output preferences inside generationConfig.imageConfig. The supported keys differ by model:

Model | Size config
gemini-3-pro-image-preview | imageConfig.imageSize controls output resolution.
gemini-3.1-flash-image-preview | imageConfig.imageSize accepts 0.5K, 1K, 2K, 4K.
gemini-2.5-flash-image | imageConfig.aspectRatio only. Do not pass imageSize; output resolution follows the model’s default rule.

responseModalities must include IMAGE for the model to return an image.
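The per-model rules above can be encoded in a small helper so that an unsupported key never reaches the request body. This is a sketch only: build_generation_config and ASPECT_RATIO_ONLY are hypothetical names, and the rules are limited to what this page states.

```python
# Models that accept aspectRatio but not imageSize, per the size config above.
ASPECT_RATIO_ONLY = {"gemini-2.5-flash-image"}


def build_generation_config(model, aspect_ratio=None, image_size=None):
    """Build a generationConfig dict that follows the per-model rules."""
    if image_size is not None and model in ASPECT_RATIO_ONLY:
        raise ValueError(f"{model} does not accept imageSize; pass aspectRatio only.")

    image_config = {}
    if aspect_ratio is not None:
        image_config["aspectRatio"] = aspect_ratio
    if image_size is not None:
        image_config["imageSize"] = image_size

    # IMAGE must be listed, otherwise no image is returned.
    config = {"responseModalities": ["TEXT", "IMAGE"]}
    if image_config:
        config["imageConfig"] = image_config
    return config


print(build_generation_config("gemini-3.1-flash-image-preview", "1:1", "2K"))
```

Raising early on a disallowed imageSize is a design choice for this sketch; silently dropping the key would also work but hides the mistake from the caller.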

Examples

gemini-3.1-flash-image-preview — request a 2K square image:

curl
curl -X POST "https://inference-api-pre-d80ca3.worldrouter.ai/v1beta/models/gemini-3.1-flash-image-preview:generateContent" \
-H "Authorization: Bearer your_api_key" \
-H "Content-Type: application/json" \
-d '{
  "contents": [{
    "role": "user",
    "parts": [{
      "text": "Create a clean 2K square image of a red five-point star centered on a white background."
    }]
  }],
  "generationConfig": {
    "responseModalities": ["TEXT", "IMAGE"],
    "imageConfig": {
      "aspectRatio": "1:1",
      "imageSize": "2K"
    }
  }
}' \
-o gemini-31-image-response.json

gemini-2.5-flash-image (aspectRatio only, no imageSize):

curl
curl -X POST "https://inference-api-pre-d80ca3.worldrouter.ai/v1beta/models/gemini-2.5-flash-image:generateContent" \
-H "Authorization: Bearer your_api_key" \
-H "Content-Type: application/json" \
-d '{
  "contents": [{
    "role": "user",
    "parts": [{
      "text": "Create a clean square image of a red five-point star centered on a white background."
    }]
  }],
  "generationConfig": {
    "responseModalities": ["TEXT", "IMAGE"],
    "imageConfig": {
      "aspectRatio": "1:1"
    }
  }
}' \
-o gemini-25-image-response.json

The two examples use different filenames to avoid overwriting earlier responses.
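For code that should call the endpoint directly instead of shelling out to curl, the same request can be sketched with Python's standard library. The function names build_request and generate and the 120-second timeout are assumptions made for illustration; the URL, headers, and body mirror the curl examples above.

```python
import json
import urllib.request

BASE_URL = "https://inference-api-pre-d80ca3.worldrouter.ai"


def build_request(model, prompt, api_key, image_config):
    """Build (but do not send) a generateContent request."""
    body = {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {
            "responseModalities": ["TEXT", "IMAGE"],
            "imageConfig": image_config,
        },
    }
    return urllib.request.Request(
        f"{BASE_URL}/v1beta/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(request, timeout=120):
    """Send the request and return the parsed JSON response.

    The timeout value is an assumption, not a documented limit.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


# Usage (performs a real network call, so it is left commented out):
# req = build_request("gemini-2.5-flash-image",
#                     "Create a clean square image of a red five-point star.",
#                     "your_api_key", {"aspectRatio": "1:1"})
# response = generate(req)
```

The returned dictionary has the same shape as the JSON file that curl writes with -o, so it can be decoded without saving it to disk first.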

Response

Each generated image is delivered as base64-encoded inlineData inside a candidate.

If the response contains a very long unreadable string, that usually means you are looking at Base64-encoded image data, not garbage text.

Example response. Actual fields may vary slightly by model or upstream behavior.

200 OK
{
"candidates": [
  {
    "content": {
      "parts": [
        {
          "inlineData": {
            "mimeType": "image/jpeg",
            "data": "<base64 image>"
          }
        }
      ],
      "role": "model"
    },
    "finishReason": "STOP",
    "index": 0
  }
],
"usageMetadata": {
  "promptTokenCount": 22,
  "candidatesTokenCount": 1203,
  "totalTokenCount": 1326
},
"modelVersion": "gemini-3.1-flash-image-preview",
"responseId": "response_xxx"
}

If your code already has a response object, and the first part is the image, decode it like this:

python
import base64

part = response["candidates"][0]["content"]["parts"][0]
with open("output.jpg", "wb") as f:
    f.write(base64.b64decode(part["inlineData"]["data"]))

Decode the saved Gemini response

The example below reads gemini-25-image-response.json. For gemini-3.1-flash-image-preview, change the filename to gemini-31-image-response.json.

python
import base64
import json

with open("gemini-25-image-response.json", "r", encoding="utf-8") as f:
    payload = json.load(f)

parts = payload["candidates"][0]["content"]["parts"]
image_part = next((part for part in parts if "inlineData" in part), None)

if image_part is None:
    raise ValueError("No image part with inlineData was found in the response.")

output_path = "output.jpg"

with open(output_path, "wb") as f:
    f.write(base64.b64decode(image_part["inlineData"]["data"]))

print(f"Saved image to {output_path}")

Default output location: current working directory. To save elsewhere, change output_path to an absolute path such as /tmp/output.jpg.

Verify:

bash
ls -lh output.jpg

macOS: open output.jpg. Other systems: open the file with a local image viewer.
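The decode script above always writes output.jpg, while the response reports the actual format in inlineData.mimeType and may contain more than one part. The sketch below picks a file extension from the MIME type and saves every image part it finds. The names decode_images, save_images, and EXTENSIONS are hypothetical, and the MIME mapping is a small illustrative subset, not a list of formats the API is known to return.

```python
import base64

# Minimal mapping for illustration; extend it if other MIME types appear.
EXTENSIONS = {"image/jpeg": ".jpg", "image/png": ".png", "image/webp": ".webp"}


def decode_images(payload):
    """Return a list of (extension, raw_bytes) for every inlineData part."""
    images = []
    for candidate in payload.get("candidates", []):
        parts = candidate.get("content", {}).get("parts", [])
        for part in parts:
            inline = part.get("inlineData")
            if inline is None:
                continue  # text parts carry no image data
            ext = EXTENSIONS.get(inline.get("mimeType"), ".bin")
            images.append((ext, base64.b64decode(inline["data"])))
    return images


def save_images(payload, stem="output"):
    """Write each decoded image to <stem>-<index><ext> and return the paths."""
    paths = []
    for index, (ext, data) in enumerate(decode_images(payload)):
        path = f"{stem}-{index}{ext}"
        with open(path, "wb") as f:
            f.write(data)
        paths.append(path)
    return paths
```

Calling save_images(payload) on the dictionary loaded from a saved response file writes the images to the current working directory, using the same default location as the script above.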

Tip:

Image responses are billed by token, not by pixel. The usage / usageMetadata block reports both text and image tokens — image tokens dominate the bill on the Gemini preview models.
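To keep an eye on cost, the token counts can be read from the usageMetadata block of each response. A minimal sketch, assuming only the three fields shown in the example response; summarize_usage is a hypothetical helper, and any per-modality breakdown the upstream may return is not handled here.

```python
def summarize_usage(payload):
    """Return the token counts reported in usageMetadata (missing fields become 0)."""
    usage = payload.get("usageMetadata", {})
    return {
        "prompt": usage.get("promptTokenCount", 0),
        "candidates": usage.get("candidatesTokenCount", 0),
        "total": usage.get("totalTokenCount", 0),
    }


# Values copied from the example response on this page.
example = {
    "usageMetadata": {
        "promptTokenCount": 22,
        "candidatesTokenCount": 1203,
        "totalTokenCount": 1326,
    }
}
print(summarize_usage(example))
```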

Error codes

Image-generation requests share the same error codes as the rest of the WorldRouter API. See Error codes in the API reference for the full table.
