Generate Images with Gemini
The Gemini image models are reachable through Google’s native generateContent route. WorldRouter forwards this surface 1:1, so existing Google AI Studio code keeps working unchanged.
Prefer the OpenAI /v1/images/generations shape (gpt-image-2, Grok Imagine
image)? See OpenAI-compatible image
generation.
What you can do
- Generate images with Gemini image models (gemini-2.5-flash-image, gemini-3-pro-image-preview, gemini-3.1-flash-image-preview) through POST /v1beta/models/{model}:generateContent.
Before you start
The same API key and base URL work for every WorldRouter image surface. Only the request body and response shape differ.
This is not the OpenAI /v1/images/generations shape. Send a Gemini
contents body and read images out of
candidates[].content.parts[].inlineData.
Start from Quickstart if you still need an API key, the base URL, or a connection test.
Quick start flow
1. Send a generateContent request to the Gemini image model you want to use.
2. Save the JSON response.
3. Extract the Base64 inlineData field and decode it into a real image file.
4. Verify where the output file was saved, then open it locally.
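The four steps above can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the function names `build_request`, `save_first_image`, and `generate_image` are hypothetical, the base URL and Bearer header are copied from the curl examples on this page, and the API key is whatever your account provides.

```python
import base64
import json
import urllib.request

# Base URL copied from the curl examples on this page.
BASE_URL = "https://inference-api-pre-d80ca3.worldrouter.ai"


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Step 1: prepare a generateContent POST for the chosen image model."""
    body = {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
        "generationConfig": {"responseModalities": ["TEXT", "IMAGE"]},
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/v1beta/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_first_image(payload: dict, output_path: str) -> str:
    """Steps 3 and 4: decode the first inlineData part and return its path."""
    parts = payload["candidates"][0]["content"]["parts"]
    image_part = next((p for p in parts if "inlineData" in p), None)
    if image_part is None:
        raise ValueError("No image part with inlineData was found in the response.")
    with open(output_path, "wb") as f:
        f.write(base64.b64decode(image_part["inlineData"]["data"]))
    return output_path


def generate_image(model: str, prompt: str, api_key: str,
                   response_path: str, output_path: str) -> str:
    """Run all four steps. This performs a real network call."""
    with urllib.request.urlopen(build_request(model, prompt, api_key)) as resp:
        raw = resp.read()
    with open(response_path, "wb") as f:  # Step 2: keep the raw JSON response
        f.write(raw)
    return save_first_image(json.loads(raw), output_path)
```

Calling `generate_image("gemini-2.5-flash-image", "a red star", api_key, "response.json", "output.jpg")` would send the request, save the JSON, and write the decoded image; the two helper functions can be exercised without network access.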
Endpoint
POST https://inference-api-pre-d80ca3.worldrouter.ai/v1beta/models/{model}:generateContent
Per-model size config
Pass image-output preferences inside generationConfig.imageConfig. The supported keys differ by model:
| Model | Size config |
|---|---|
| gemini-3-pro-image-preview | imageConfig.imageSize controls output resolution. |
| gemini-3.1-flash-image-preview | imageConfig.imageSize accepts 0.5K, 1K, 2K, 4K. |
| gemini-2.5-flash-image | imageConfig.aspectRatio only. Do not pass imageSize; output resolution follows the model’s default rule. |
responseModalities must include IMAGE for the model to return an image.
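The per-model rules above can be encoded in a small helper that builds generationConfig. This is a sketch under stated assumptions: `build_generation_config` and `IMAGE_SIZE_VALUES` are hypothetical names, only the three models listed on this page are covered, any model missing from the table is treated like gemini-2.5-flash-image, and sending aspectRatio together with imageSize is inferred from the gemini-3.1-flash-image-preview example rather than stated by the table.

```python
from typing import Optional

# Models whose imageConfig accepts imageSize, per the table on this page.
# None means the page does not list the allowed values for that model.
IMAGE_SIZE_VALUES = {
    "gemini-3-pro-image-preview": None,
    "gemini-3.1-flash-image-preview": {"0.5K", "1K", "2K", "4K"},
}


def build_generation_config(
    model: str,
    aspect_ratio: Optional[str] = None,
    image_size: Optional[str] = None,
) -> dict:
    """Return a generationConfig dict that respects the per-model size rules."""
    image_config = {}
    if aspect_ratio is not None:
        image_config["aspectRatio"] = aspect_ratio
    if image_size is not None:
        if model not in IMAGE_SIZE_VALUES:
            # For example gemini-2.5-flash-image, which takes aspectRatio only.
            raise ValueError(f"{model} does not accept imageConfig.imageSize")
        allowed = IMAGE_SIZE_VALUES[model]
        if allowed is not None and image_size not in allowed:
            raise ValueError(f"imageSize must be one of {sorted(allowed)}")
        image_config["imageSize"] = image_size
    # IMAGE must be present, otherwise no image is returned.
    config = {"responseModalities": ["TEXT", "IMAGE"]}
    if image_config:
        config["imageConfig"] = image_config
    return config
```

Rejecting an unsupported imageSize locally is a design choice of this sketch; it surfaces the mistake before a request is sent.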
Examples
gemini-3.1-flash-image-preview — request a 2K square image:
curl -X POST "https://inference-api-pre-d80ca3.worldrouter.ai/v1beta/models/gemini-3.1-flash-image-preview:generateContent" \
-H "Authorization: Bearer your_api_key" \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"role": "user",
"parts": [{
"text": "Create a clean 2K square image of a red five-point star centered on a white background."
}]
}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"],
"imageConfig": {
"aspectRatio": "1:1",
"imageSize": "2K"
}
}
}' \
-o gemini-31-image-response.json
gemini-2.5-flash-image (aspectRatio only, no imageSize):
curl -X POST "https://inference-api-pre-d80ca3.worldrouter.ai/v1beta/models/gemini-2.5-flash-image:generateContent" \
-H "Authorization: Bearer your_api_key" \
-H "Content-Type: application/json" \
-d '{
"contents": [{
"role": "user",
"parts": [{
"text": "Create a clean square image of a red five-point star centered on a white background."
}]
}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"],
"imageConfig": {
"aspectRatio": "1:1"
}
}
}' \
-o gemini-25-image-response.json
The two examples write to different filenames so that one response does not overwrite the other.
Response
Each generated image is delivered as base64-encoded inlineData inside a candidate.
If the response contains a very long unreadable string, that usually means you are looking at Base64-encoded image data, not garbage text.
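One quick way to confirm that such a string really is image data is to decode its first few characters and compare the result with well-known file signatures. The following is a minimal sketch, assuming the string is standard Base64 without embedded whitespace; `sniff_image_format` is a hypothetical helper, and only the JPEG and PNG signatures are checked.

```python
import base64


def sniff_image_format(b64_data: str) -> str:
    """Guess the image format from the first decoded bytes of a Base64 string."""
    # 16 Base64 characters decode to 12 bytes, enough for both signatures.
    head = base64.b64decode(b64_data[:16])
    if head.startswith(b"\xff\xd8\xff"):  # JPEG start-of-image marker
        return "jpeg"
    if head.startswith(b"\x89PNG\r\n\x1a\n"):  # 8-byte PNG signature
        return "png"
    return "unknown"
```

Passing the `data` string of an inlineData part to this function gives a cheap sanity check before the full payload is written to disk.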
Example response. Actual fields may vary slightly by model or upstream behavior.
{
"candidates": [
{
"content": {
"parts": [
{
"inlineData": {
"mimeType": "image/jpeg",
"data": "<base64 image>"
}
}
],
"role": "model"
},
"finishReason": "STOP",
"index": 0
}
],
"usageMetadata": {
"promptTokenCount": 22,
"candidatesTokenCount": 1203,
"totalTokenCount": 1326
},
"modelVersion": "gemini-3.1-flash-image-preview",
"responseId": "response_xxx"
}
If your code already has a response object and the first part is the image, decode it like this:
import base64

# Assumes the image is the first part of the first candidate.
part = response["candidates"][0]["content"]["parts"][0]
with open("output.jpg", "wb") as f:
    f.write(base64.b64decode(part["inlineData"]["data"]))
Decode the saved Gemini response
The example below reads gemini-25-image-response.json. For gemini-3.1-flash-image-preview, change the filename to gemini-31-image-response.json.
import base64
import json

# Load the JSON response saved by the curl command.
with open("gemini-25-image-response.json", "r", encoding="utf-8") as f:
    payload = json.load(f)

# Pick the first part that carries image data; text parts are skipped.
parts = payload["candidates"][0]["content"]["parts"]
image_part = next((part for part in parts if "inlineData" in part), None)
if image_part is None:
    raise ValueError("No image part with inlineData was found in the response.")

output_path = "output.jpg"
with open(output_path, "wb") as f:
    f.write(base64.b64decode(image_part["inlineData"]["data"]))
print(f"Saved image to {output_path}")
Default output location: current working directory. To save elsewhere, change output_path to an absolute path such as /tmp/output.jpg.
Verify:
ls -lh output.jpg
macOS: open output.jpg. Other systems: open the file with a local image viewer.
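The decode script always writes output.jpg, while the mimeType field of each inlineData part states what the bytes actually are. The sketch below derives the file extension from mimeType and saves every image part of the first candidate. `save_all_images`, the `EXTENSIONS` table, and the `image_N` naming scheme are hypothetical; `mimetypes.guess_extension` is from the Python standard library and is used only as a fallback.

```python
import base64
import mimetypes
import os

# Explicit extensions for common cases; anything else falls back to mimetypes.
EXTENSIONS = {"image/jpeg": ".jpg", "image/png": ".png", "image/webp": ".webp"}


def save_all_images(payload: dict, out_dir: str = ".") -> list:
    """Write every inlineData part of the first candidate into out_dir."""
    saved = []
    parts = payload["candidates"][0]["content"]["parts"]
    for part in parts:
        inline = part.get("inlineData")
        if inline is None:
            continue  # text parts carry no image bytes
        mime = inline.get("mimeType", "")
        ext = EXTENSIONS.get(mime) or mimetypes.guess_extension(mime) or ".bin"
        path = os.path.join(out_dir, f"image_{len(saved)}{ext}")
        with open(path, "wb") as f:
            f.write(base64.b64decode(inline["data"]))
        saved.append(path)
    return saved
```

The function returns the list of paths it wrote, so the caller can print or open them directly.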
Image responses are billed by token, not by pixel. The usage /
usageMetadata block reports both text and image tokens — image tokens
dominate the bill on the Gemini preview models.
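To see what a request consumed, read the usageMetadata block from the saved response. This is a minimal sketch that assumes the three counters shown in the example response on this page; `summarize_usage` is a hypothetical helper, and `unattributed` is simply the part of the total that the two named counters do not cover, not a field the API returns.

```python
def summarize_usage(payload: dict) -> dict:
    """Collect the token counters from a generateContent response."""
    usage = payload.get("usageMetadata", {})
    prompt = usage.get("promptTokenCount", 0)
    output = usage.get("candidatesTokenCount", 0)
    # Fall back to the sum if the response carries no total.
    total = usage.get("totalTokenCount", prompt + output)
    return {
        "prompt": prompt,
        "output": output,
        "total": total,
        "unattributed": total - prompt - output,
    }
```

For the example response on this page, the helper reports 22 prompt tokens, 1203 output tokens, a total of 1326, and 101 tokens not covered by the two named counters.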
Error codes
Image-generation requests share the same error codes as the rest of the WorldRouter API. See Error codes in the API reference for the full table.
See also
- OpenAI-compatible image generation: gpt-image-2 and Grok Imagine image via /v1/images/generations.
- Gemini image generation guide: upstream Google docs for the native generateContent flow.
- Seedance video guide: async video generation with /api/v3/contents/generations/tasks.
- API reference: chat-completions docs for text models.
- Models: full catalog with live pricing.
- Quickstart: API key, base URL, and first call.