Image Analysis & Generation

The RouteLLM API supports both analyzing images as input (vision) and generating images as output, all through the same unified /v1/chat/completions endpoint.

Image Analysis

Send images alongside text to any vision-capable model for description, classification, OCR, comparison, and more. Images can be provided as an HTTPS URL or base64-encoded data.

Supported Input Formats

PNG, JPEG, WebP, GIF
Images are automatically resized and processed by the API
Multiple images can be included in a single message
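
As an illustration of sending several images in one message, the sketch below builds a single user message with one text part and one image_url part per image, following the request shape shown in the next section. The helper name build_vision_message and the example URLs are hypothetical and not part of the API.

```python
from typing import Any


def build_vision_message(text: str, image_urls: list[str]) -> dict[str, Any]:
    """Build one user message: a text part followed by one image part per URL."""
    content: list[dict[str, Any]] = [{"type": "text", "text": text}]
    for url in image_urls:
        # Each image is its own content part of type "image_url".
        content.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": content}


message = build_vision_message(
    "Compare these two images",
    ["https://example.com/a.png", "https://example.com/b.jpg"],
)
print(len(message["content"]))  # 3 parts: one text part and two image parts
```

The resulting dictionary can be placed directly in the messages array of a request.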

Providing Images as Input

Image via HTTPS URL

{
  "model": "route-llm",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Describe the image" },
        {
          "type": "image_url",
          "image_url": {
            "url": "https://example.com/image.jpg"
          }
        }
      ]
    }
  ]
}

Image via Base64

{
  "model": "route-llm",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "What is in this image?" },
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA..."
          }
        }
      ]
    }
  ]
}

Note: Base64 images must use the data URI format: data:image/<format>;base64,<base64_string>
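
A minimal sketch of producing that data URI from raw image bytes with Python's standard library. The helper name to_data_uri is hypothetical, and the 8-byte PNG signature stands in for a real image file purely for demonstration.

```python
import base64


def to_data_uri(image_bytes: bytes, image_format: str = "png") -> str:
    """Encode raw image bytes as data:image/<format>;base64,<base64_string>."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:image/{image_format};base64,{encoded}"


# The 8-byte PNG file signature, used here in place of a full image.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(PNG_SIGNATURE)
print(uri)  # data:image/png;base64,iVBORw0KGgo=
```

In practice the bytes would come from reading an image file in binary mode, and the returned string goes in the url field of an image_url content part.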

Image Generation

The RouteLLM API generates images from text prompts using a range of image generation models. Image generation uses the same unified chat completions endpoint as text generation, with the additional modalities and image_config parameters.

Supported Models

Models are grouped by provider: FLUX, OpenAI, Google, and others, followed by multimodal LLMs.

FLUX

Model	Description
`flux_kontext`	Context-aware image generation
`flux_kontext_edit`	Context-aware image editing
`flux_pro_ultra`	Highest quality FLUX generation
`flux_pro`	Professional image generation
`flux_pro_canny`	Edge-guided image generation
`flux_pro_depth`	Depth-guided image generation
`flux2_pro`	High-quality, photorealistic image generation
`flux2`	Image generation

OpenAI

Model	Description
`gpt_image15`	Enhanced image generation
`gpt_image2`	Next-generation image generation
`gpt_image_edit`	AI-powered image editing
`gpt_image2_edit`	Next-generation image editing
`dalle`	High-quality creative image generation

Google

Model	Description
`imagen`	Photorealistic image generation
`nano_banana_pro`	High-quality image generation (Google DeepMind)
`nano_banana2`	Image generation (Google DeepMind)
`nano_banana`	Image generation (Google DeepMind)

Others

Model	Description
`ideogram`	Excellent for text rendering in images
`ideogram_character`	Character-focused image generation
`recraft`	Design and illustration focused
`recraft_svg`	Vector image generation
`midjourney`	Artistic image generation
`seedream`	Image generation
`dreamina`	Image generation
`magnific`	AI image upscaling and enhancement
`qwen_image_edit`	AI-powered image editing
`imagine_art`	Artistic image generation
`hunyuan_image`	Image generation
`grok_imagine_image`	Image generation by xAI

Multimodal LLMs

In addition to dedicated image generation models, the following multimodal LLMs support image generation when modalities: ["image"] is specified:

Provider	Models	Notes
OpenAI	`gpt-5.4`, `gpt-5.4-mini`, `gpt-5`, `gpt-4o`, and other GPT multimodal models	Uses the Responses API internally
Google	`gemini-3.1-pro`, `gemini-3-flash`, `gemini-2.5-pro`, `gemini-2.5-flash`	Routed to a Nano Banana model (see below)

Google Gemini → Image Model Routing

When a Gemini model is used with modalities: ["image"], the request is automatically routed to the appropriate Nano Banana model:

Gemini Model	Routed To
`gemini-3.*-pro` (e.g. `gemini-3-pro`, `gemini-3.1-pro`)	`nano_banana_pro`
`gemini-3.*-flash` (e.g. `gemini-3-flash`)	`nano_banana2`
All other Gemini models (e.g. `gemini-2.5-pro`, `gemini-2.5-flash`)	`nano_banana`
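
The routing table can be read as a small lookup. The sketch below is only an illustration of the documented mapping, not the API's actual implementation; the function name routed_image_model and the regular expressions are assumptions chosen so that the listed examples (gemini-3-pro, gemini-3.1-pro, gemini-3-flash) match.

```python
import re


def routed_image_model(gemini_model: str) -> str:
    """Return the Nano Banana model a Gemini model ID maps to, per the table."""
    # "gemini-3.*-pro" in the table covers both gemini-3-pro and gemini-3.1-pro.
    if re.match(r"gemini-3(\.\d+)?-pro", gemini_model):
        return "nano_banana_pro"
    if re.match(r"gemini-3(\.\d+)?-flash", gemini_model):
        return "nano_banana2"
    # Every other Gemini model falls through to the base model.
    return "nano_banana"


for name in ["gemini-3.1-pro", "gemini-3-flash", "gemini-2.5-pro"]:
    print(name, "->", routed_image_model(name))
```

This can help predict which model-specific parameters (for example resolution) apply to a request made with a Gemini model ID.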

Request Parameters

Image generation uses the same /v1/chat/completions endpoint as text generation.

`model` (string, required)

The ID of the model to use. Can be any supported image generation model or a Gemini/OpenAI multimodal LLM.

`messages` (array, required)

The conversation messages. The user's message should contain the image generation prompt.

{
  "role": "user",
  "content": "A beautiful sunset over mountains"
}

`modalities` (array, required for image generation)

Must be set to ["image"] to generate images.

"modalities": ["image"]

`image_config` (object, optional)

Configuration for image generation. Parameters vary by model.
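
Putting the four request fields together, a request body might be assembled as in the sketch below. The helper name build_image_request is hypothetical; the field names (model, messages, modalities, image_config) come from this section.

```python
import json
from typing import Any, Optional


def build_image_request(
    model: str,
    prompt: str,
    image_config: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a JSON-serializable body for POST /v1/chat/completions."""
    body: dict[str, Any] = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "modalities": ["image"],  # required for image generation
    }
    if image_config:
        # Optional; the accepted keys depend on the chosen model.
        body["image_config"] = dict(image_config)
    return body


body = build_image_request(
    "flux2_pro", "A beautiful sunset over mountains", {"num_images": 1}
)
print(json.dumps(body, indent=2))
```

The dictionary can be sent with any HTTP client, or its fields passed through an OpenAI-compatible SDK.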

Common Parameters

The following parameters are supported by most image generation models:

Parameter	Type	Required	Description	Valid Values	Default
`prompt`	string	Yes	Describe the scene or action to generate	Any text	—
`num_images`	integer	No	Number of images to generate	`1`–`4`	`1`
`rewrite_prompt`	boolean	No	Automatically improve the prompt for better results	`true`, `false`	`true`

Note: num_images is not supported by flux_kontext_edit, gpt_image15_edit, imagine_art, magnific, and midjourney.
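
A client-side check of num_images before sending a request could look like the following sketch. It encodes only the range and the exclusion list stated above; the function name and messages are hypothetical, and the API itself remains the authority on what it accepts.

```python
from typing import Any

# Models listed in the note above as not supporting num_images.
NUM_IMAGES_UNSUPPORTED = {
    "flux_kontext_edit",
    "gpt_image15_edit",
    "imagine_art",
    "magnific",
    "midjourney",
}


def check_num_images(model: str, image_config: dict[str, Any]) -> list[str]:
    """Return a list of problems with num_images; an empty list means none found."""
    problems: list[str] = []
    if "num_images" not in image_config:
        return problems  # the documented default of 1 applies
    value = image_config["num_images"]
    if model in NUM_IMAGES_UNSUPPORTED:
        problems.append(f"num_images is not supported by {model}")
    elif isinstance(value, bool) or not isinstance(value, int) or not 1 <= value <= 4:
        problems.append("num_images must be an integer from 1 to 4")
    return problems


print(check_num_images("flux2_pro", {"num_images": 3}))   # []
print(check_num_images("midjourney", {"num_images": 2}))
```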

Model-Specific Parameters

Parameters are grouped by provider: FLUX, OpenAI, Google, and others.

FLUX

Model ID: flux_pro

Parameter	Type	Required	Description	Valid Values	Default
`size`	string	No	Resolution of the generated image	`1440x1440`, `1440x1024`, `1024x1440`, `1440x768`, `768x1440`, `1024x1024`, `1024x768`, `768x1024`, `768x576`, `576x768`, `640x640`, `768x448`, `448x768`, `640x480`, `480x640`	`1440x1440`
`seed`	number	No	Seed for reproducible results	Any integer	—

Model ID: flux_pro_ultra

Parameter	Type	Required	Description	Valid Values	Default
`aspect_ratio`	string	No	Aspect ratio of the generated image	`1:1`, `21:9`, `16:9`, `4:3`, `3:2`, `2:3`, `3:4`, `9:16`, `9:21`	`1:1`
`raw`	boolean	No	Generate less processed, higher quality images	`true`, `false`	`false`
`image_prompt`	image	No	Reference image to influence the output	Image file (max 1)	—
`image_prompt_strength`	number	No	How closely the output resembles the reference image	`0`–`1` (step `0.01`)	`0.1`
`seed`	number	No	Seed for reproducible results	Any integer	—

Model ID: flux_pro_canny, flux_pro_depth

Parameter	Type	Required	Description	Valid Values	Default
`control_image`	image	Yes	Input image used as structural guidance	Image file	—
`seed`	number	No	Seed for reproducible results	Any integer	—

Model ID: flux_kontext

Parameter	Type	Required	Description	Valid Values	Default
`mode`	string	Yes	Generation quality mode	`pro`, `max`	`pro`
`aspect_ratio`	string	No	Aspect ratio of the generated image	`1:1`, `21:9`, `16:9`, `4:3`, `3:2`, `2:3`, `3:4`, `9:16`, `9:21`	`1:1`
`image_prompt`	image	No	Upload up to 4 images to edit	Image files (max 4)	—
`seed`	number	No	Seed for reproducible results	Any integer	—
`guidance_scale`	number	No	How closely the output matches the prompt	`1`–`20` (step `0.5`)	`3.5`

Note: When image_prompt is provided, num_images, seed, and guidance_scale are ignored.

Model ID: flux_kontext_edit

Parameter	Type	Required	Description	Valid Values	Default
`image_prompt`	image	Yes	Upload up to 4 images to edit	Image files (max 4)	—
`prompt`	string	Yes	Describe the edits to make	Any text	—
`mode`	string	Yes	Generation quality mode	`pro`, `max`	`pro`
`aspect_ratio`	string	No	Aspect ratio of the output image	`1:1`, `21:9`, `16:9`, `4:3`, `3:2`, `2:3`, `3:4`, `9:16`, `9:21`	`1:1`
`seed`	number	No	Seed for reproducible results	Any integer	—
`guidance_scale`	number	No	How closely the output matches the prompt	`1`–`20` (step `0.5`)	`3.5`

Model ID: flux2

Parameter	Type	Required	Description	Valid Values	Default
`aspect_ratio`	string	No	Aspect ratio of the generated image	`square_hd` (1:1 HD), `square` (1:1), `portrait_4_3` (3:4), `portrait_16_9` (9:16), `landscape_4_3` (4:3), `landscape_16_9` (16:9)	`square_hd`
`image_prompt`	image	No	Upload up to 3 reference images	Image files (max 3)	—
`acceleration`	string	No	Speed vs. quality trade-off	`none`, `regular`, `high`	`none`
`seed`	number	No	Seed for reproducible results	Any integer	—
`guidance_scale`	number	No	How closely the output matches the prompt	`0`–`20` (step `0.5`)	`2.5`

Model ID: flux2_pro

Parameter	Type	Required	Description	Valid Values	Default
`aspect_ratio`	string	No	Aspect ratio of the generated image	`square_hd` (1:1 HD), `square` (1:1), `portrait_4_3` (3:4), `portrait_16_9` (9:16), `landscape_4_3` (4:3), `landscape_16_9` (16:9)	`square_hd`
`image_prompt`	image	No	Upload up to 3 reference images	Image files (max 3)	—
`acceleration`	string	No	Speed vs. quality trade-off	`none`, `regular`, `high`	`none`
`seed`	number	No	Seed for reproducible results	Any integer	—
`guidance_scale`	number	No	How closely the output matches the prompt	`0`–`20` (step `0.5`)	`2.5`
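
Several numeric parameters above are given as a range with a step, such as guidance_scale at 0 to 20 in steps of 0.5 for flux2 and flux2_pro. The sketch below checks a value against such a grid before a request is sent. Whether the API rejects off-grid values is not stated here, so treat this as an optional client-side guard; the function name on_step_grid is hypothetical.

```python
import math


def on_step_grid(value: float, low: float, high: float, step: float) -> bool:
    """True if value lies in [low, high] and is a whole number of steps above low."""
    if not low <= value <= high:
        return False
    steps = (value - low) / step
    # Tolerate floating-point noise when comparing against the nearest whole step.
    return math.isclose(steps, round(steps), abs_tol=1e-9)


print(on_step_grid(2.5, 0, 20, 0.5))   # True: the documented default
print(on_step_grid(2.7, 0, 20, 0.5))   # False: between two steps
print(on_step_grid(20.5, 0, 20, 0.5))  # False: above the maximum
```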

OpenAI

Model ID: dalle

Parameter	Type	Required	Description	Valid Values	Default
`size`	string	No	Dimensions of the generated image	`1024x1024`, `1792x1024`, `1024x1792`	`1024x1024`
`quality`	string	No	Image quality level	`standard`, `hd`	`standard`
`style`	string	No	Visual style — `vivid` for hyper-real, `natural` for realistic	`vivid`, `natural`	`vivid`

Model ID: gpt_image15, gpt_image2

Parameter	Type	Required	Description	Valid Values	Default
`image_prompt`	image	No	Upload up to 5 images for inpainting/editing	Image files (max 5)	—
`quality`	string	No	Output image quality	`auto`, `low`, `medium`, `high`	`auto`
`size`	string	No	Dimensions of the output image	`1024x1024`, `1024x1536`, `1536x1024`	`1024x1024`

Note: When image_prompt is provided, the model switches to edit/inpainting mode and num_images, quality, and size are ignored.

Model ID: gpt_image_edit, gpt_image2_edit

Parameter	Type	Required	Description	Valid Values	Default
`image_prompt`	image	Yes	Upload up to 5 images to edit	Image files (max 5)	—
`prompt`	string	Yes	Describe the edits to make	Any text	—
`size`	string	No	Dimensions of the output image	`1024x1024`, `1024x1536`, `1536x1024`	`1024x1024`

Google

Model ID: imagen

Parameter	Type	Required	Description	Valid Values	Default
`aspect_ratio`	string	No	Aspect ratio of the generated image	`1:1`, `16:9`, `4:3`, `3:4`, `9:16`	`1:1`
`negative_prompt`	string	No	Elements to exclude from the image (e.g., `"blur, distortion"`)	Any text	—
`seed`	number	No	Seed for reproducible results	Any integer	—

Model ID: nano_banana

Parameter	Type	Required	Description	Valid Values	Default
`image_prompt`	image	No	Upload up to 4 reference images	Image files (max 4)	—
`aspect_ratio`	string	No	Aspect ratio of the generated image	`1:1`, `9:16`, `16:9`, `3:4`, `4:3`, `3:2`, `2:3`	`1:1`

Model ID: nano_banana2

Parameter	Type	Required	Description	Valid Values	Default
`image_prompt`	image	No	Upload up to 14 reference images (up to 10 objects + 4 characters)	Image files (max 14)	—
`aspect_ratio`	string	No	Aspect ratio of the generated image	`1:1`, `1:4`, `1:8`, `2:3`, `3:2`, `3:4`, `4:1`, `4:3`, `4:5`, `5:4`, `8:1`, `9:16`, `16:9`, `21:9`	`1:1`
`resolution`	string	No	Output resolution	`0.5K`, `1K`, `2K`, `4K`	`1K`

Note: 4K resolution costs double the credits of 1K and 2K.

Model ID: nano_banana_pro

Parameter	Type	Required	Description	Valid Values	Default
`image_prompt`	image	No	Upload up to 4 reference images	Image files (max 4)	—
`aspect_ratio`	string	No	Aspect ratio of the generated image	`1:1`, `21:9`, `16:9`, `3:2`, `4:3`, `5:4`, `4:5`, `3:4`, `2:3`, `9:16`	`1:1`
`resolution`	string	No	Output resolution	`1K`, `2K`, `4K`	`1K`

Note: 4K resolution costs double the credits of 1K and 2K.

Others

Model ID: ideogram

Parameter	Type	Required	Description	Valid Values	Default
`aspect_ratio`	string	No	Aspect ratio of the generated image	`1x1`, `2x3`, `3x2`, `3x4`, `4x3`, `4x5`, `5x4`, `9x16`, `16x9`, `10x16`, `16x10`, `1x2`, `2x1`, `1x3`, `3x1`	`1x1`
`style`	string	No	Visual style of the output	`AUTO`, `GENERAL`, `REALISTIC`, `DESIGN`	`AUTO`
`negative_prompt`	string	No	Elements to exclude from the image	Any text	—
`seed`	number	No	Seed for reproducible results	Any integer	—

Model ID: ideogram_character

Parameter	Type	Required	Description	Valid Values	Default
`character_images`	image	Yes	Character reference images	Image files (max 4)	—
`style_images`	image	No	Style reference images	Image files (max 4)	—
`negative_prompt`	string	No	Elements to exclude from the image	Any text	—
`rendering_speed`	string	No	Speed vs. quality trade-off	`TURBO`, `BALANCED`, `QUALITY`	`BALANCED`
`style`	string	No	Visual style of the output	`AUTO`, `REALISTIC`, `FICTION`	`AUTO`
`aspect_ratio`	string	No	Aspect ratio of the generated image	`1:1 HD`, `1:1`, `3:4`, `9:16`, `4:3`, `16:9`	`1:1 HD`
`seed`	number	No	Seed for reproducible results	Any integer	—

Model ID: recraft

Parameter	Type	Required	Description	Valid Values	Default
`size`	string	No	Dimensions of the generated image	`1024x1024`, `1365x1024`, `1024x1365`, `1536x1024`, `1024x1536`, `1820x1024`, `1024x1820`, `1024x2048`, `2048x1024`, `1434x1024`, `1024x1434`, `1024x1280`, `1280x1024`, `1024x1707`, `1707x1024`	`1024x1024`
`style`	string	No	Visual style — `Realistic` for photorealism, `Digital Illustration` for digital art	`realistic_image`, `digital_illustration`	`realistic_image`

Model ID: recraft_svg

Parameter	Type	Required	Description	Valid Values	Default
`size`	string	No	Dimensions of the generated image	`1024x1024`, `1365x1024`, `1024x1365`, `1536x1024`, `1024x1536`, `1820x1024`, `1024x1820`, `1024x2048`, `2048x1024`, `1434x1024`, `1024x1434`, `1024x1280`, `1280x1024`, `1024x1707`, `1707x1024`	`1024x1024`
`style`	string	No	SVG illustration style	`any`, `engraving`, `line_art`, `line_circuit`, `linocut`	`any`

Model ID: midjourney

Parameter	Type	Required	Description	Valid Values	Default
`version`	string	No	Midjourney model version	`v7`, `v6`, `niji 6`	`v7`
`aspect_ratio`	string	No	Aspect ratio of the generated image	`1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `1:2`, `2:1`	`1:1`
`process_mode`	string	No	Generation speed	`fast`, `turbo`, `relax`	`fast`
`negative_prompt`	string	No	Elements to exclude from the image	Any text	—

Warning: Do not include URLs or -- flags in prompt or negative_prompt. The API automatically appends the appropriate flags based on the selected settings.
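
One way to catch these cases before sending a Midjourney request is a simple text check, sketched below. The patterns are a rough approximation (any http:// or https:// link, and any token starting with --), and the function name is hypothetical; the API may apply different or stricter rules.

```python
import re

URL_PATTERN = re.compile(r"https?://", re.IGNORECASE)
# A "--" flag at the start of the text or after whitespace, e.g. "--ar 16:9".
FLAG_PATTERN = re.compile(r"(?:^|\s)--\w")


def midjourney_prompt_problems(text: str) -> list[str]:
    """Return reasons a prompt or negative_prompt may conflict with the warning above."""
    problems: list[str] = []
    if URL_PATTERN.search(text):
        problems.append("contains a URL")
    if FLAG_PATTERN.search(text):
        problems.append("contains a -- flag")
    return problems


print(midjourney_prompt_problems("a red fox in snow"))            # []
print(midjourney_prompt_problems("a red fox in snow --ar 16:9"))  # ['contains a -- flag']
```

Settings such as aspect ratio and version belong in image_config rather than in the prompt text.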

Model ID: seedream

Parameter	Type	Required	Description	Valid Values	Default
`image_prompt`	image	No	Upload up to 4 reference images	Image files (max 4)	—
`image_size`	string	No	Output image size	`auto_4K` (Auto), `square_hd` (1:1 HD), `square` (1:1), `portrait_4_3` (3:4), `landscape_4_3` (4:3), `portrait_16_9` (9:16), `landscape_16_9` (16:9)	`auto_4K`
`seed`	number	No	Seed for reproducible results	Any integer	—

Model ID: dreamina

Parameter	Type	Required	Description	Valid Values	Default
`aspect_ratio`	string	No	Output aspect ratio	`square_hd` (1:1 HD), `square` (1:1), `landscape_4_3` (4:3), `landscape_16_9` (16:9)	`square_hd`
`seed`	number	No	Seed for reproducible results	Any integer	—

Model ID: magnific

Parameter	Type	Required	Description	Valid Values	Default
`image_prompt`	image	Yes	Image to upscale	Image file	—
`prompt`	string	Yes	Prompt to guide the upscaling process	Any text	—
`scale_factor`	string	No	Upscale multiplier	`2x`, `4x`, `8x`	`2x`
`optimized_for`	string	No	Content type to optimize for	`standard`, `soft_portraits`, `hard_portraits`, `art_n_illustration`, `videogame_assets`, `nature_n_landscapes`, `films_n_photography`, `3d_renders`, `science_fiction_n_horror`	`standard`
`creativity`	number	No	AI creativity level	`-10`–`10`	`0`
`hdr`	number	No	Image detail and definition	`-10`–`10`	`0`
`resemblance`	number	No	Similarity to the original image	`-10`–`10`	`0`
`fractality`	number	No	Fractal detail level	`-10`–`10`	`0`
`engine`	string	No	Upscaling engine to use	`automatic`, `magnific_illusio`, `magnific_sharpy`, `magnific_sparkle`	`automatic`

Model ID: qwen_image_edit

Parameter	Type	Required	Description	Valid Values	Default
`image_prompt`	image	Yes	Image to edit (max 1 file)	Image file	—
`negative_prompt`	string	No	Elements to exclude from the image	Any text	—
`aspect_ratio`	string	No	Output aspect ratio	`square_hd` (1:1 HD), `square` (1:1), `portrait_4_3` (3:4), `portrait_16_9` (9:16), `landscape_4_3` (4:3), `landscape_16_9` (16:9)	`square_hd`
`seed`	number	No	Seed for reproducible results	Any integer	—
`guidance_scale`	number	No	Prompt adherence strength	`1`–`20` (step `0.5`)	`4`
`num_inference_steps`	number	No	Number of diffusion steps (higher = more detail)	`2`–`50`	`30`
`acceleration`	string	No	Speed vs. quality trade-off	`regular`, `high`	`regular`

Model ID: hunyuan_image

Parameter	Type	Required	Description	Valid Values	Default
`negative_prompt`	string	No	Elements to exclude from the image	Any text	—
`image_size`	string	No	Output image size	`square_hd` (1:1 HD), `square` (1:1), `portrait_4_3` (3:4), `portrait_16_9` (9:16), `landscape_4_3` (4:3), `landscape_16_9` (16:9)	`square_hd`
`guidance_scale`	number	No	Prompt adherence strength	`1`–`10` (step `0.5`)	`7.5`

Model ID: imagine_art

Parameter	Type	Required	Description	Valid Values	Default
`aspect_ratio`	string	No	Aspect ratio of the generated image	`1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `2:3`, `1:3`, `3:1`	`1:1`

Response Schema

Image generation responses follow the same unified chat completion format. When modalities includes "image", the response contains image URLs in the images field of the message.

{
  "created": 1677858242,
  "model": "gemini-2.5-pro",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "",
        "images": [
          {
            "type": "image_url",
            "image_url": {
              "url": "https://example.com/generated-image-1.png"
            }
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://example.com/generated-image-2.png"
            }
          }
        ]
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "compute_points_used": 150
  }
}
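
As a sketch of reading this schema without an SDK, the function below collects every image URL from a response that has already been parsed into a dictionary. The function name extract_image_urls is hypothetical; the field names follow the example response above.

```python
from typing import Any


def extract_image_urls(response: dict[str, Any]) -> list[str]:
    """Collect image URLs from choices[*].message.images in a parsed response."""
    urls: list[str] = []
    for choice in response.get("choices", []):
        # "images" may be absent on text-only responses.
        images = choice.get("message", {}).get("images") or []
        for item in images:
            if item.get("type") == "image_url":
                urls.append(item["image_url"]["url"])
    return urls


sample = {
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "",
                "images": [
                    {"type": "image_url",
                     "image_url": {"url": "https://example.com/generated-image-1.png"}},
                    {"type": "image_url",
                     "image_url": {"url": "https://example.com/generated-image-2.png"}},
                ],
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"compute_points_used": 150},
}
print(extract_image_urls(sample))
```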

Code Examples

1. Basic Image Generation

Python SDK

from openai import OpenAI

client = OpenAI(
    base_url="<your base url>",
    api_key="<your_api_key>",
)

response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[
        {
            "role": "user",
            "content": "A beautiful sunset over mountains"
        }
    ],
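    modalities=["image"],
    # image_config is not a named argument of the OpenAI SDK; extra_body
    # merges it into the top level of the request JSON.
    extra_body={
        "image_config": {
            "num_images": 1
        }
    }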
)

for image in response.choices[0].message.images:
    if image['type'] == "image_url":
        print(f"Generated image: {image['image_url']['url']}")

TypeScript/JavaScript

import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: '<your base url>',
  apiKey: '<your_api_key>',
});

const response = await openai.chat.completions.create({
  model: 'flux2_pro',  // use GET /v1/models for exact model IDs
  messages: [
    {
      role: 'user',
      content: 'A beautiful sunset over mountains'
    }
  ],
  modalities: ['image'],
  image_config: {
    num_images: 1
  }
});

(response.choices[0].message as any).images?.forEach((item: any) => {
  if (item.type === 'image_url') {
    console.log('Image URL:', item.image_url.url);
  }
});

cURL

curl -X POST "<your base url>/chat/completions" \
  -H "Authorization: Bearer <your_api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux2_pro",
    "messages": [
      {
        "role": "user",
        "content": "A beautiful sunset over mountains"
      }
    ],
    "modalities": ["image"],
    "image_config": {
      "num_images": 1
    }
  }'

2. Multiple Images

Python SDK

from openai import OpenAI

client = OpenAI(
    base_url="<your base url>",
    api_key="<your_api_key>",
)

response = client.chat.completions.create(
    model="flux2_pro",  # use GET /v1/models for exact model IDs
    messages=[
        {
            "role": "user",
            "content": "A futuristic cityscape at night with neon lights and flying cars"
        }
    ],
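    modalities=["image"],
    # image_config is not a named argument of the OpenAI SDK; extra_body
    # merges it into the top level of the request JSON.
    extra_body={
        "image_config": {
            "num_images": 3,
            "aspect_ratio": "square"  # flux2_pro value for 1:1
        }
    }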
)

image_urls = [
    item['image_url']['url']
    for item in response.choices[0].message.images
    if item['type'] == "image_url"
]
for idx, url in enumerate(image_urls, 1):
    print(f"Image {idx}: {url}")

TypeScript/JavaScript

import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: '<your base url>',
  apiKey: '<your_api_key>',
});

const response = await openai.chat.completions.create({
  model: 'flux2_pro',
  messages: [
    {
      role: 'user',
      content: 'A futuristic cityscape at night with neon lights and flying cars'
    }
  ],
  modalities: ['image'],
  image_config: {
    num_images: 3,
    aspect_ratio: 'square'  // flux2_pro value for 1:1
  }
});

(response.choices[0].message as any).images?.forEach((item: any) => {
  if (item.type === 'image_url') {
    console.log('Image URL:', item.image_url.url);
  }
});

cURL

curl -X POST "<your base url>/chat/completions" \
  -H "Authorization: Bearer <your_api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux2_pro",
    "messages": [
      {
        "role": "user",
        "content": "A futuristic cityscape at night with neon lights and flying cars"
      }
    ],
    "modalities": ["image"],
    "image_config": {
      "num_images": 3,
      "aspect_ratio": "square"
    }
  }'

3. Portrait Orientation

Python SDK

from openai import OpenAI

client = OpenAI(
    base_url="<your base url>",
    api_key="<your_api_key>",
)

response = client.chat.completions.create(
    model="flux2_pro",
    messages=[
        {
            "role": "user",
            "content": "A full-body portrait of a fashion model in elegant evening wear"
        }
    ],
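    modalities=["image"],
    # image_config is not a named argument of the OpenAI SDK; extra_body
    # merges it into the top level of the request JSON.
    extra_body={
        "image_config": {
            "num_images": 1,
            "aspect_ratio": "portrait_4_3"  # flux2_pro value for 3:4
        }
    }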
)

for image in response.choices[0].message.images:
    if image['type'] == "image_url":
        print(f"Portrait image: {image['image_url']['url']}")

TypeScript/JavaScript

import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: '<your base url>',
  apiKey: '<your_api_key>',
});

const response = await openai.chat.completions.create({
  model: 'flux2_pro',
  messages: [
    {
      role: 'user',
      content: 'A full-body portrait of a fashion model in elegant evening wear'
    }
  ],
  modalities: ['image'],
  image_config: {
    num_images: 1,
    aspect_ratio: 'portrait_4_3'  // flux2_pro value for 3:4
  }
});

(response.choices[0].message as any).images?.forEach((item: any) => {
  if (item.type === 'image_url') {
    console.log('Portrait image:', item.image_url.url);
  }
});

cURL

curl -X POST "<your base url>/chat/completions" \
  -H "Authorization: Bearer <your_api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "flux2_pro",
    "messages": [
      {
        "role": "user",
        "content": "A full-body portrait of a fashion model in elegant evening wear"
      }
    ],
    "modalities": ["image"],
    "image_config": {
      "num_images": 1,
      "aspect_ratio": "portrait_4_3"
    }
  }'

4. OpenAI Model with Quality Control

Python SDK

from openai import OpenAI

client = OpenAI(
    base_url="<your base url>",
    api_key="<your_api_key>",
)

response = client.chat.completions.create(
    model="gpt-5.1",
    messages=[
        {
            "role": "user",
            "content": "A whimsical illustration of a magical forest with glowing mushrooms"
        }
    ],
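    modalities=["image"],
    # image_config is not a named argument of the OpenAI SDK; extra_body
    # merges it into the top level of the request JSON.
    extra_body={
        "image_config": {
            "num_images": 1,
            "aspect_ratio": "1:1",
            "quality": "high"
        }
    }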
)

for image in response.choices[0].message.images:
    if image['type'] == "image_url":
        print(f"Image URL: {image['image_url']['url']}")

TypeScript/JavaScript

import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: '<your base url>',
  apiKey: '<your_api_key>',
});

const response = await openai.chat.completions.create({
  model: 'gpt-5.1',
  messages: [
    {
      role: 'user',
      content: 'A whimsical illustration of a magical forest with glowing mushrooms'
    }
  ],
  modalities: ['image'],
  image_config: {
    num_images: 1,
    aspect_ratio: '1:1',
    quality: 'high'
  }
});

(response.choices[0].message as any).images?.forEach((item: any) => {
  if (item.type === 'image_url') {
    console.log('Image URL:', item.image_url.url);
  }
});

cURL

curl -X POST "<your base url>/chat/completions" \
  -H "Authorization: Bearer <your_api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.1",
    "messages": [
      {
        "role": "user",
        "content": "A whimsical illustration of a magical forest with glowing mushrooms"
      }
    ],
    "modalities": ["image"],
    "image_config": {
      "num_images": 1,
      "aspect_ratio": "1:1",
      "quality": "high"
    }
  }'

5. Google Model with Aspect Ratio and Resolution

Python SDK

from openai import OpenAI

client = OpenAI(
    base_url="<your base url>",
    api_key="<your_api_key>",
)

response = client.chat.completions.create(
    model="nano_banana2",
    messages=[
        {
            "role": "user",
            "content": "A professional headshot of a business executive"
        }
    ],
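    modalities=["image"],
    # image_config is not a named argument of the OpenAI SDK; extra_body
    # merges it into the top level of the request JSON.
    extra_body={
        "image_config": {
            "num_images": 1,
            "aspect_ratio": "2:3",
            "resolution": "2K"
        }
    }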
)

for image in response.choices[0].message.images:
    if image['type'] == "image_url":
        print(f"Image URL: {image['image_url']['url']}")

TypeScript/JavaScript

import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: '<your base url>',
  apiKey: '<your_api_key>',
});

const response = await openai.chat.completions.create({
  model: 'nano_banana2',
  messages: [
    {
      role: 'user',
      content: 'A professional headshot of a business executive'
    }
  ],
  modalities: ['image'],
  image_config: {
    num_images: 1,
    aspect_ratio: '2:3',
    resolution: '2K'
  }
});

(response.choices[0].message as any).images?.forEach((item: any) => {
  if (item.type === 'image_url') {
    console.log('Image URL:', item.image_url.url);
  }
});

cURL

curl -X POST "<your base url>/chat/completions" \
  -H "Authorization: Bearer <your_api_key>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nano_banana2",
    "messages": [
      {
        "role": "user",
        "content": "A professional headshot of a business executive"
      }
    ],
    "modalities": ["image"],
    "image_config": {
      "num_images": 1,
      "aspect_ratio": "2:3",
      "resolution": "2K"
    }
  }'
