
Gemini Image Generation Migration Guide

Who is impacted by this change?

Anyone using the following models with /chat/completions:

  • gemini/gemini-2.0-flash-exp-image-generation
  • vertex_ai/gemini-2.0-flash-exp-image-generation

Key Change

info

From v1.77.0, LiteLLM returns a list of images in response.choices[0].message.images instead of a single image in response.choices[0].message.image.

Gemini models now support image generation through chat completions. Images are returned in response.choices[0].message.images with base64 data URLs.
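If your code has to run against both pre- and post-v1.77.0 releases during the rollout, a small compatibility helper can hide the difference. This is a hypothetical sketch, not part of the LiteLLM SDK; the `get_image_data_url` name is an assumption.

```python
from types import SimpleNamespace

def get_image_data_url(message):
    """Return the first image payload from a chat message, handling both
    the v1.77.0+ `images` list and the older single-value shape.
    Hypothetical helper -- not part of the LiteLLM SDK."""
    images = getattr(message, "images", None)
    if images:
        # New shape: list of dicts, each with an "image_url" entry
        return images[0]["image_url"]["url"]
    # Older releases placed the base64 payload directly in `content`
    return getattr(message, "content", None)

# Works with either response shape (stand-ins for the real message object):
new_msg = SimpleNamespace(images=[{"image_url": {"url": "data:image/png;base64,AAAA"}}])
old_msg = SimpleNamespace(images=None, content="iVBORw0KGgo")
print(get_image_data_url(new_msg))  # data:image/png;base64,AAAA
print(get_image_data_url(old_msg))  # iVBORw0KGgo
```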

Before and After

Before

from litellm import completion

response = completion(
    model="gemini/gemini-2.0-flash-exp-image-generation",
    messages=[{"role": "user", "content": "Generate an image of a cat"}],
    modalities=["image", "text"],
)

base_64_image_data = response.choices[0].message.content

After

from litellm import completion

response = completion(
    model="gemini/gemini-2.0-flash-exp-image-generation",
    messages=[{"role": "user", "content": "Generate an image of a cat"}],
    modalities=["image", "text"],
)

# Image is now available in the response
image_url = response.choices[0].message.images[0]["image_url"]["url"]  # "data:image/png;base64,..."

Why the change?

Because the newer gemini-2.5-flash-image-preview model returns both text and images in the same response. This interface lets a developer explicitly access the image or text components of the response. Previously, a developer would have had to search through the message content to find the image generated by the model.

Why the change from image to images? This keeps LiteLLM consistent with the OpenRouter API, using simple, well-known interfaces where possible.
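Since a single response can now carry both modalities, downstream code can separate them explicitly rather than searching the content. A minimal sketch; the `split_message_parts` helper is an assumption, and the dict shape follows the v1.77.0 message format.

```python
def split_message_parts(message: dict):
    """Separate the text and image components of a chat message.
    Hypothetical helper; assumes the v1.77.0 message shape."""
    text = message.get("content") or ""
    image_urls = [img["image_url"]["url"] for img in message.get("images") or []]
    return text, image_urls

# Example message carrying both text and an image:
msg = {
    "role": "assistant",
    "content": "Here's your cat!",
    "images": [
        {"image_url": {"url": "data:image/png;base64,AAAA"}, "index": 0, "type": "image_url"}
    ],
}
text, urls = split_message_parts(msg)
print(text)  # Here's your cat!
print(urls)  # ['data:image/png;base64,AAAA']
```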

Usage

Using the Python SDK

Key Change:

# Before
-- base_64_image_data = response.choices[0].message.content

# After
++ image_url = response.choices[0].message.images[0]["image_url"]["url"]

Basic Image Generation

from litellm import completion
import os

# Set your API key
os.environ["GEMINI_API_KEY"] = "your-api-key"

# Generate an image
response = completion(
    model="gemini/gemini-2.0-flash-exp-image-generation",
    messages=[{"role": "user", "content": "Generate an image of a cat"}],
    modalities=["image", "text"],
)

# Access the generated image
print(response.choices[0].message.content)    # Text response (if any)
print(response.choices[0].message.images[0])  # Image data

Response Format

The image is returned in the message.images field:

{
    "image_url": {
        "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...",
        "detail": "auto"
    },
    "index": 0,
    "type": "image_url"
}
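Because the URL is a base64 data URL, it can be decoded with the standard library and written straight to disk. A sketch; the `save_data_url` name and the output path are assumptions.

```python
import base64
import os
import tempfile

def save_data_url(data_url: str, path: str) -> int:
    """Decode a base64 data URL ("data:image/png;base64,...") and write
    the raw bytes to `path`. Returns the number of bytes written."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or "base64" not in header:
        raise ValueError(f"not a base64 data URL: {header!r}")
    raw = base64.b64decode(payload)
    with open(path, "wb") as f:
        f.write(raw)
    return len(raw)

# Example with a tiny fake payload ("hello" base64-encoded):
path = os.path.join(tempfile.gettempdir(), "cat.png")
n = save_data_url("data:image/png;base64,aGVsbG8=", path)
print(n)  # 5
```

In real use, pass `response.choices[0].message.images[0]["image_url"]["url"]` as the data URL.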

Using the LiteLLM Proxy Server

Key Change:

# Before
-- "content": "base64-image-data..."

# After
++ "images": [{
++     "image_url": {
++         "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...",
++         "detail": "auto"
++     },
++     "index": 0,
++     "type": "image_url"
++ }]

Configuration Setup

  1. Configure your models in config.yaml:

model_list:
  - model_name: gemini-image-gen
    litellm_params:
      model: gemini/gemini-2.0-flash-exp-image-generation
      api_key: os.environ/GEMINI_API_KEY
  - model_name: vertex-image-gen
    litellm_params:
      model: vertex_ai/gemini-2.5-flash-image-preview
      vertex_project: your-project-id
      vertex_location: us-central1

general_settings:
  master_key: sk-1234  # Your proxy API key

  2. Start the proxy server:
litellm --config /path/to/config.yaml

# RUNNING on http://0.0.0.0:4000

Making Requests

Using OpenAI SDK:

from openai import OpenAI

# Point to your proxy server
client = OpenAI(
    api_key="sk-1234",  # Your proxy API key
    base_url="http://0.0.0.0:4000"
)

response = client.chat.completions.create(
    model="gemini-image-gen",
    messages=[{"role": "user", "content": "Generate an image of a cat"}],
    extra_body={"modalities": ["image", "text"]}
)

# Access the generated image
print(response.choices[0].message.content)    # Text response (if any)
print(response.choices[0].message.images[0])  # Image data

Using curl:

curl -X POST 'http://0.0.0.0:4000/v1/chat/completions' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer sk-1234' \
  -d '{
    "model": "gemini-image-gen",
    "messages": [
      {
        "role": "user",
        "content": "Generate an image of a cat"
      }
    ],
    "modalities": ["image", "text"]
  }'

Response format from proxy:

{
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "created": 1704089632,
    "model": "gemini-image-gen",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Here's an image of a cat for you!",
                "images": [
                    {
                        "image_url": {
                            "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...",
                            "detail": "auto"
                        },
                        "index": 0,
                        "type": "image_url"
                    }
                ]
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 10,
        "completion_tokens": 8,
        "total_tokens": 18
    }
}
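When calling the proxy over raw HTTP, as in the curl example, the same field can be pulled out of the parsed JSON body. A sketch; the `image_urls_from_proxy_json` helper is an assumption, and each entry is assumed to follow the message.images format shown earlier.

```python
import json

def image_urls_from_proxy_json(body: str) -> list:
    """Extract every generated image data URL from a /chat/completions
    JSON response body. Hypothetical helper; assumes the v1.77.0 shape."""
    resp = json.loads(body)
    urls = []
    for choice in resp.get("choices", []):
        for img in choice["message"].get("images") or []:
            urls.append(img["image_url"]["url"])
    return urls

# Minimal example body with one image choice:
body = '{"choices": [{"message": {"images": [{"image_url": {"url": "data:image/png;base64,AAAA"}}]}}]}'
print(image_urls_from_proxy_json(body))  # ['data:image/png;base64,AAAA']
```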