Gemini is a family of multimodal AI models by Google that can understand and generate text, images, audio, video, and code. See their model options here. Gemini stands out with native multimodal understanding across images, video, and audio, built-in Google Search for real-time information, File Search for RAG over your documents, native image generation and editing, text-to-speech synthesis, and advanced reasoning with thinking models.

Model Recommendations

| Model | Best For | Key Strengths |
| --- | --- | --- |
| gemini-2.0-flash | Most use-cases | Balanced speed and intelligence |
| gemini-2.0-flash-lite | High-volume tasks | Most cost-effective |
| gemini-2.5-pro | Complex tasks | Advanced reasoning, largest context |
| gemini-3-pro-preview | Latest features | Thought signatures support |

Google enforces rate limits on its APIs. See the docs for more information.

Installation

pip install google-genai agno

Authentication

There are two ways to use the Gemini class: via Google AI Studio (using GOOGLE_API_KEY) or via Vertex AI (using Google Cloud credentials).

Google AI Studio

Set the GOOGLE_API_KEY environment variable. You can get one from Google AI Studio.
export GOOGLE_API_KEY=***
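If you prefer not to rely on the environment variable, you can pass the key to the model directly. This is a minimal sketch using the api_key parameter listed in the Params table below; the key value is a placeholder:
from agno.agent import Agent
from agno.models.google import Gemini

# Pass the API key explicitly instead of reading GOOGLE_API_KEY from the environment
agent = Agent(
    model=Gemini(id="gemini-2.0-flash", api_key="your-api-key"),
)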

Vertex AI

To use Vertex AI in Google Cloud:
  1. Refer to the Vertex AI documentation to set up a project and development environment.
  2. Install the gcloud CLI and authenticate (refer to the quickstart for more details):
gcloud auth application-default login
  3. Enable the Vertex AI API and set the project ID environment variable (alternatively, you can set project_id in the Agent config):
Export the following variables:
export GOOGLE_GENAI_USE_VERTEXAI="true"
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_CLOUD_LOCATION="us-central1"
Or configure directly in your agent:
from agno.agent import Agent
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(
        id="gemini-2.0-flash",
        vertexai=True,
        project_id="your-project-id",
        location="us-central1",
    ),
)
Read more about Vertex AI setup here.

Example

Use Gemini with your Agent:
from agno.agent import Agent
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(id="gemini-2.0-flash-001"),
    markdown=True,
)

# Print the response in the terminal
agent.print_response("Share a 2 sentence horror story.")
View more examples here.

Capabilities

Multimodal Input

Gemini natively understands images, video, audio, and documents. See Google’s vision documentation for supported formats and limits.
from agno.agent import Agent
from agno.media import Image
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(id="gemini-2.0-flash"),
    markdown=True,
)

agent.print_response(
    "Tell me about this image.",
    images=[Image(url="https://upload.wikimedia.org/wikipedia/commons/b/bf/Krakow_-_Kosciol_Mariacki.jpg")],
)
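Video and audio inputs follow the same pattern as images. The sketch below assumes the agno.media Video helper and the videos argument of print_response mirror the Image usage above:
from agno.agent import Agent
from agno.media import Video
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(id="gemini-2.0-flash"),
    markdown=True,
)

# Assumes a local video file; Video mirrors the Image helper used above
agent.print_response(
    "Summarize this clip.",
    videos=[Video(filepath="movie.mp4")],
)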

Image Generation

Generate and edit images using Gemini’s native image generation. See Google’s image generation documentation for more details.
from io import BytesIO
from agno.agent import Agent, RunOutput
from agno.models.google import Gemini
from PIL import Image

agent = Agent(
    model=Gemini(
        id="gemini-2.5-flash-image",
        response_modalities=["Text", "Image"],
    )
)

run_response = agent.run("Make me an image of a cat in a tree.")

if run_response and isinstance(run_response, RunOutput) and run_response.images:
    for image_response in run_response.images:
        image_bytes = image_response.content
        if image_bytes:
            image = Image.open(BytesIO(image_bytes))
            image.save("generated_image.png")
Read more about image generation here.

Search and Grounding

Gemini models support grounding and search capabilities that enable real-time web access. See more details in Google’s documentation. Enable web search by setting search=True:
from agno.agent import Agent
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(id="gemini-2.0-flash-exp", search=True),
    markdown=True,
)

agent.print_response("What are the latest developments in AI?")
For legacy models, use grounding=True instead:
agent = Agent(
    model=Gemini(
        id="gemini-2.0-flash",
        grounding=True,
        grounding_dynamic_threshold=0.7,  # Optional: set threshold
    ),
)
Read more about search and grounding here.

Vertex AI Search

Search over your private knowledge base using Vertex AI Search. See the Vertex AI Search documentation for setup details.
from agno.agent import Agent
from agno.models.google import Gemini

datastore_id = "projects/your-project-id/locations/global/collections/default_collection/dataStores/your-datastore-id"

agent = Agent(
    model=Gemini(
        id="gemini-2.5-flash",
        vertexai=True,
        vertexai_search=True,
        vertexai_search_datastore=datastore_id,
    ),
    markdown=True,
)

agent.print_response("What are our company's policies regarding remote work?")

URL Context

Extract and analyze content from URLs. See Google’s URL context documentation for more details.
from agno.agent import Agent
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(id="gemini-2.5-flash", url_context=True),
    markdown=True,
)

url1 = "https://www.foodnetwork.com/recipes/ina-garten/perfect-roast-chicken-recipe-1940592"
url2 = "https://www.allrecipes.com/recipe/83557/juicy-roasted-chicken/"

agent.print_response(
    f"Compare the ingredients and cooking times from the recipes at {url1} and {url2}"
)
Read more about URL context here.

File Search

Gemini’s File Search enables RAG over your documents with automatic chunking and retrieval. See Google’s File Search documentation for more details.
from pathlib import Path
from agno.agent import Agent
from agno.models.google import Gemini

model = Gemini(id="gemini-2.5-flash")
agent = Agent(model=model, markdown=True)

# Create a File Search store and upload documents
store = model.create_file_search_store(display_name="My Docs")
operation = model.upload_to_file_search_store(
    file_path=Path("documents/sample.txt"),
    store_name=store.name,
    display_name="Sample Document",
)
model.wait_for_operation(operation)

# Configure model to use File Search
model.file_search_store_names = [store.name]

# Query the documents
run = agent.run("What are the key points in the document?")
print(run.content)

# Cleanup
model.delete_file_search_store(store.name)

Speech Generation

Generate audio responses from the model. See Google’s speech generation documentation for available voices and options.
from agno.agent import Agent
from agno.models.google import Gemini
from agno.utils.audio import write_wav_audio_to_file

agent = Agent(
    model=Gemini(
        id="gemini-2.5-flash-preview-tts",
        response_modalities=["AUDIO"],
        speech_config={
            "voice_config": {"prebuilt_voice_config": {"voice_name": "Kore"}}
        },
    )
)

run_output = agent.run("Say cheerfully: Have a wonderful day!")

if run_output.response_audio is not None:
    audio_data = run_output.response_audio.content
    write_wav_audio_to_file("tmp/cheerful_greeting.wav", audio_data)

Context Caching

Cache large contexts to reduce costs and latency. See Google’s context caching documentation for more details.
from agno.agent import Agent
from agno.models.google import Gemini
from google import genai

client = genai.Client()

# Upload file and create cache
txt_file = client.files.upload(file="large_document.txt")
cache = client.caches.create(
    model="gemini-2.0-flash-001",
    config={
        "system_instruction": "You are an expert at analyzing transcripts.",
        "contents": [txt_file],
        "ttl": "300s",
    },
)

# Use the cached content - no need to resend the file
agent = Agent(
    model=Gemini(id="gemini-2.0-flash-001", cached_content=cache.name),
)
run_output = agent.run("Find a lighthearted moment from this transcript")
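When you no longer need the cache, you can delete it through the same client rather than waiting for the TTL to expire (a small sketch assuming the google-genai caches.delete call):
# Remove the cache explicitly instead of letting the 300s TTL expire
client.caches.delete(name=cache.name)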

Thinking Models

Gemini 2.5+ models support extended thinking for complex reasoning tasks. See Google’s thinking documentation for more details.
from agno.agent import Agent
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(id="gemini-2.5-pro", thinking_budget=1280, include_thoughts=True),
    markdown=True,
)

agent.print_response("Solve this logic puzzle...")
You can also use thinking_level for simpler control:
agent = Agent(
    model=Gemini(id="gemini-3-pro-preview", thinking_level="low"),  # "low" or "high"
    markdown=True,
)
Read more about thinking models here.

Structured Outputs

Gemini supports native structured outputs using Pydantic models:
from agno.agent import Agent
from agno.models.google import Gemini
from pydantic import BaseModel

class MovieScript(BaseModel):
    name: str
    genre: str
    storyline: str

agent = Agent(
    model=Gemini(id="gemini-2.0-flash-001"),
    output_schema=MovieScript,
)
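Continuing the example above, the parsed result is returned on the run output (a sketch assuming the run succeeds and run.content is populated as a MovieScript instance):
run = agent.run("Write a movie script about a heist gone wrong.")
movie = run.content  # parsed into a MovieScript instance
print(movie.name, movie.genre)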
Read more about structured outputs here.

Tool Use

Gemini supports function calling to interact with external tools and APIs:
from agno.agent import Agent
from agno.models.google import Gemini
from agno.tools.duckduckgo import DuckDuckGoTools

agent = Agent(
    model=Gemini(id="gemini-2.0-flash-001"),
    tools=[DuckDuckGoTools()],
    markdown=True,
)

agent.print_response("Whats happening in France?")
Read more about tool use here.

Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| id | str | "gemini-2.0-flash-001" | The id of the Gemini model to use |
| name | str | "Gemini" | The name of the model |
| provider | str | "Google" | The provider of the model |
| api_key | Optional[str] | None | Google API key (defaults to GOOGLE_API_KEY env var) |
| vertexai | bool | False | Use Vertex AI instead of AI Studio |
| project_id | Optional[str] | None | Google Cloud project ID for Vertex AI |
| location | Optional[str] | None | Google Cloud region for Vertex AI |
| temperature | Optional[float] | None | Controls randomness in the model’s output |
| top_p | Optional[float] | None | Controls diversity via nucleus sampling |
| top_k | Optional[int] | None | Controls diversity via top-k sampling |
| max_output_tokens | Optional[int] | None | Maximum number of tokens to generate |
| stop_sequences | Optional[list[str]] | None | Sequences where the model should stop generating |
| seed | Optional[int] | None | Random seed for reproducibility |
| logprobs | Optional[bool] | None | Whether to return log probabilities of output tokens |
| presence_penalty | Optional[float] | None | Penalizes new tokens based on whether they appear in the text so far |
| frequency_penalty | Optional[float] | None | Penalizes new tokens based on their frequency in the text so far |
| search | bool | False | Enable Google Search grounding |
| grounding | bool | False | Enable legacy grounding (use search for 2.0+) |
| grounding_dynamic_threshold | Optional[float] | None | Dynamic threshold for grounding |
| url_context | bool | False | Enable URL context extraction |
| vertexai_search | bool | False | Enable Vertex AI Search |
| vertexai_search_datastore | Optional[str] | None | Vertex AI Search datastore path |
| file_search_store_names | Optional[list[str]] | None | File Search store names for RAG |
| file_search_metadata_filter | Optional[str] | None | Metadata filter for File Search |
| response_modalities | Optional[list[str]] | None | Output types: "TEXT", "IMAGE", "AUDIO" |
| speech_config | Optional[dict] | None | TTS voice configuration |
| thinking_budget | Optional[int] | None | Token budget for reasoning (Gemini 2.5+) |
| include_thoughts | Optional[bool] | None | Include thought summaries in response |
| thinking_level | Optional[str] | None | Thinking intensity: "low" or "high" |
| cached_content | Optional[Any] | None | Reference to cached context |
| safety_settings | Optional[list] | None | Content safety configuration |
| function_declarations | Optional[List[Any]] | None | List of function declarations for the model |
| generation_config | Optional[Any] | None | Custom generation configuration |
| generative_model_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for the generative model |
| request_params | Optional[Dict[str, Any]] | None | Additional parameters for the request |
| client_params | Optional[Dict[str, Any]] | None | Additional parameters for client configuration |

Gemini is a subclass of the Model class and has access to the same params.
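
Putting a few of these together, a typical configuration might look like this (a sketch using parameters from the table above; the values are illustrative):
from agno.agent import Agent
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(
        id="gemini-2.0-flash",
        temperature=0.4,          # lower randomness
        top_p=0.9,                # nucleus sampling
        max_output_tokens=1024,   # cap response length
        stop_sequences=["END"],   # stop generation at this marker
    ),
    markdown=True,
)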