Gemini is a family of multimodal AI models by Google that can understand and generate text, images, audio, video, and code. See their model options here. Gemini stands out with native multimodal understanding across images, video, and audio, built-in Google Search for real-time information, File Search for RAG over your documents, native image generation and editing, text-to-speech synthesis, and advanced reasoning with thinking models.

Model Recommendations

| Model | Best For | Key Strengths |
| --- | --- | --- |
| gemini-2.0-flash | Most use-cases | Balanced speed and intelligence |
| gemini-2.0-flash-lite | High-volume tasks | Most cost-effective |
| gemini-2.5-pro | Complex tasks | Advanced reasoning, largest context |
| gemini-3-pro-preview | Latest features | Thought signatures support |

Google enforces rate limits on its APIs. See the docs for more information.

Installation

pip install google-genai agno

Authentication

There are two ways to use the Gemini class: via Google AI Studio (using GOOGLE_API_KEY) or via Vertex AI (using Google Cloud credentials).

Google AI Studio

Set the GOOGLE_API_KEY environment variable. You can get one from Google AI Studio.
export GOOGLE_API_KEY=***
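If you prefer not to rely on the environment variable, you can pass the key to the model directly. This is a minimal sketch using the api_key parameter listed in the Params table below; the key value is a placeholder:
from agno.agent import Agent
from agno.models.google import Gemini

# Pass the API key explicitly instead of reading GOOGLE_API_KEY from the environment
agent = Agent(
    model=Gemini(id="gemini-2.0-flash", api_key="your-api-key"),
)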

Vertex AI

To use Vertex AI in Google Cloud:
  1. Refer to the Vertex AI documentation to set up a project and development environment.
  2. Install the gcloud CLI and authenticate (refer to the quickstart for more details):
gcloud auth application-default login
  3. Enable the Vertex AI API and set the project ID environment variable (alternatively, you can set project_id in the Agent config):
Export the following variables:
export GOOGLE_GENAI_USE_VERTEXAI="true"
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_CLOUD_LOCATION="us-central1"
Or configure directly in your agent:
from agno.agent import Agent
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(
        id="gemini-2.0-flash",
        vertexai=True,
        project_id="your-project-id",
        location="us-central1",
    ),
)
Read more about Vertex AI setup here.

Example

Use Gemini with your Agent:
from agno.agent import Agent
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(id="gemini-2.0-flash-001"),
    markdown=True,
)

# Print the response in the terminal
agent.print_response("Share a 2 sentence horror story.")
View more examples here.

Capabilities

Multimodal Input

Gemini natively understands images, video, audio, and documents. See Google’s vision documentation for supported formats and limits.
from agno.agent import Agent
from agno.media import Image
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(id="gemini-2.0-flash"),
    markdown=True,
)

agent.print_response(
    "Tell me about this image.",
    images=[Image(url="https://upload.wikimedia.org/wikipedia/commons/b/bf/Krakow_-_Kosciol_Mariacki.jpg")],
)
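Video and audio inputs follow the same pattern as images. The sketch below assumes the agno.media Video helper and the videos argument of print_response mirror the Image usage above:
from agno.agent import Agent
from agno.media import Video
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(id="gemini-2.0-flash"),
    markdown=True,
)

# Assumes a local video file; Video mirrors the Image helper used above
agent.print_response(
    "Summarize this clip.",
    videos=[Video(filepath="movie.mp4")],
)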

Image Generation

Generate and edit images using Gemini’s native image generation. See Google’s image generation documentation for more details.
from io import BytesIO
from agno.agent import Agent, RunOutput
from agno.models.google import Gemini
from PIL import Image

agent = Agent(
    model=Gemini(
        id="gemini-2.5-flash-image",
        response_modalities=["Text", "Image"],
    )
)

run_response = agent.run("Make me an image of a cat in a tree.")

if run_response and isinstance(run_response, RunOutput) and run_response.images:
    for image_response in run_response.images:
        image_bytes = image_response.content
        if image_bytes:
            image = Image.open(BytesIO(image_bytes))
            image.save("generated_image.png")
Read more about image generation here.

Search and Grounding

Gemini models support grounding and search capabilities that enable real-time web access. See more details in Google’s documentation. Enable web search by setting search=True:
from agno.agent import Agent
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(id="gemini-2.0-flash-exp", search=True),
    markdown=True,
)

agent.print_response("What are the latest developments in AI?")
For legacy models, use grounding=True instead:
agent = Agent(
    model=Gemini(
        id="gemini-2.0-flash",
        grounding=True,
        grounding_dynamic_threshold=0.7,  # Optional: set threshold
    ),
)
Read more about search and grounding here.

Vertex AI Search

Search over your private knowledge base using Vertex AI Search. See the Vertex AI Search documentation for setup details.
from agno.agent import Agent
from agno.models.google import Gemini

datastore_id = "projects/your-project-id/locations/global/collections/default_collection/dataStores/your-datastore-id"

agent = Agent(
    model=Gemini(
        id="gemini-2.5-flash",
        vertexai=True,
        vertexai_search=True,
        vertexai_search_datastore=datastore_id,
    ),
    markdown=True,
)

agent.print_response("What are our company's policies regarding remote work?")

URL Context

Extract and analyze content from URLs. See Google’s URL context documentation for more details.
from agno.agent import Agent
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(id="gemini-2.5-flash", url_context=True),
    markdown=True,
)

url1 = "https://www.foodnetwork.com/recipes/ina-garten/perfect-roast-chicken-recipe-1940592"
url2 = "https://www.allrecipes.com/recipe/83557/juicy-roasted-chicken/"

agent.print_response(
    f"Compare the ingredients and cooking times from the recipes at {url1} and {url2}"
)
Read more about URL context here.

File Search

Gemini’s File Search enables RAG over your documents with automatic chunking and retrieval. See Google’s File Search documentation for more details.
from pathlib import Path
from agno.agent import Agent
from agno.models.google import Gemini

model = Gemini(id="gemini-2.5-flash")
agent = Agent(model=model, markdown=True)

# Create a File Search store and upload documents
store = model.create_file_search_store(display_name="My Docs")
operation = model.upload_to_file_search_store(
    file_path=Path("documents/sample.txt"),
    store_name=store.name,
    display_name="Sample Document",
)
model.wait_for_operation(operation)

# Configure model to use File Search
model.file_search_store_names = [store.name]

# Query the documents
run = agent.run("What are the key points in the document?")
print(run.content)

# Cleanup
model.delete_file_search_store(store.name)

Speech Generation

Generate audio responses from the model. See Google’s speech generation documentation for available voices and options.
from agno.agent import Agent
from agno.models.google import Gemini
from agno.utils.audio import write_wav_audio_to_file

agent = Agent(
    model=Gemini(
        id="gemini-2.5-flash-preview-tts",
        response_modalities=["AUDIO"],
        speech_config={
            "voice_config": {"prebuilt_voice_config": {"voice_name": "Kore"}}
        },
    )
)

run_output = agent.run("Say cheerfully: Have a wonderful day!")

if run_output.response_audio is not None:
    audio_data = run_output.response_audio.content
    write_wav_audio_to_file("tmp/cheerful_greeting.wav", audio_data)

Context Caching

Cache large contexts to reduce costs and latency. See Google’s context caching documentation for more details.
from agno.agent import Agent
from agno.models.google import Gemini
from google import genai

client = genai.Client()

# Upload file and create cache
txt_file = client.files.upload(file="large_document.txt")
cache = client.caches.create(
    model="gemini-2.0-flash-001",
    config={
        "system_instruction": "You are an expert at analyzing transcripts.",
        "contents": [txt_file],
        "ttl": "300s",
    },
)

# Use the cached content - no need to resend the file
agent = Agent(
    model=Gemini(id="gemini-2.0-flash-001", cached_content=cache.name),
)
run_output = agent.run("Find a lighthearted moment from this transcript")
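When you no longer need the cache, you can delete it through the same client rather than waiting for the TTL to expire (a small sketch assuming the google-genai caches.delete call):
# Remove the cache explicitly instead of letting the 300s TTL expire
client.caches.delete(name=cache.name)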

Thinking Models

Gemini 2.5+ models support extended thinking for complex reasoning tasks. See Google’s thinking documentation for more details.
from agno.agent import Agent
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(id="gemini-2.5-pro", thinking_budget=1280, include_thoughts=True),
    markdown=True,
)

agent.print_response("Solve this logic puzzle...")
You can also use thinking_level for simpler control:
agent = Agent(
    model=Gemini(id="gemini-3-pro-preview", thinking_level="low"),  # "low" or "high"
    markdown=True,
)
Read more about thinking models here.

Structured Outputs

Gemini supports native structured outputs using Pydantic models:
from agno.agent import Agent
from agno.models.google import Gemini
from pydantic import BaseModel

class MovieScript(BaseModel):
    name: str
    genre: str
    storyline: str

agent = Agent(
    model=Gemini(id="gemini-2.0-flash-001"),
    output_schema=MovieScript,
)
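Continuing the example above, the parsed result is returned on the run output (a sketch assuming the run succeeds and run.content is populated as a MovieScript instance):
run = agent.run("Write a movie script about a heist gone wrong.")
movie = run.content  # parsed into a MovieScript instance
print(movie.name, movie.genre)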
Read more about structured outputs here.

Tool Use

Gemini supports function calling to interact with external tools and APIs:
from agno.agent import Agent
from agno.models.google import Gemini
from agno.tools.duckduckgo import DuckDuckGoTools

agent = Agent(
    model=Gemini(id="gemini-2.0-flash-001"),
    tools=[DuckDuckGoTools()],
    markdown=True,
)

agent.print_response("Whats happening in France?")
Read more about tool use here.

Params

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| id | str | "gemini-2.0-flash-001" | The id of the Gemini model to use |
| name | str | "Gemini" | The name of the model |
| provider | str | "Google" | The provider of the model |
| api_key | Optional[str] | None | Google API key (defaults to GOOGLE_API_KEY env var) |
| vertexai | bool | False | Use Vertex AI instead of AI Studio |
| project_id | Optional[str] | None | Google Cloud project ID for Vertex AI |
| location | Optional[str] | None | Google Cloud region for Vertex AI |
| temperature | Optional[float] | None | Controls randomness in the model’s output |
| top_p | Optional[float] | None | Controls diversity via nucleus sampling |
| top_k | Optional[int] | None | Controls diversity via top-k sampling |
| max_output_tokens | Optional[int] | None | Maximum number of tokens to generate |
| stop_sequences | Optional[list[str]] | None | Sequences where the model should stop generating |
| seed | Optional[int] | None | Random seed for reproducibility |
| logprobs | Optional[bool] | None | Whether to return log probabilities of output tokens |
| presence_penalty | Optional[float] | None | Penalizes new tokens based on whether they appear in the text so far |
| frequency_penalty | Optional[float] | None | Penalizes new tokens based on their frequency in the text so far |
| search | bool | False | Enable Google Search grounding |
| grounding | bool | False | Enable legacy grounding (use search for 2.0+) |
| grounding_dynamic_threshold | Optional[float] | None | Dynamic threshold for grounding |
| url_context | bool | False | Enable URL context extraction |
| vertexai_search | bool | False | Enable Vertex AI Search |
| vertexai_search_datastore | Optional[str] | None | Vertex AI Search datastore path |
| file_search_store_names | Optional[list[str]] | None | File Search store names for RAG |
| file_search_metadata_filter | Optional[str] | None | Metadata filter for File Search |
| response_modalities | Optional[list[str]] | None | Output types: "TEXT", "IMAGE", "AUDIO" |
| speech_config | Optional[dict] | None | TTS voice configuration |
| thinking_budget | Optional[int] | None | Token budget for reasoning (Gemini 2.5+) |
| include_thoughts | Optional[bool] | None | Include thought summaries in response |
| thinking_level | Optional[str] | None | Thinking intensity: "low" or "high" |
| cached_content | Optional[Any] | None | Reference to cached context |
| safety_settings | Optional[list] | None | Content safety configuration |
| function_declarations | Optional[List[Any]] | None | List of function declarations for the model |
| generation_config | Optional[Any] | None | Custom generation configuration |
| generative_model_kwargs | Optional[Dict[str, Any]] | None | Additional keyword arguments for the generative model |
| request_params | Optional[Dict[str, Any]] | None | Additional parameters for the request |
| client_params | Optional[Dict[str, Any]] | None | Additional parameters for client configuration |

Gemini is a subclass of the Model class and has access to the same params.
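
Putting a few of these together, a typical configuration might look like this (a sketch using parameters from the table above; the values are illustrative):
from agno.agent import Agent
from agno.models.google import Gemini

agent = Agent(
    model=Gemini(
        id="gemini-2.0-flash",
        temperature=0.4,          # lower randomness
        top_p=0.9,                # nucleus sampling
        max_output_tokens=1024,   # cap response length
        stop_sequences=["END"],   # stop generation at this marker
    ),
    markdown=True,
)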