Skip to main content
This example demonstrates how to provide input to an agent as a dictionary format, specifically for multimodal inputs like text and images.

Code

input_as_dict.py
from agno.agent import Agent
from agno.models.openai import OpenAIChat

Agent(model=OpenAIChat(id="gpt-5-mini")).print_response(
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
                },
            },
        ],
    },
    stream=True,
    markdown=True,
)

Usage

1

Create a virtual environment

Open the Terminal and create a python virtual environment.
python3 -m venv .venv
source .venv/bin/activate
2

Install libraries

pip install -U agno openai
3

Export your OpenAI API key

  export OPENAI_API_KEY="your_openai_api_key_here"
4

Create a Python file

Create a Python file and add the above code.
touch input_as_dict.py
5

Run Agent

python input_as_dict.py
6

Find All Cookbooks

Explore all the available cookbooks in the Agno repository. Click the link below to view the code on GitHub:Agno Cookbooks on GitHub
I