Context Engineering

Context engineering is the process of designing and controlling the information (context) that is sent to language models to guide their behavior and outputs. In practice, building context comes down to one question: “Which information is most likely to achieve the desired outcome?” In Agno, this means carefully crafting the system message, which includes the agent’s description, instructions, and other relevant settings. By thoughtfully constructing this context, you can:

Steer the agent toward specific behaviors or roles.
Constrain or expand the agent’s capabilities.
Ensure outputs are consistent, relevant, and aligned with your application’s needs.
Enable advanced use cases such as multi-step reasoning, tool use, or structured output.

Effective context engineering is an iterative process: refining the system message, trying out different descriptions and instructions, and using features such as schemas, delegation, and tool integrations. The context of an Agno agent consists of the following:

System message: The system message is the main context that is sent to the agent, including all additional context
User message: The user message is the message that is sent to the agent.
Chat history: The chat history is the history of the conversation between the agent and the user.
Additional input: Any few-shot examples or other additional input that is added to the context.

System message context

The following are some key parameters that are used to create the system message:

Description: A description that guides the overall behaviour of the agent.
Instructions: A list of precise, task-specific instructions on how to achieve its goal.
Expected Output: A description of the expected output from the Agent.

The system message is built from the agent’s description, instructions, and other settings.

from agno.agent import Agent
from agno.models.openai import OpenAIChat

agent = Agent(
    model=OpenAIChat(id="gpt-5-mini"),
    description="You are a famous short story writer asked to write for a magazine",
    instructions=["Always write 2 sentence stories."],
    markdown=True,
    debug_mode=True,  # Set to True to view the detailed logs and see the compiled system message
)
agent.print_response("Tell me a horror story.", stream=True)

Will produce the following system message:

You are a famous short story writer asked to write for a magazine                                                                          
<instructions>                                                                                                                             
- Always write 2 sentence stories.                                                                                                         
</instructions>                                                                                                                            
                                                                                                                                            
<additional_information>                                                                                                                   
- Use markdown to format your answer
</additional_information>

System message Parameters

The Agent creates a default system message that can be customized using the following agent parameters:

Parameter	Type	Default	Description
`description`	`str`	`None`	A description of the Agent that is added to the start of the system message.
`instructions`	`List[str]`	`None`	List of instructions added to the system prompt in `<instructions>` tags. Default instructions are also created depending on values for `markdown`, `expected_output` etc.
`additional_context`	`str`	`None`	Additional context added to the end of the system message.
`expected_output`	`str`	`None`	Provide the expected output from the Agent. This is added to the end of the system message.
`markdown`	`bool`	`False`	Add an instruction to format the output using markdown.
`add_datetime_to_context`	`bool`	`False`	If True, add the current datetime to the prompt to give the agent a sense of time. This allows for relative times like “tomorrow” to be used in the prompt
`add_name_to_context`	`bool`	`False`	If True, add the name of the agent to the context.
`add_location_to_context`	`bool`	`False`	If True, add the location of the agent to the context. This allows for location-aware responses and local context.
`add_session_summary_to_context`	`bool`	`False`	If True, add the session summary to the context. See sessions for more information.
`add_memories_to_context`	`bool`	`False`	If True, add the user memories to the context. See memory for more information.
`add_session_state_to_context`	`bool`	`False`	If True, add the session state to the context. See state for more information.
`enable_agentic_knowledge_filters`	`bool`	`False`	If True, let the agent choose the knowledge filters. See knowledge for more information.
`system_message`	`str`	`None`	Override the default system message.
`build_context`	`bool`	`True`	Optionally disable the building of the context.

See the full Agent reference for more information.

How the system message is built

Lets take the following example agent:

from agno.agent import Agent

agent = Agent(
    name="Helpful Assistant",
    role="Assistant",
    description="You are a helpful assistant",
    instructions=["Help the user with their question"],
    additional_context="""
    Here is an example of how to answer the user's question: 
        Request: What is the capital of France?
        Response: The capital of France is Paris.
    """,
    expected_output="You should format your response with `Response: <response>`",
    markdown=True,
    add_datetime_to_context=True,
    add_location_to_context=True,
    add_name_to_context=True,
    add_session_summary_to_context=True,
    add_memories_to_context=True,
    add_session_state_to_context=True,
)

Below is the system message that will be built:

You are a helpful assistant
<your_role>
Assistant
</your_role>

<instructions>
  Help the user with their question
</instructions>

<additional_information>
Use markdown to format your answers.
The current time is 2025-09-30 12:00:00.
Your approximate location is: New York, NY, USA.
Your name is: Helpful Assistant.
</additional_information>

<expected_output>
  You should format your response with `Response: <response>`
</expected_output>

Here is an example of how to answer the user's question: 
    Request: What is the capital of France?
    Response: The capital of France is Paris.

You have access to memories from previous interactions with the user that you can use:

<memories_from_previous_interactions>
- User really likes Digimon and Japan.
- User really likes Japan.
- User likes coffee.
</memories_from_previous_interactions>

Note: this information is from previous interactions and may be updated in this conversation. You should always prefer information from this conversation over the past memories.

Here is a brief summary of your previous interactions:

<summary_of_previous_interactions>
The user asked about information about Digimon and Japan.
</summary_of_previous_interactions>

Note: this information is from previous interactions and may be outdated. You should ALWAYS prefer information from this conversation over the past summary.

<session_state> ... </session_state>

This example is exhaustive and illustrates what is possible with the system message, however in practice you would only use some of these settings.

Additional Context

You can add additional context to the end of the system message using the additional_context parameter. Here, additional_context adds a note to the system message indicating that the agent can access specific database tables.

from textwrap import dedent

from agno.agent import Agent
from agno.models.langdb import LangDB
from agno.tools.duckdb import DuckDbTools

duckdb_tools = DuckDbTools(
    create_tables=False, export_tables=False, summarize_tables=False
)
duckdb_tools.create_table_from_path(
    path="https://phidata-public.s3.amazonaws.com/demo_data/IMDB-Movie-Data.csv",
    table="movies",
)

agent = Agent(
    model=LangDB(id="llama3-1-70b-instruct-v1.0"),
    tools=[duckdb_tools],
    markdown=True,
    additional_context=dedent("""\
    You have access to the following tables:
    - movies: contains information about movies from IMDB.
    """),
)
agent.print_response("What is the average rating of movies?", stream=True)

Tool Instructions

If you are using a Toolkit on your agent, you can add tool instructions to the system message using the instructions parameter:

from agno.agent import Agent
from agno.tools.slack import SlackTools

slack_tools = SlackTools(
    instructions=["Use `send_message` to send a message to the user.  If the user specifies a thread, use `send_message_thread` to send a message to the thread."],
    add_instructions=True,
)
agent = Agent(
    tools=[slack_tools],
)

These instructions are injected into the system message after the <additional_information> tags.

Agentic Memories

If you have enable_agentic_memory set to True on your agent, the agent gets the ability to create/update user memories using tools. This adds the following to the system message:

<updating_user_memories>
- You have access to the `update_user_memory` tool that you can use to add new memories, update existing memories, delete memories, or clear all memories.
- If the user's message includes information that should be captured as a memory, use the `update_user_memory` tool to update your memory database.
- Memories should include details that could personalize ongoing interactions with the user.
- Use this tool to add new memories or update existing memories that you identify in the conversation.
- Use this tool if the user asks to update their memory, delete a memory, or clear all memories.
- If you use the `update_user_memory` tool, remember to pass on the response to the user.
</updating_user_memories>

Agentic Knowledge Filters

If you have knowledge enabled on your agent, you can let the agent choose the knowledge filters using the enable_agentic_knowledge_filters parameter. This will add the following to the system message:

The knowledge base contains documents with these metadata filters: [filter1, filter2, filter3].
Always use filters when the user query indicates specific metadata.

Examples:
1. If the user asks about a specific person like "Jordan Mitchell", you MUST use the search_knowledge_base tool with the filters parameter set to {{'<valid key like user_id>': '<valid value based on the user query>'}}.
2. If the user asks about a specific document type like "contracts", you MUST use the search_knowledge_base tool with the filters parameter set to {{'document_type': 'contract'}}.
4. If the user asks about a specific location like "documents from New York", you MUST use the search_knowledge_base tool with the filters parameter set to {{'<valid key like location>': 'New York'}}.

General Guidelines:
- Always analyze the user query to identify relevant metadata.
- Use the most specific filter(s) possible to narrow down results.
- If multiple filters are relevant, combine them in the filters parameter (e.g., {{'name': 'Jordan Mitchell', 'document_type': 'contract'}}).
- Ensure the filter keys match the valid metadata filters: [filter1, filter2, filter3].

You can use the search_knowledge_base tool to search the knowledge base and get the most relevant documents. Make sure to pass the filters as [Dict[str: Any]] to the tool. FOLLOW THIS STRUCTURE STRICTLY.

Learn about agentic knowledge filters in more detail in the knowledge filters section.

Set the system message directly

You can manually set the system message using the system_message parameter. This will ignore all other settings and use the system message you provide.

from agno.agent import Agent
agent.print_response("What is the capital of France?")

agent = Agent(system_message="Share a 2 sentence story about")
agent.print_response("Love in the year 12000.")

Some models via some model providers, like llama-3.2-11b-vision-preview on Groq, require no system message with other messages. To remove the system message, set build_context=False and system_message=None. Additionally, if markdown=True is set, it will add a system message, so either remove it or explicitly disable the system message.

User message context

The input sent to the Agent.run() or Agent.print_response() is used as the user message.

Additional user message context

You can add additional context to the user message using the following agent parameters: The following agent parameters configure how the user message is built:

add_knowledge_to_context
add_dependencies_to_context

from agno.agent import Agent
agent = Agent(add_knowledge_to_context=True, add_dependencies_to_context=True)
agent.print_response("What is the capital of France?", dependencies={"name": "John Doe"})

The user message that is sent to the model will look like this:

What is the capital of France?

Use the following references from the knowledge base if it helps:
<references>
- Reference 1
- Reference 2
</references>

<additional context>
{"name": "John Doe"}
</additional context>

See dependencies for how to do dependency injection for your user message.

Chat history

If you have database storage enabled on your agent, session history is automatically stored (see sessions). You can now add the history of the conversation to the context using add_history_to_context.

from agno.agent.agent import Agent
from agno.db.postgres import PostgresDb
from agno.models.openai import OpenAIChat

db_url = "postgresql+psycopg://ai:ai@localhost:5532/ai"

db = PostgresDb(db_url=db_url)

agent = Agent(
    model=OpenAIChat(id="gpt-5-mini"),
    db=db,
    session_id="chat_history",
    instructions="You are a helpful assistant that can answer questions about space and oceans.",
    add_history_to_context=True,
    num_history_runs=2,  # Optionally limit the number of history responses to add to the context
)

agent.print_response("Where is the sea of tranquility?")

agent.print_response("What was my first question?")

This will add the history of the conversation to the context, which can be used to provide context for the next message. See more details on sessions.

Few-shot learning with additional input

You can add entire additional messages to your agent’s context using the additional_input parameter. These messages are added to the context as if they were part of the conversation history. You can give your agent examples of how it should respond (also called “few-shot prompting”):

from agno.agent import Agent
from agno.models.message import Message
from agno.models.openai.chat import OpenAIChat

# Few-shot examples showing good support patterns
support_examples = [
    # Example 1: Simple issue resolution
    Message(role="user", content="I forgot my password and can't log in"),
    Message(
        role="assistant",
        content="""
    I'll help you reset your password right away.

    **Steps to Reset Your Password:**
    . Go to the login page and click "Forgot Password"
    . Enter your email address
    . Check your email for the reset link
    . Follow the link to create a new password
    . Use a strong password with uppercase, lowercase, numbers, and symbols

    f you don't receive the email within 5 minutes, please check your spam folder. Let me know if you need any assistance with these steps!
    """.strip(),
    ),
    # Example 2: Billing issue pattern
    Message(
        role="user",
        content="I've been charged twice for the same order and I'm frustrated!",
    ),
    Message(
        role="assistant",
        content="""
    I sincerely apologize for the billing error and the frustration this has caused you.

    **Immediate Action Plan:**
    1. I'll investigate your account to confirm the duplicate charge
    2. Process a full refund for the duplicate transaction
    3. Ensure this doesn't happen again by checking your payment method settings
    4. Provide you with a confirmation number once the refund is processed

    The refund typically takes 3-5 business days to appear on your statement. As an apology for this inconvenience, I'd also like to offer you a 10% discount on your next order.

    Is there anything else I can help you with regarding this issue?
    """.strip(),
    ),
    # Example 3: Technical support pattern
    Message(role="user", content="Your app keeps crashing when I try to upload photos"),
    Message(
        role="assistant",
        content="""
    I'm sorry you're experiencing crashes with photo uploads. Let me help you troubleshoot this issue.

    **Troubleshooting Steps:**
    1. **Check App Version:** Ensure you have the latest version from the app store
    2. **Restart the App:** Close completely and reopen the application
    3. **Check Storage:** Make sure you have sufficient device storage (at least 1GB free)
    4. **Photo Size:** Try uploading smaller photos (under 10MB each)
    5. **Network Connection:** Ensure you have a stable internet connection

    **If the issue persists:**
    - Try uploading one photo at a time instead of multiple
    - Clear the app cache in your device settings
    - Restart your device

    If none of these steps resolve the issue, please let me know your device type and OS version, and I'll escalate this to our technical team for further investigation.
    """.strip(),
    ),
]

if __name__ == "__main__":
    # Create agent with few-shot learning
    agent = Agent(
        name="Customer Support Specialist",
        model=OpenAIChat(id="gpt-5-mini"),
        add_name_to_context=True,
        additional_input=support_examples,  # few-shot learning examples
        instructions=[
            "You are an expert customer support specialist.",
            "Always be empathetic, professional, and solution-oriented.",
            "Provide clear, actionable steps to resolve customer issues.",
            "Follow the established patterns for consistent, high-quality support.",
        ],
        markdown=True,
    )

    for i, example in enumerate(support_examples, 1):
        print(f"📞 Example {i}: {example}")
        print("-" * 50)
        agent.print_response(example)

Context Caching

Most model providers support caching of system and user messages, though the implementation differs between providers. The general approach is to cache repetitive content and common instructions, and then reuse that cached content in subsequent requests as the prefix of your system message. In other words, if the model supports it, you can reduce the number of tokens sent to the model by putting static content at the start of your system message. Agno’s context construction is designed to place the most likely static content at the beginning of the system message.
If you wish to fine-tune this, the recommended approach is to manually set the system message. Some examples of prompt caching:

OpenAI’s prompt caching
Anthropic prompt caching -> See an Agno example of this
OpenRouter prompt caching

Developer Resources

View the Agent schema
View Cookbook

Introduction

Learn

Help

​System message context

​System message Parameters

​How the system message is built

​Additional Context

​Tool Instructions

​Agentic Memories

​Agentic Knowledge Filters

​Set the system message directly

​User message context

​Additional user message context

​Chat history

​Few-shot learning with additional input

​Context Caching

​Developer Resources

System message context

System message Parameters

How the system message is built

Additional Context

Tool Instructions

Agentic Memories

Agentic Knowledge Filters

Set the system message directly

User message context

Additional user message context

Chat history

Few-shot learning with additional input

Context Caching

Developer Resources