The Problem: Verbose Tool Results
When your tools return large responses, their results quickly consume your context window unless they are compressed:

| Component | Cumulative Token Count | Notes |
|---|---|---|
| System Prompt | 1,200 tokens | |
| User Message | 1,300 tokens | |
| LLM Response | 1,500 tokens | |
| Tool Call 1 | 2,500 tokens | |
| Tool Call 2 | 5,700 tokens | 2,500 + 3,200 new |
| Tool Call 3 | 8,500 tokens | 5,700 + 2,800 new |
| Tool Call 4 | 12,000 tokens | 8,500 + 3,500 new |
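
To make the growth in the table concrete, the sketch below recomputes the cumulative counts from per-step increments. The increments are derived by subtracting consecutive rows of the table; the variable names are illustrative only and are not part of any library API.

```python
from itertools import accumulate

# (component, tokens added at this step), derived by subtracting
# consecutive cumulative counts in the table above.
steps = [
    ("System Prompt", 1200),
    ("User Message", 100),
    ("LLM Response", 200),
    ("Tool Call 1", 1000),
    ("Tool Call 2", 3200),
    ("Tool Call 3", 2800),
    ("Tool Call 4", 3500),
]

# Running total of context tokens after each step.
running_totals = list(accumulate(tokens for _, tokens in steps))

# Tokens contributed by the tool call rows alone.
tool_tokens = sum(tokens for name, tokens in steps if name.startswith("Tool Call"))
tool_share = tool_tokens / running_totals[-1]

for (name, added), total in zip(steps, running_totals):
    print(f"{name:<14} +{added:>5,} -> {total:>6,} tokens")
print(f"Tool call rows account for {tool_share:.1%} of the context")
```

In this example the four tool call rows add 10,500 of the 12,000 tokens, which is why tool results are the natural target for compression.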
The Solution: Automatic Compression
Context compression summarizes tool results once a threshold number of tool calls is reached. This lets you:

- Reduce token costs
- Stay within context window limits
- Preserve critical facts and data
- Compress automatically once the feature is enabled
How It Works
Context compression follows a simple pattern:

1. Enable Compression: Set `compress_tool_results=True` on your agent or team. This comes with a default threshold of 3 tool calls. The system monitors tool call results as they come in.
2. Threshold Reached: After the threshold is reached, compression is triggered. Each uncompressed tool call result is individually summarized.
3. Intelligent Summarization: The compression model preserves key facts (numbers, dates, entities, URLs) while removing boilerplate, redundancy, and filler text.
4. The LLM Loop Continues: The compressed tool results are used in the next LLM executions, reducing token usage and extending the life of your context window.
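
The pattern above can be sketched as a small, library-independent simulation. This is not the actual implementation: `ToolResult`, `placeholder_summarize`, and `compress_if_needed` are hypothetical names, the summarizer simply truncates text instead of calling a compression model, and the sketch assumes the threshold counts uncompressed results.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToolResult:
    """Hypothetical container for one tool call result."""
    content: str
    compressed: bool = False


def placeholder_summarize(text: str, max_words: int = 12) -> str:
    """Stand-in for the compression model: keep only the first few words."""
    return " ".join(text.split()[:max_words])


def compress_if_needed(
    results: List[ToolResult],
    threshold: int = 3,
    summarize: Callable[[str], str] = placeholder_summarize,
) -> bool:
    """Summarize each uncompressed result once `threshold` of them have accumulated.

    Assumption: the threshold counts uncompressed results, not all tool calls.
    Returns True if compression was triggered on this call.
    """
    pending = [r for r in results if not r.compressed]
    if len(pending) < threshold:
        return False
    for result in pending:
        # Each result is summarized individually, as in step 2.
        result.content = summarize(result.content)
        result.compressed = True
    return True


# Simulate four verbose tool calls arriving one at a time.
history: List[ToolResult] = []
triggered: List[bool] = []
for i in range(1, 5):
    history.append(ToolResult(content=f"result {i}: " + "verbose filler " * 50))
    triggered.append(compress_if_needed(history))

for result in history:
    print(result.compressed, len(result.content.split()), "words")
```

Under these assumptions, compression fires on the third tool call and shortens the first three results, while the fourth stays uncompressed until enough new results accumulate.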
When using `arun` on Agent or Team, compression is handled asynchronously and the uncompressed tool call results are summarized concurrently.

Enable Compression
Set `compress_tool_results=True` to automatically compress tool results. This comes with a default threshold of 3 tool calls.
For example:
You can also enable `compress_tool_results=True` on individual team members to compress their tool results independently.

Custom Compression
Provide a `CompressionManager` to customize the compression behavior:
When to Use Context Compression
Perfect for:

- Agents with tools that return verbose results (web search, APIs)
- Multi-step workflows with many tool calls
- Long-running sessions where context accumulates
- Production systems where cost matters
Developer Resources
- CompressionManager Reference - Full CompressionManager documentation
- Agent Reference - Agent parameter documentation
- Team Reference - Team parameter documentation
- Cookbook Examples