- Streaming responses
- Tool calling
- Structured outputs
- Async execution
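Streaming and async execution combine naturally: a streamed response is consumed as an async iterator of text chunks. A minimal stdlib-only sketch of that consumption pattern (the `stream_response` function is a hypothetical stand-in for a provider client, not Agno's API):

```python
import asyncio
from typing import AsyncIterator

async def stream_response(chunks: list[str]) -> AsyncIterator[str]:
    """Hypothetical stand-in for a model client that streams text chunks."""
    for chunk in chunks:
        await asyncio.sleep(0)  # yield control, as a real network read would
        yield chunk

async def main() -> str:
    # Accumulate the streamed chunks into the full response text.
    parts = []
    async for chunk in stream_response(["Hello", ", ", "world"]):
        parts.append(chunk)
    return "".join(parts)

print(asyncio.run(main()))  # prints "Hello, world"
```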
HuggingFace supports tool calling through the Agno framework, but not when streaming
responses.
Perplexity supports tool calling through the Agno framework, but its models don't
support tool calls natively, so tool usage may be less reliable than with other
providers.
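When a model supports tool calls natively, it emits a tool name plus JSON arguments and the framework dispatches to a registered function; providers without native support must emulate this through prompting, which is why reliability can suffer. A minimal sketch of the dispatch step (the `get_weather` tool and `dispatch` helper are hypothetical illustrations, not Agno's implementation):

```python
import json

def get_weather(city: str) -> str:
    """Hypothetical tool; a stand-in for a real weather lookup."""
    return f"Sunny in {city}"

# Registry mapping tool names the model may emit to Python callables.
TOOLS = {"get_weather": get_weather}

def dispatch(tool_call: str) -> str:
    """Parse a model-emitted tool call and invoke the matching function."""
    call = json.loads(tool_call)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# A natively tool-capable model emits structured calls like this one:
result = dispatch('{"name": "get_weather", "arguments": {"city": "Paris"}}')
print(result)  # prints "Sunny in Paris"
```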
Vercel V0 doesn't support native structured output, but does support
`use_json_mode=True`.
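For providers without native structured output, JSON mode amounts to instructing the model to emit plain JSON text, which is then parsed into the declared schema. A minimal sketch of that mechanism (the `MovieScript` schema and `parse_json_response` helper are hypothetical, not Agno API):

```python
import json
from dataclasses import dataclass

@dataclass
class MovieScript:
    """Hypothetical output schema the caller declares."""
    title: str
    genre: str

def parse_json_response(raw: str) -> MovieScript:
    """Parse the model's raw JSON text into the declared schema."""
    data = json.loads(raw)
    return MovieScript(**data)

# What a model might return when asked for JSON-formatted output:
raw = '{"title": "Dune", "genre": "sci-fi"}'
script = parse_json_response(raw)
print(script.title)  # prints "Dune"
```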
## Multimodal Support
| Agno Supported Models | Image Input | Audio Input | Audio Responses | Video Input | File Upload |
|---|---|---|---|---|---|
| AIMLAPI | ✅ | | | | |
| Anthropic Claude | ✅ | ✅ | | | |
| AWS Bedrock | ✅ | ✅ | | | |
| AWS Bedrock Claude | ✅ | ✅ | | | |
| Azure AI Foundry | ✅ | | | | |
| Azure OpenAI | ✅ | | | | |
| Cerebras | | | | | |
| Cerebras OpenAI | | | | | |
| Cohere | ✅ | | | | |
| CometAPI | ✅ | | | | |
| DashScope | ✅ | | | | |
| DeepInfra | | | | | |
| DeepSeek | | | | | |
| Fireworks | | | | | |
| Gemini | ✅ | ✅ | ✅ | ✅ | |
| Groq | ✅ | | | | |
| HuggingFace | ✅ | | | | |
| IBM WatsonX | ✅ | | | | |
| InternLM | | | | | |
| LangDB | ✅ | ✅ | | | |
| LiteLLM | ✅ | ✅ | | | |
| LiteLLMOpenAI | ✅ | | | | |
| LlamaCpp | | | | | |
| LM Studio | ✅ | | | | |
| Llama | ✅ | | | | |
| LlamaOpenAI | ✅ | | | | |
| Mistral | ✅ | | | | |
| Nebius | | | | | |
| Nexus | | | | | |
| Nvidia | | | | | |
| Ollama | ✅ | | | | |
| OpenAIChat | ✅ | ✅ | ✅ | | |
| OpenAIResponses | ✅ | ✅ | ✅ | ✅ | |
| OpenRouter | | | | | |
| Perplexity | | | | | |
| Portkey | | | | | |
| Requesty | | | | | |
| Sambanova | | | | | |
| Siliconflow | | | | | |
| Together | ✅ | | | | |
| Vercel V0 | | | | | |
| VLLM | | | | | |
| Vertex AI Claude | ✅ | | | | |
| XAI | ✅ | | | | |