## How It Works
- Insert: When you add content, each chunk is converted to a vector
- Store: Vectors are saved in your vector database
- Search: Queries are embedded and matched against stored vectors by similarity
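The insert/store/search flow above can be sketched end to end in a few lines. This is a toy illustration, not the library's API: the "embedder" is a bag-of-words stand-in and the "vector database" is a plain list searched by brute-force cosine similarity.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Insert: stand-in embedder. Real embedders return dense float vectors;
    # word counts are enough to show the similarity mechanics.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Similarity metric used to match a query vector against stored vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Store: keep (chunk, vector) pairs; a vector database does this at scale.
store = [(chunk, embed(chunk)) for chunk in [
    "cats are small domesticated felines",
    "the stock market closed higher today",
]]

# Search: embed the query the same way, then rank chunks by similarity.
query_vec = embed("domesticated felines")
best = max(store, key=lambda cv: cosine(query_vec, cv[1]))
print(best[0])  # the cat chunk ranks highest
```

A production setup replaces `embed` with a real model and the list with an indexed vector store, but the three steps are the same.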
OpenAIEmbedder is used by default, but you can swap in any supported embedder.
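Swapping embedders works because the rest of the pipeline only depends on a small interface. The sketch below uses illustrative names (`Embedder`, `get_embedding`, `WordLengthEmbedder` are assumptions for this example, not a specific library's classes):

```python
from typing import Protocol

class Embedder(Protocol):
    # The one capability the pipeline relies on; each provider-backed
    # embedder (OpenAI, Ollama, ...) supplies its own implementation.
    def get_embedding(self, text: str) -> list[float]: ...

class WordLengthEmbedder:
    """Illustrative stand-in: any object with get_embedding() can be swapped in."""
    def get_embedding(self, text: str) -> list[float]:
        words = text.split()
        # Fixed-size 4-dim vector of word lengths, zero-padded.
        return [float(len(w)) for w in words[:4]] + [0.0] * max(0, 4 - len(words))

def embed_chunks(chunks: list[str], embedder: Embedder) -> list[list[float]]:
    # Code written against the interface is provider-agnostic.
    return [embedder.get_embedding(c) for c in chunks]

vectors = embed_chunks(["hello world"], WordLengthEmbedder())
print(vectors[0])  # [5.0, 5.0, 0.0, 0.0]
```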
## Configuration
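Most embedders expose a handful of knobs: a model identifier, the output dimensionality, and a batch size. The field names below are assumptions for illustration; check your embedder's reference for the exact parameters it accepts.

```python
from dataclasses import dataclass

@dataclass
class EmbedderConfig:
    # Illustrative configuration surface; exact parameter names vary
    # by provider (these are assumptions, not a specific API).
    id: str = "text-embedding-3-small"  # model identifier
    dimensions: int = 1536              # output vector size
    batch_size: int = 100               # texts sent per API request

# Override only what differs from the defaults.
config = EmbedderConfig(dimensions=256)
print(config.dimensions)  # 256
```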
## Using with Knowledge
## Batch Embeddings
Process multiple texts in a single API call to reduce requests and improve performance.

## Best Practices
## Supported Embedders
| Embedder | Type | Cost | Notes |
|---|---|---|---|
| OpenAI | Hosted | $$ | Default, excellent quality |
| Gemini | Hosted | $$ | Multilingual, Google ecosystem |
| Cohere | Hosted | $$ | Strong retrieval performance |
| Voyage AI | Hosted | $$$ | Specialized for retrieval |
| Mistral | Hosted | $$ | European provider |
| Ollama | Local | Free | Privacy, offline |
| FastEmbed | Local | Free | Fast local embeddings |
| HuggingFace | Local/Hosted | Free/$ | Open source models |
| AWS Bedrock | Hosted | $$ | AWS ecosystem |
| Azure OpenAI | Hosted | $$ | Azure ecosystem |
| Fireworks | Hosted | $ | Fast inference |
| Together | Hosted | $ | Open source models |
| Jina | Hosted | $$ | Multilingual |
| Nebius | Hosted | $ | European provider |
## Choosing an Embedder
| Consideration | Recommendation |
|---|---|
| General use | OpenAI or Gemini |
| Privacy/offline | Ollama or FastEmbed |
| Multilingual | Gemini or Jina |
| Cost-sensitive | Local embedders (free) or Fireworks/Together ($) |
| Best retrieval quality | Voyage AI or Cohere |
- Hosted vs local: Local for privacy and no API costs; hosted for quality and convenience
- Latency and cost: Smaller models are cheaper and faster; larger models often retrieve better
- Language support: Ensure your embedder supports your content’s languages
- Dimension size: Match your vector database’s expected embedding dimensions
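The dimension-size point is worth guarding in code, since inserting a vector whose length differs from the index's configured dimension typically fails or silently degrades search. A minimal sketch (the `check_dimensions` helper is hypothetical, not a library function):

```python
def check_dimensions(vector: list[float], expected_dims: int) -> list[float]:
    # Fail fast before writing to the vector store rather than at query time.
    if len(vector) != expected_dims:
        raise ValueError(
            f"embedder returned {len(vector)} dims, index expects {expected_dims}"
        )
    return vector

vec = [0.1, 0.2, 0.3]
check_dimensions(vec, 3)  # passes

try:
    check_dimensions(vec, 1536)  # index sized for a different model
except ValueError as e:
    print(e)
```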
## Next Steps
- OpenAI Embedder: Default embedder setup
- Ollama Embedder: Local embeddings for privacy
- Vector DB: Store your embeddings
- Chunking: Prepare content for embedding