Agno Knowledge uses content as the building block of any piece of knowledge. Content can be added to knowledge from different sources.
| Content Origin | Description |
| --- | --- |
| Path | Local files or directories containing files |
| Url | Direct links to files or other sites |
| Text | Raw text content |
| Topic | Search topics from repositories like Arxiv or Wikipedia |
| Remote Content | Content from cloud storage providers like S3, GCS, SharePoint, GitHub, and Azure Blob |
Content must be read and chunked before it can be passed to a VectorDB for embedding, storage, and ultimately retrieval. When content is added to Knowledge, a default reader is selected for it. A reader parses the content from its origin and splits it into smaller chunks, which are then embedded by the VectorDB.

You can pass a custom reader, or override the default reader and its settings, when adding content. The example below creates an instance of the standard PDFReader class with a custom chunk_size. The chunking_strategy and other parameters that influence how content is ingested and processed can be overridden in the same way.
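import asyncio

from agno.knowledge.knowledge import Knowledge
from agno.knowledge.reader.pdf_reader import PDFReader

# Override the default chunk size used when splitting PDF content.
reader = PDFReader(
    chunk_size=1000,
)

# vector_db is assumed to be a vector database instance configured elsewhere.
knowledge_base = Knowledge(
    vector_db=vector_db,
)

# Read, chunk, and insert the PDFs under data/pdf using the custom reader.
asyncio.run(
    knowledge_base.ainsert(
        path="data/pdf",
        reader=reader,
    )
)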
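To make the effect of chunk_size more concrete, the following is a minimal, library-independent sketch of fixed-size chunking. It is not Agno's implementation: the chunk_text function and its overlap parameter are hypothetical names used only for illustration, and real readers may split on sentence or document boundaries instead of raw character counts.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 0) -> list[str]:
    """Split text into pieces of at most chunk_size characters.

    Hypothetical helper for illustration only; not part of Agno.
    Consecutive chunks share `overlap` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    # Each chunk starts `step` characters after the previous one.
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


# A 2500-character document split with chunk_size=1000 and no overlap.
document = "x" * 2500
chunks = chunk_text(document, chunk_size=1000)
print([len(c) for c in chunks])  # [1000, 1000, 500]
```

A smaller chunk_size produces more, shorter chunks, which tends to make retrieval more precise but gives each chunk less surrounding context. A larger value has the opposite trade-off.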
For more information about the different readers and their capabilities, check out the Readers page.

Next Steps

Search & Retrieval

Learn how agents search and find information in your knowledge base

Readers

Explore content parsing and ingestion options in detail

Chunking Strategies

Optimize how content is broken down for better search results

Vector Databases

Choose the right storage solution for your knowledge base