RAG Sample: Introduction to PDF Retrieval with SurrealDB

Overview

This sample demonstrates a full Rust-based Retrieval-Augmented Generation (RAG) workflow that:

  • loads PDF text and generates vector embeddings for each page using Ollama,
  • stores the extracted page content and embeddings in SurrealDB,
  • performs semantic vector search to retrieve relevant chunks for a question,
  • synthesizes a final answer using an LLM.

Both the SurrealDB server and the Ollama service must be running locally before you run the sample. The Rust workspace includes the rag_surrealdb member, which contains the code for loading and querying data.

What This Project Does

  • Starts a local SurrealDB service with Docker Compose
  • Uses Ollama (llama3.1) to generate embeddings for document chunks
  • Inserts PDF page chunks and their embeddings into SurrealDB
  • Performs semantic search using SurrealDB’s vector search (HNSW index)
  • Generates a natural language answer using an LLM via Ollama

Plan

  1. Add a new Rust workspace member named rag_surrealdb.
  2. Add pdf-extract to the workspace dependencies.
  3. Create a Docker Compose configuration for SurrealDB.
  4. Implement rag_surrealdb/src/main.rs with two modes:
    • load <path-to-pdf>: read the PDF and insert page chunks into SurrealDB
    • ask "<question>": retrieve relevant PDF chunks from SurrealDB and use them to generate an answer with the LLM
  5. Update documentation with the instructions and sample commands.
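
The two modes from step 4 could be dispatched along the following lines. This is a minimal sketch using only the standard library, not the sample's actual main.rs; the Command enum, the parse_args function, and the usage text are hypothetical names introduced for illustration.

```rust
use std::env;

/// The two modes described in the plan (hypothetical type, for illustration).
#[derive(Debug, PartialEq)]
enum Command {
    /// `load <path-to-pdf>`
    Load { pdf_path: String },
    /// `ask "<question>"`
    Ask { question: String },
}

/// Parses the arguments that follow the program name.
fn parse_args(args: &[String]) -> Result<Command, String> {
    match args {
        [mode, path] if mode == "load" => Ok(Command::Load {
            pdf_path: path.clone(),
        }),
        [mode, question] if mode == "ask" => Ok(Command::Ask {
            question: question.clone(),
        }),
        _ => Err("usage: rag_surrealdb load <path-to-pdf> | ask \"<question>\"".to_string()),
    }
}

fn main() {
    // Skip argv[0], which is the program name.
    let args: Vec<String> = env::args().skip(1).collect();
    match parse_args(&args) {
        Ok(Command::Load { pdf_path }) => println!("would load {pdf_path}"),
        Ok(Command::Ask { question }) => println!("would answer: {question}"),
        // A real CLI would probably also exit with a non-zero status here.
        Err(usage) => eprintln!("{usage}"),
    }
}
```

Keeping the parsing in a function that takes a slice, rather than reading env::args directly, makes both modes easy to unit-test.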

How It Works

Loading Process

When you run the load command, the application performs the following steps:

  1. Text Extraction: Uses pdf-extract to read the PDF file and split it into individual pages.
  2. Indexing: Ensures a vector index (HNSW) is defined in SurrealDB for the embedding field.
  3. Embedding Generation: For each page, it sends the text to Ollama (llama3.1) to generate a 4096-dimensional vector embedding.
  4. Storage: Stores the page text, metadata (source file, page number), and the embedding as a record in the chunk table in SurrealDB.
sequenceDiagram
    participant CLI as rag_surrealdb load
    participant PDF as PDF File
    participant O as Ollama (llama3.1)
    participant S as SurrealDB

    CLI->>PDF: Extract text by pages
    CLI->>S: DEFINE INDEX (HNSW)
    loop For each page
        CLI->>O: Get embedding for page text
        O-->>CLI: 4096-D Vector
        CLI->>S: CREATE chunk (text + embedding)
    end
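
The loading steps above can be sketched without any external crates. The example below is an illustration under stated assumptions, not the sample's real code: the Chunk field names, the index name chunk_embedding_idx, the COSINE distance metric, and the decision to skip blank pages are all assumptions, and the embed closure stands in for the request to Ollama. Only the chunk table, the embedding field, the HNSW index, and the 4096 dimensions come from the description above.

```rust
/// One record destined for the `chunk` table (field names are assumptions).
#[derive(Debug, Clone, PartialEq)]
struct Chunk {
    source: String,
    page: usize,
    text: String,
    embedding: Vec<f32>,
}

/// SurrealQL for the indexing step.
/// The index name and the COSINE metric are assumptions.
fn define_index_statement(dimension: usize) -> String {
    format!(
        "DEFINE INDEX IF NOT EXISTS chunk_embedding_idx ON TABLE chunk \
         FIELDS embedding HNSW DIMENSION {dimension} DIST COSINE;"
    )
}

/// Embeds each page and pairs it with its metadata (steps 3 and 4).
/// `embed` is a stand-in for the call to Ollama.
/// Skipping blank pages is an assumption, not something the sample states.
fn build_chunks<F>(source: &str, pages: &[String], mut embed: F) -> Result<Vec<Chunk>, String>
where
    F: FnMut(&str) -> Result<Vec<f32>, String>,
{
    let mut chunks = Vec::new();
    for (index, page) in pages.iter().enumerate() {
        let text = page.trim();
        if text.is_empty() {
            continue;
        }
        let embedding = embed(text)?;
        chunks.push(Chunk {
            source: source.to_string(),
            page: index + 1, // page numbers are 1-based
            text: text.to_string(),
            embedding,
        });
    }
    Ok(chunks)
}

fn main() {
    let pages = vec![
        "Once upon a time".to_string(),
        "   ".to_string(),
        "there were four little Rabbits".to_string(),
    ];
    // Fake embedder so the sketch runs without Ollama; it returns a tiny
    // placeholder vector, not a real 4096-D embedding.
    let chunks = build_chunks("sample.pdf", &pages, |text| Ok(vec![text.len() as f32; 4]))
        .expect("the fake embedder never fails");

    println!("{}", define_index_statement(4096));
    for chunk in &chunks {
        println!(
            "page {} of {}: {} bytes of text, {} dims",
            chunk.page,
            chunk.source,
            chunk.text.len(),
            chunk.embedding.len()
        );
    }
}
```

Passing the embedder as a closure keeps the page loop independent of the HTTP client, so the same function works with a real Ollama call or a test double.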

Querying Process (RAG)

When you run the ask command, the application executes the RAG workflow:

  1. Question Embedding: Generates a vector embedding for your question using Ollama.
  2. Semantic Search: Performs a K-Nearest Neighbors (KNN) search in SurrealDB to find the top 3 most relevant text chunks based on vector distance.
  3. Context Construction: Combines the retrieved text chunks into a single context block.
  4. Answer Synthesis: Sends the context and your question to Ollama. The LLM uses the provided context to generate a factual answer.
sequenceDiagram
    participant CLI as rag_surrealdb ask
    participant O as Ollama (llama3.1)
    participant S as SurrealDB

    CLI->>O: Get embedding for question
    O-->>CLI: Question Vector
    CLI->>S: Vector Search (KNN)
    S-->>CLI: Top 3 relevant chunks
    CLI->>O: Generate answer (Context + Question)
    O-->>CLI: Synthesized Answer
    CLI->>User: Display Answer
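
Steps 2 to 4 of the query workflow can be illustrated with plain string handling. This is a hedged sketch rather than the sample's implementation: the RetrievedChunk type, the selected field names, the $question_embedding parameter name, the ef value of 40, and the prompt wording are all assumptions. The `<|k,ef|>` operator and vector::distance::knn() are SurrealQL features for KNN search over an HNSW index; check them against the SurrealDB version you run.

```rust
/// A chunk returned by the vector search (hypothetical shape).
struct RetrievedChunk {
    source: String,
    page: usize,
    text: String,
}

/// SurrealQL for the semantic search step.
/// `k` is the number of neighbours; `ef` is the HNSW search-time candidate list size.
fn knn_query(k: usize, ef: usize) -> String {
    format!(
        "SELECT text, source, page, vector::distance::knn() AS distance \
         FROM chunk WHERE embedding <|{k},{ef}|> $question_embedding \
         ORDER BY distance;"
    )
}

/// Combines the retrieved chunks into a single context block.
fn build_context(chunks: &[RetrievedChunk]) -> String {
    chunks
        .iter()
        .map(|c| format!("[{} p.{}]\n{}", c.source, c.page, c.text))
        .collect::<Vec<_>>()
        .join("\n\n")
}

/// Builds the prompt sent to the LLM (the wording is an assumption).
fn build_prompt(context: &str, question: &str) -> String {
    format!(
        "Answer the question using only the context below. \
         If the context is not sufficient, say so.\n\n\
         Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
}

fn main() {
    // Placeholder text standing in for rows returned by SurrealDB.
    let chunks = vec![
        RetrievedChunk {
            source: "sample.pdf".to_string(),
            page: 1,
            text: "Text of the first retrieved page.".to_string(),
        },
        RetrievedChunk {
            source: "sample.pdf".to_string(),
            page: 4,
            text: "Text of the second retrieved page.".to_string(),
        },
    ];

    println!("{}\n", knn_query(3, 40));
    println!("{}", build_prompt(&build_context(&chunks), "Who is Peter?"));
}
```

Tagging each chunk with its source and page number in the context is optional, but it lets the model (and the reader) trace an answer back to a page.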

Setup

Start Services

  1. Start SurrealDB from the repository root:
docker compose up -d surrealdb
  2. Ensure Ollama is running and has the llama3.1 model:
ollama run llama3.1

Load a PDF

cargo run -p rag_surrealdb -- load path/to/document.pdf

Sample PDF

Use the included sample file:

cargo run -p rag_surrealdb -- load data/the-tale-of-peter-rabbit.pdf

Ask a Question

cargo run -p rag_surrealdb -- ask "Who is Peter?"

Notes

  • The sample uses vector embeddings (4096-D) for semantic retrieval.
  • It leverages SurrealDB’s HNSW index for efficient similarity search.
  • The final answer is generated by an LLM (llama3.1 via Ollama) using the retrieved context.
  • The database namespace is rag and the database name is sample.
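
To make the similarity search concrete, the sketch below computes an exact top-k ranking by brute force. An HNSW index approximates this kind of ranking without comparing the query against every stored vector. Cosine distance is used here as an assumption, since the notes above do not state which metric the index is defined with, and the function names are hypothetical.

```rust
/// Cosine distance: 1 - cos(angle between a and b).
/// Returns None for mismatched lengths, empty input, or zero-length vectors.
fn cosine_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(1.0 - dot / (norm_a * norm_b))
}

/// Exact k-nearest-neighbour search: score every candidate, sort, keep k.
/// Returns (candidate index, distance) pairs, nearest first.
fn top_k(query: &[f32], candidates: &[Vec<f32>], k: usize) -> Vec<(usize, f32)> {
    let mut scored: Vec<(usize, f32)> = candidates
        .iter()
        .enumerate()
        .filter_map(|(i, c)| cosine_distance(query, c).map(|d| (i, d)))
        .collect();
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(k);
    scored
}

fn main() {
    // Tiny 2-D vectors for readability; the sample's embeddings are 4096-D.
    let query = vec![1.0, 0.0];
    let candidates = vec![vec![0.0, 1.0], vec![1.0, 0.0], vec![1.0, 1.0]];
    for (index, distance) in top_k(&query, &candidates, 2) {
        println!("candidate {index}: distance {distance:.3}");
    }
}
```

The brute-force version costs one distance computation per stored chunk for every query, which is the cost an index is meant to avoid as the table grows.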

File Structure

rag_surrealdb/
├── Cargo.toml
└── src/
    └── main.rs

Next Steps

  • Implement chunking strategies (e.g., fixed-size with overlap) instead of page-level chunks.
  • Add support for multiple PDF documents and filtered search.
  • Explore hybrid search (combining full-text and vector search) for better accuracy.
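
As a starting point for the first item, fixed-size chunking with overlap could look like the sketch below. It is one possible approach, not part of the current sample: the function name and the character-based sizes are assumptions, and a production version might split on tokens or sentence boundaries instead.

```rust
/// Splits `text` into chunks of at most `size` characters, where each chunk
/// starts `size - overlap` characters after the previous one.
///
/// Panics if `size` is zero or `overlap` is not smaller than `size`,
/// because the window would never advance.
fn chunk_with_overlap(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(size > 0 && overlap < size, "need size > 0 and overlap < size");

    // Work on chars rather than bytes so multi-byte characters are never split.
    let chars: Vec<char> = text.chars().collect();
    let step = size - overlap;

    let mut chunks = Vec::new();
    let mut start = 0;
    while start < chars.len() {
        let end = (start + size).min(chars.len());
        chunks.push(chars[start..end].iter().collect::<String>());
        if end == chars.len() {
            break; // the last window reached the end of the text
        }
        start += step;
    }
    chunks
}

fn main() {
    let text = "Once upon a time there were four little Rabbits.";
    for (index, chunk) in chunk_with_overlap(text, 20, 5).iter().enumerate() {
        println!("chunk {index}: {chunk:?}");
    }
}
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing and embedding some text twice.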