RAG with BrilliantAI & LangChain

Build powerful Retrieval-Augmented Generation systems with BrilliantAI and LangChain.

Overview

Retrieval-Augmented Generation (RAG) enhances LLM responses by incorporating relevant information from external data sources. This tutorial shows you how to build a RAG system using BrilliantAI and LangChain that can process both PDF documents and web content.

Prerequisites

Python 3.8+
A BrilliantAI API key
Basic knowledge of Python and LLMs

Installation

First, install the required packages:

pip install openai langchain langchain-community langchain-openai qdrant-client pypdf beautifulsoup4 requests

Project Structure

We'll create a RAG application with the following components:

Document loading and processing
Text chunking
Embedding generation
Vector storage
Query processing
Response generation
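
Before wiring in real services, it can help to see how these components fit together. The following is a minimal sketch using only the standard library: word overlap stands in for embeddings and a plain list stands in for the vector store. The helper names (tokenize, score, retrieve, build_prompt) are hypothetical and are not part of BrilliantAI, LangChain, or Qdrant; chunking and the LLM call are left out.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase word set; a toy stand-in for a real embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(query: str, passage: str) -> float:
    """Jaccard overlap of word sets; a toy stand-in for vector similarity."""
    q, p = tokenize(query), tokenize(passage)
    union = q | p
    return len(q & p) / len(union) if union else 0.0


def retrieve(query: str, store: List[str], k: int = 1) -> List[str]:
    """Return the k stored passages with the highest overlap score."""
    return sorted(store, key=lambda passage: score(query, passage), reverse=True)[:k]


def build_prompt(query: str, passages: List[str]) -> str:
    """Place the retrieved passages in front of the question."""
    context = "\n\n".join(passages)
    return f"Context:\n{context}\n\nQuestion: {query}"


# A plain list plays the role of the vector store in this sketch.
store = [
    "Qdrant is a vector database that stores embeddings.",
    "PDF files are loaded page by page.",
]
question = "Which database stores embeddings?"
top = retrieve(question, store)
prompt = build_prompt(question, top)
```

In the real system, the prompt would be sent to the LLM; the steps that follow replace each toy piece with BrilliantAI, LangChain, and Qdrant.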

Step 1: Set Up Environment

Create a new Python file and import the necessary libraries:

import os
import re
import requests
from bs4 import BeautifulSoup
from typing import List, Dict, Any

from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.document_loaders import PyPDFLoader
from langchain_community.vectorstores import Qdrant

from openai import OpenAI

Step 2: Configure Vector Storage

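Initialize the BrilliantAI embeddings client and an in-memory Qdrant collection for vector storage:

# Initialize the embeddings client against the BrilliantAI endpoint.
# The key is read from the same environment variable that RAGSystem uses in Step 6.
embeddings = OpenAIEmbeddings(
    base_url="https://api.brilliantai.co",
    api_key=os.getenv("BRILLIANTAI_API_KEY")
)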

# Initialize Qdrant client (local)
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, VectorParams

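qdrant_client = QdrantClient(":memory:")  # In-memory storage for this example; data is lost when the process exits
qdrant_client.create_collection(
    collection_name="documents",
    # size must equal the output dimension of your embedding model (1536 is assumed here)
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE)
)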
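
The collection is configured with Distance.COSINE, so results are ranked by the angle between vectors rather than by their length. As a rough illustration of that measure, here is a pure-Python sketch. The function name cosine_similarity is hypothetical, and Qdrant's own implementation is optimized and may differ in details such as how zero vectors are handled.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        # Mirrors why the collection size must match the embedding dimension.
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Convention chosen for this sketch only
    return dot / (norm_a * norm_b)


# Vectors pointing the same way score close to 1, whatever their length.
same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
# Perpendicular vectors score 0.
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

The dimension check is also the reason the size parameter matters: vectors of a different length than the collection expects cannot be compared.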

Step 3: Document Processing

Create functions to load and process documents:

def load_pdf(file_path: str) -> List[str]:
    """Load and extract text from a PDF file."""
    loader = PyPDFLoader(file_path)
    documents = loader.load()
    return [doc.page_content for doc in documents]

def load_web_page(url: str) -> str:
    """Load and extract text from a web page."""
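    response = requests.get(url, timeout=30)
    response.raise_for_status()  # Fail early on HTTP errors instead of indexing an error page
    soup = BeautifulSoup(response.text, 'html.parser')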
    
    # Remove script and style elements
    for script in soup(["script", "style"]):
        script.extract()
    
    # Get text
    text = soup.get_text()
    
    # Clean text
    lines = (line.strip() for line in text.splitlines())
    chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
    text = '\n'.join(chunk for chunk in chunks if chunk)
    
    return text

def split_text(text: str) -> List[str]:
    """Split text into chunks."""
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200,
        length_function=len,
    )
    return text_splitter.split_text(text)
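
To make chunk_size and chunk_overlap concrete, here is a simplified fixed-window splitter. It is a sketch of the overlap idea only and is not the algorithm RecursiveCharacterTextSplitter uses, which prefers to break on separators such as paragraphs, lines, and spaces. The function name sliding_window_chunks is hypothetical.

```python
from typing import List


def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Cut text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # How far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # The last window already reached the end of the text
    return chunks


# Tiny values so the overlap is easy to see; the tutorial uses 1000 and 200.
chunks = sliding_window_chunks("abcdefghij", chunk_size=4, chunk_overlap=2)
```

Each chunk repeats the last two characters of the previous one, which is what lets a sentence that straddles a boundary still appear whole in at least one chunk.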

Step 4: Indexing Documents

Create a function to index documents in the vector store:

def index_document(content: List[str], metadata: Dict[str, Any] = None):
    """Index document chunks in the vector store."""
    if metadata is None:
        metadata = {}
    
    # Create metadata for each chunk
    metadatas = [metadata for _ in range(len(content))]
    
    # Create Qdrant vector store
    vectorstore = Qdrant(
        client=qdrant_client,
        collection_name="documents",
        embeddings=embeddings,
    )
    
    # Add documents to the vector store
    vectorstore.add_texts(texts=content, metadatas=metadatas)
    
    return vectorstore
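
index_document attaches identical metadata to every chunk, so a retrieved chunk can be traced to its source but not to its position in that source. One possible refinement, sketched here as an assumption rather than something the tutorial requires, is to give each chunk its own metadata dict with an index. The helper build_chunk_metadata and the chunk_index field are hypothetical names.

```python
from typing import Any, Dict, List, Optional


def build_chunk_metadata(
    num_chunks: int, base: Optional[Dict[str, Any]] = None
) -> List[Dict[str, Any]]:
    """Return one independent metadata dict per chunk, each tagged with its position."""
    base = base or {}
    # {**base, ...} creates a new dict each time, so chunks never share one object.
    return [{**base, "chunk_index": i} for i in range(num_chunks)]


metadatas = build_chunk_metadata(3, {"source": "document.pdf", "type": "pdf"})
```

The resulting list could be passed as the metadatas argument of add_texts in place of the repeated dict.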

Step 5: Query Processing

Create a function to retrieve relevant documents:

def retrieve_documents(query: str, k: int = 3):
    """Retrieve relevant documents for a query."""
    vectorstore = Qdrant(
        client=qdrant_client,
        collection_name="documents",
        embeddings=embeddings,
    )
    
    # Search for similar documents
    docs = vectorstore.similarity_search(query, k=k)
    
    return docs
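
The k parameter caps how many chunks come back, but it does not guarantee that all of them are relevant. The sketch below shows top-k selection combined with a minimum score, using made-up scores. The helper top_k and the min_score cutoff are hypothetical, and what counts as a good score depends on the embedding model and distance metric.

```python
import heapq
from typing import List, Tuple


def top_k(
    scored: List[Tuple[str, float]], k: int = 3, min_score: float = 0.0
) -> List[Tuple[str, float]]:
    """Keep the k highest-scoring items whose score is at least min_score."""
    eligible = [item for item in scored if item[1] >= min_score]
    return heapq.nlargest(k, eligible, key=lambda item: item[1])


# Illustrative scores only; real scores come from the vector store.
candidates = [("chunk A", 0.91), ("chunk B", 0.42), ("chunk C", 0.77), ("chunk D", 0.15)]
best = top_k(candidates, k=2, min_score=0.5)
```

A larger k gives the model more context at the cost of a longer prompt; a cutoff keeps weak matches out of the context altogether.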

Step 6: RAG Implementation

Create a RAG class that combines all components:

class RAGSystem:
    def __init__(self, api_key: str = None):
        """Initialize the RAG system."""
        # Get API key from environment variable or parameter
        api_key = api_key or os.getenv("BRILLIANTAI_API_KEY")
        if not api_key:
            raise ValueError("Please set 'BRILLIANTAI_API_KEY' in environment variables.")
        
        # Initialize OpenAI client
        self.client = OpenAI(base_url="https://api.brilliantai.co", api_key=api_key)
        
        # Initialize conversation history
        self.conversation_history = []
    
    def add_pdf(self, file_path: str):
        """Add a PDF document to the system."""
        # Load PDF
        content = load_pdf(file_path)
        
        # Split into chunks
        chunks = []
        for page in content:
            chunks.extend(split_text(page))
        
        # Index chunks
        index_document(chunks, {"source": file_path, "type": "pdf"})
    
    def add_web_page(self, url: str):
        """Add a web page to the system."""
        # Load web page
        content = load_web_page(url)
        
        # Split into chunks
        chunks = split_text(content)
        
        # Index chunks
        index_document(chunks, {"source": url, "type": "web"})
    
    def query(self, user_query: str) -> str:
        """Process a user query and generate a response."""
        # Retrieve relevant documents
        docs = retrieve_documents(user_query)
        
        # Extract text from documents
        context = "\n\n".join([doc.page_content for doc in docs])
        
        # Add user query to conversation history
        self.conversation_history.append({"role": "user", "content": user_query})
        
        # Create prompt with context
        system_message = f"""You are a helpful assistant. Answer the user's question based on the following context:
        
{context}

If the context doesn't contain the answer, say that you don't know based on the available information."""
        
        # Generate response
        messages = [
            {"role": "system", "content": system_message},
            *self.conversation_history
        ]
        
        response = self.client.chat.completions.create(
            model="llama-3-70b",
            messages=messages,
            temperature=0.7,
        )
        
        # Extract response text
        answer = response.choices[0].message.content
        
        # Add assistant response to conversation history
        self.conversation_history.append({"role": "assistant", "content": answer})
        
        return answer
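
The prompt assembly inside query can be checked without calling the API by pulling it into a pure function. This is a sketch of that refactoring under the assumption that the same system prompt wording is kept; build_messages is a hypothetical helper, not part of the class above.

```python
from typing import Dict, List


def build_messages(context: str, history: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the chat messages: one system message with context, then the history."""
    system_message = (
        "You are a helpful assistant. Answer the user's question based on "
        f"the following context:\n\n{context}\n\n"
        "If the context doesn't contain the answer, say that you don't know "
        "based on the available information."
    )
    # A new list is returned, so the caller's history is left unchanged.
    return [{"role": "system", "content": system_message}, *history]


history = [{"role": "user", "content": "What is RAG?"}]
messages = build_messages("RAG combines retrieval with generation.", history)
```

Because the system message is rebuilt on every call, each query sees freshly retrieved context, while the history carries earlier turns forward.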

Step 7: Using the RAG System

Set the BRILLIANTAI_API_KEY environment variable, then use the RAG system as follows:

# Initialize RAG system
rag = RAGSystem()

# Add documents
rag.add_pdf("document.pdf")
rag.add_web_page("https://example.com")

# Query the system
response = rag.query("What is RAG?")
print(response)

# Follow-up question
response = rag.query("Can you provide more details?")
print(response)

Code Explanation

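api_key = api_key or os.getenv("BRILLIANTAI_API_KEY")
Uses the api_key argument if one is passed; otherwise reads the BRILLIANTAI_API_KEY environment variable.
if not api_key: raise ValueError("Please set 'BRILLIANTAI_API_KEY' in environment variables.")
Stops with a clear error when no API key is available from either source.
self.client = OpenAI(base_url="https://api.brilliantai.co", api_key=api_key)
Initializes the OpenAI client so that its requests go to the BrilliantAI API endpoint.
def query(self, user_query: str) -> str:
Retrieves the most relevant chunks for the query, places them in the system message as context, and sends that message together with the conversation history to the BrilliantAI API to generate a response.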
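
The key lookup can be exercised on its own to confirm the order of precedence. This sketch isolates that logic in a hypothetical helper named resolve_api_key; the key strings are placeholders.

```python
import os
from typing import Optional


def resolve_api_key(api_key: Optional[str] = None) -> str:
    """Prefer the explicit argument, then fall back to the environment variable."""
    key = api_key or os.getenv("BRILLIANTAI_API_KEY")
    if not key:
        raise ValueError("Please set 'BRILLIANTAI_API_KEY' in environment variables.")
    return key


# An explicit argument wins regardless of what the environment contains.
explicit = resolve_api_key("key-from-argument")
```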

Conclusion

You've now built a complete RAG system using BrilliantAI and LangChain. This system can:

Process PDF documents and web pages
Split text into manageable chunks
Generate embeddings using BrilliantAI's embedding model
Store vectors in Qdrant
Retrieve relevant context for user queries
Generate responses using BrilliantAI's LLM

For production use, consider:

Using a persistent vector database
Implementing authentication and rate limiting
Adding error handling and logging
Optimizing chunk size and overlap for your specific use case
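
As one example of the error handling and logging point, calls to external services can be wrapped in a retry helper. The following is a minimal sketch using only the standard library. The helper with_retries, its parameters, and the logger name "rag" are hypothetical, and a production version would likely retry only on specific exception types.

```python
import logging
import time
from typing import Callable, TypeVar

T = TypeVar("T")
logger = logging.getLogger("rag")


def with_retries(fn: Callable[[], T], attempts: int = 3, base_delay: float = 0.5) -> T:
    """Call fn, retrying with exponential backoff and logging each failure."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            logger.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise  # Out of attempts: surface the last error
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")


# A stand-in for a network call that fails twice before succeeding.
calls = {"count": 0}


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


result = with_retries(flaky, attempts=3, base_delay=0.0)
```

With the class from Step 6, a call could be wrapped as with_retries(lambda: rag.query("What is RAG?")).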
