
Building a RAG Application with LangChain: Example Code

The combination of Retrieval-Augmented Generation (RAG) and powerful language models makes it possible to build applications that answer questions over your own data effectively. In this blog, we will walk through the steps to build a RAG application using LangChain.


Prerequisites

Before diving into the implementation, ensure you have the required libraries installed. Run the following command (the leading ! is for Jupyter notebooks; drop it when installing from a regular shell):

!pip install langchain langchain_community langchainhub langchain-openai tiktoken chromadb


Setting Up Environment Variables

LangChain integrates with LangSmith for tracing, which is crucial for debugging RAG workflows, and with OpenAI for generating embeddings, the compact numerical representations of text used for efficient retrieval. Set up the required environment variables for both services:

import os
os.environ['LANGCHAIN_TRACING_V2'] = 'true'
os.environ['LANGCHAIN_ENDPOINT'] = 'https://api.smith.langchain.com'
os.environ['LANGCHAIN_API_KEY'] = '<langchain-api-key>'
os.environ['OPENAI_API_KEY'] = '<openai-api-key>'
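
Hardcoding keys as above is fine for a quick demo, but it is safer to read them at runtime. A minimal alternative, assuming an interactive session, uses Python's built-in getpass module:

import getpass

# Prompt for the keys at runtime so they never end up in source control
os.environ['LANGCHAIN_API_KEY'] = getpass.getpass('LangChain API key: ')
os.environ['OPENAI_API_KEY'] = getpass.getpass('OpenAI API key: ')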

Step 1: Indexing Content

Indexing is the process of preparing your dataset for retrieval. In this example, we load and process a blog post for indexing.

Loading the Blog Content

We use WebBaseLoader to scrape the content from a blog URL. A BeautifulSoup SoupStrainer limits parsing to the post title, header, and body classes, so navigation and other page boilerplate are skipped:

import bs4
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader(
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-content", "post-title", "post-header")
        )
    ),
)
blog_docs = loader.load()
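
To verify what was loaded, you can inspect the returned documents. This is just a sanity check; the exact content depends on the page at crawl time:

# blog_docs is a list of Document objects, each with page_content and metadata
print(len(blog_docs))                   # number of documents loaded (one per URL)
print(blog_docs[0].page_content[:200])  # preview the first 200 characters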

Splitting the Content

Large documents need to be divided into manageable chunks for efficient retrieval. Chunking lets the system focus on small, relevant sections of data instead of scanning an entire document; in legal document review or scientific research, for example, it helps pinpoint specific information quickly, improving both the speed and the accuracy of retrieval. We use RecursiveCharacterTextSplitter with a tiktoken encoder, so chunk sizes are measured in tokens rather than characters (here, 300 tokens per chunk with a 50-token overlap):

from langchain.text_splitter import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(
    chunk_size=300,
    chunk_overlap=50
)
splits = text_splitter.split_documents(blog_docs)
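
You can confirm the split by checking how many chunks were produced and previewing one. The chunk count shown will depend on the source text:

print(len(splits))             # how many chunks were produced
print(splits[0].page_content)  # contents of the first chunk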

Indexing with Embeddings

The document chunks are converted into vector embeddings using OpenAI’s embedding model and stored in a vector database (Chroma):

from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()


Step 2: Retrieval

The retriever provides the search functionality for fetching the most relevant chunks of content for a query. For example, if you ask, ‘What are the key components of an AI agent?’, the retriever returns the indexed sections most similar to that question. You can customize retrieval behavior by setting parameters such as the number of results to return (k); here we fetch only the single best chunk:

retriever = vectorstore.as_retriever(search_kwargs={"k": 1})
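
To see retrieval in action, you can query the retriever directly. Below is a minimal sketch using get_relevant_documents (newer LangChain versions also accept retriever.invoke with the same query string):

# With k=1, this returns a list containing only the single best-matching chunk
docs = retriever.get_relevant_documents("What are the key components of an AI agent?")
for doc in docs:
    print(doc.page_content)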

Step 3: Generating Responses

With the retriever in place, we now configure a language model to generate responses based on the retrieved context.

Setting Up the Prompt

The prompt defines how the model should format and generate the response:

from langchain.prompts import ChatPromptTemplate

template = """Answer the question based only on the following context:
{context}

Question: {question}
"""

prompt = ChatPromptTemplate.from_template(template)
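
To see how the template renders before wiring it into a chain, you can format it with sample values. The context string below is just an illustrative placeholder, not retrieved content:

# Render the template into chat messages with the placeholders filled in
messages = prompt.format_messages(
    context="Agents rely on planning, memory, and tool use.",
    question="What are the key components of an AI agent?"
)
print(messages[0].content)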

Configuring the Language Model

We use OpenAI’s gpt-3.5-turbo model to handle the generation task. The temperature is set to 0 to make outputs as deterministic as possible:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

Step 4: Building the RAG Chain

LangChain provides a modular pipeline for combining retrieval and generation steps into a unified chain:

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

rag_chain = (
    # Fetch context with the retriever; pass the question through unchanged
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt               # fill the template with context and question
    | llm                  # generate the answer
    | StrOutputParser()    # extract the plain-text response
)

Step 5: Querying the Application

Finally, invoke the RAG chain with your question to get an answer grounded in the retrieved context:

response = rag_chain.invoke("What is the difference between Self-Reflection and Task Decomposition?")
print(response)
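
Because the chain is a LangChain Runnable, you can also stream the answer token by token instead of waiting for the full response, using the standard stream interface:

# Each chunk is a piece of the generated answer string
for chunk in rag_chain.stream("What is the difference between Self-Reflection and Task Decomposition?"):
    print(chunk, end="", flush=True)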

Conclusion

By following the steps outlined above, you can build a powerful RAG application capable of answering questions based on indexed content. The combination of LangChain’s modularity, OpenAI’s embeddings, and Chroma’s vector store makes the process seamless.

Start experimenting today and expand your application’s capabilities by integrating additional datasets, refining prompts, or enhancing retrieval strategies.
