Retrieval-Augmented Generation (RAG) is a generative AI technique that combines retrieval-based search with large language models (LLMs) to improve response accuracy and contextual relevance. Unlike traditional retrieval systems, which simply return existing documents, or generative models that rely solely on pre-trained knowledge, RAG dynamically retrieves information relevant to the query and supplies it as context to the LLM when generating a response. LangGraph, an advanced extension of LangChain, provides a structured workflow for developing RAG applications. This guide walks through the process of building a RAG system using LangGraph, with example implementations.
Setting Up the Environment
To get started, we need to install the necessary dependencies. The following command ensures that the required LangChain and LangGraph packages, along with requests and BeautifulSoup for fetching articles, are available:
!pip install langchain-openai langchain-community langchain-text-splitters langgraph requests beautifulsoup4 --quiet --upgrade
Next, configure environment variables to enable seamless API interactions. We set the LangSmith API key and enable tracing for observability, and set the OpenAI API key to integrate with OpenAI chat models.
import os
os.environ["LANGSMITH_TRACING"] = "true"
os.environ["LANGSMITH_API_KEY"] = 'lsv2_pt-xxx'
os.environ["OPENAI_API_KEY"] = 'sk-proj-xxxx'
Initializing the Language Model
A core component of the RAG system is the LLM, which generates responses by incorporating the contextual information retrieved for the user's question. Here, we initialize OpenAI’s GPT-4o-mini model:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o-mini")
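As a quick sanity check (assuming the API key is valid), the model can be invoked once directly:
# Smoke test: send a trivial prompt and print the model's reply
reply = llm.invoke("Reply with the single word: ready")
print(reply.content)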
Selecting an Embedding Model and Vector Store
To efficiently retrieve relevant information, we use an embedding model to convert text into vector representations. An embedding model maps words, sentences, or entire documents into high-dimensional numerical vectors, capturing semantic relationships between them. This transformation enables efficient similarity searches, allowing the system to retrieve contextually relevant information based on query inputs. OpenAI provides several powerful embedding models, including text-embedding-3-large, text-embedding-3-small, and text-embedding-ada-002, each designed for different performance and efficiency trade-offs. text-embedding-3-large offers high accuracy and deeper semantic understanding, making it suitable for complex retrieval tasks, while text-embedding-3-small is optimized for lower computational costs with good performance. text-embedding-ada-002, a widely used earlier model, balances performance and efficiency for various natural language processing tasks.
The following code initializes the embedding model used to convert text into vector representations. These vectors are stored in an in-memory vector database for quick lookup:
from langchain_openai import OpenAIEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore
embedding_model = OpenAIEmbeddings(model="text-embedding-3-large")
vectorstore = InMemoryVectorStore(embedding_model)
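To make the idea of vector representations concrete, here is a small illustration (not part of the pipeline) that embeds a sample question and inspects the resulting vector; for text-embedding-3-large the default dimensionality is 3072:
# Illustration only: embed a sample sentence and inspect the dense vector
sample_vector = embedding_model.embed_query("What is a digital arrest scam?")
print(len(sample_vector))   # vector dimensionality (3072 by default for this model)
print(sample_vector[:5])    # first few components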
Collecting Data for the Knowledge Base
To enable retrieval, we need a dataset. In this example, we extract content from publicly available news articles related to digital arrest scams. This can be built into a chat application that helps users check whether they are being targeted by a digital arrest scam. The idea is to take content from recent news articles, create embeddings, and store them in a vector store. When a user sends a query, the pieces of information matching the query are retrieved from the vector store, and the context and query are passed as a prompt to the LLM to generate the answer.
import requests
from bs4 import BeautifulSoup
import json
urls = [
"https://timesofindia.indiatimes.com/city/vijayawada/seven-cybercons-arrested-for-digital-arrest-scam-in-andhra-pradesh/articleshow/117649089.cms",
"https://timesofindia.indiatimes.com/city/bengaluru/digital-arrest-fraud-elderly-law-professor-in-bengaluru-duped-of-rs-7-lakh/articleshow/117751902.cms"
]
documents = []
for url in urls:
    try:
        response = requests.get(url)
        if response.status_code == 200:
            soup = BeautifulSoup(response.text, 'html.parser')
            # The article body is embedded in JSON-LD <script> tags on these pages
            script_tags = soup.find_all('script', {'type': 'application/ld+json'})
            for script in script_tags:
                try:
                    json_data = json.loads(script.string)
                    if "articleBody" in json_data:
                        documents.append(json_data["articleBody"])
                        break
                except (json.JSONDecodeError, TypeError):
                    continue
        else:
            print(f"Failed to fetch {url}, Status code: {response.status_code}")
    except requests.RequestException as e:
        print(f"Error fetching {url}: {e}")
Processing Documents and Storing Vectors
To optimize retrieval, we split lengthy documents into smaller segments before storing them in the vector database. This process improves search precision and contextual matching:
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
docs = [Document(page_content=article) for article in documents]
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
all_splits = text_splitter.split_documents(docs)
vectorstore.add_documents(documents=all_splits)
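As an optional check, we can query the vector store directly to confirm that relevant chunks come back; the query string below is only an example:
# Sanity check: retrieve the two chunks most similar to a sample query
hits = vectorstore.similarity_search("How do digital arrest scammers contact victims?", k=2)
for hit in hits:
    print(hit.page_content[:150], "...")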
Designing the RAG Workflow
Setting Up the Prompt
The code initializes a predefined prompt template from LangChain’s hub to guide the response generation process. By pulling the “rlm/rag-prompt” template, it ensures that the language model follows a structured format when generating answers.
from langchain import hub
prompt = hub.pull("rlm/rag-prompt")
This is what the RAG prompt looks like:
You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:
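To see how the template is filled at runtime, it can be invoked with placeholder values (the question and context strings below are made up for illustration):
# Fill the template with dummy values and print the resulting message
example = prompt.invoke({"question": "What is a digital arrest scam?",
                         "context": "Scammers impersonate police over video calls."})
print(example.to_messages()[0].content)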
Defining the System State
The application state in this RAG workflow is defined using a TypedDict named State, which organizes data at different stages of processing. It includes the user’s input question, a list of retrieved documents providing contextual information, and the generated response. This structured representation ensures smooth data flow between retrieval and generation steps, enabling efficient knowledge retrieval and response synthesis within the LangGraph framework.
The RAG system requires a structured state that tracks user queries, retrieved contexts, and generated responses:
from langgraph.graph import START, StateGraph
from typing_extensions import List, TypedDict
class State(TypedDict):
    question: str
    context: List[Document]
    answer: str
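At runtime a State is simply a dictionary whose keys are filled in as the workflow progresses; for illustration:
# Example of a state as it looks before the retrieval step runs
example_state: State = {
    "question": "What is a digital arrest scam?",
    "context": [],   # populated by the retrieve step
    "answer": "",    # populated by the generate step
}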
Implementing Retrieval and Response Generation
The retrieval function fetches relevant content from the vector database, while the generation function synthesizes an AI-driven response:
def retrieve(state: State):
    retrieved_docs = vectorstore.similarity_search(state["question"])
    return {"context": retrieved_docs}

def generate(state: State):
    docs_content = "\n\n".join(doc.page_content for doc in state["context"])
    messages = prompt.invoke({"question": state["question"], "context": docs_content})
    response = llm.invoke(messages)
    return {"answer": response.content}
Building and Running the Workflow
The LangGraph workflow connects retrieval and generation in a structured manner to ensure seamless execution:
graph_builder = StateGraph(State).add_sequence([retrieve, generate])
graph_builder.add_edge(START, "retrieve")
graph = graph_builder.compile()
Testing the system with an example query:
response = graph.invoke({"question": "List three different scenarios of digital arrest?"})
print(response["answer"])
Conclusion
This guide provided a step-by-step approach to building a RAG system using LangGraph and LangChain. By integrating retrieval with AI-generated responses, we created a structured knowledge retrieval system that can process queries with improved accuracy.
For further learning, explore the official documentation for LangChain and LangGraph. Engaging with research papers on Retrieval-Augmented Generation and participating in community forums will provide deeper insights into this evolving field.