Welcome to this comprehensive guide on constructing an AI-enhanced Chat Application. We will create a dynamic chat interface capable of delving into PDF documents to extract and provide summaries, key facts, and answers to your queries. By the end of this tutorial, you’ll have a powerful tool at your disposal, transforming the way you interact with and utilize the information contained within PDFs.
This tutorial will demonstrate how to build a PDF Chat application: ingest PDF documents into Couchbase as vector embeddings and chat with them using Retrieval-Augmented Generation (RAG) with LangChain and OpenAI.
Note that this tutorial is designed to work with the latest Couchbase Node.js SDK (4.3.0+). It will not work with older versions of the SDK.
Vector Search in Couchbase is only supported in Couchbase Server 7.6+.
git clone https://github.com/couchbase-examples/vector-search-nodejs.git
All dependencies will be installed by running the npm install command from the root directory of the project.
npm install
To learn more about connecting to your Capella cluster, please follow these instructions.
Specifically, you need to do the following:
Create a bucket named pdf-chat. We will use the _default scope and _default collection of this bucket.
We need to create the Search Index on the Full Text Service in Couchbase. For this demo, you can import the following index using the instructions.
You may also create the vector index using the Search UI on either Couchbase Capella or a Couchbase self-managed server.
Here, we are creating the index pdf_search on the documents. The vector field is set to embedding with 1536 dimensions, and the text field is set to text. We are also indexing and storing all the fields under metadata in the document as a dynamic mapping to account for varying document structures. The similarity metric is set to dot_product. If any of these parameters change, please adapt the index accordingly; in particular, the sourceName in the definition below must match the bucket you created (pdf-chat in this tutorial).
{
  "name": "pdf_search",
  "type": "fulltext-index",
  "params": {
    "doc_config": {
      "docid_prefix_delim": "",
      "docid_regexp": "",
      "mode": "scope.collection.type_field",
      "type_field": "type"
    },
    "mapping": {
      "default_analyzer": "standard",
      "default_datetime_parser": "dateTimeOptional",
      "default_field": "_all",
      "default_mapping": {
        "dynamic": true,
        "enabled": false
      },
      "default_type": "_default",
      "docvalues_dynamic": false,
      "index_dynamic": true,
      "store_dynamic": false,
      "type_field": "_type",
      "types": {
        "_default._default": {
          "dynamic": true,
          "enabled": true,
          "properties": {
            "embedding": {
              "enabled": true,
              "dynamic": false,
              "fields": [
                {
                  "dims": 1536,
                  "index": true,
                  "name": "embedding",
                  "similarity": "dot_product",
                  "type": "vector",
                  "vector_index_optimized_for": "recall"
                }
              ]
            },
            "metadata": {
              "dynamic": true,
              "enabled": true
            },
            "text": {
              "enabled": true,
              "dynamic": false,
              "fields": [
                {
                  "index": true,
                  "name": "text",
                  "store": true,
                  "type": "text"
                }
              ]
            }
          }
        }
      }
    },
    "store": {
      "indexType": "scorch",
      "segmentVersion": 16
    }
  },
  "sourceType": "gocbcore",
  "sourceName": "pdf-docs",
  "sourceParams": {},
  "planParams": {
    "maxPartitionsPerPIndex": 64,
    "indexPartitions": 16,
    "numReplicas": 0
  }
}
Copy the .env.template file from the repo, rename it to .env (.env.local in case of local development), and replace the placeholders with the actual values for your environment.
All configuration for communication with the database is read from the environment variables.
OPENAI_API_KEY=<open_ai_api_key>
DB_CONN_STR=<connection_string_for_couchbase_cluster>
DB_USERNAME=<username_for_couchbase_cluster>
DB_PASSWORD=<password_for_couchbase_cluster>
DB_BUCKET=<name_of_bucket_to_store_documents>
DB_SCOPE=<name_of_scope_to_store_documents>
DB_COLLECTION=<name_of_collection_to_store_documents>
INDEX_NAME=<name_of_fts_index_with_vector_support>
The OpenAI API key is required for generating embeddings and querying the LLM.
The connection string expects the couchbases:// or couchbase:// part.
For this tutorial, DB_BUCKET=pdf-chat, DB_SCOPE=_default, DB_COLLECTION=_default, and INDEX_NAME=pdf_search.
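For reference, a filled-in .env might look like the following (all values below are placeholders; use your own cluster's connection string and credentials):

OPENAI_API_KEY=sk-...
DB_CONN_STR=couchbases://cb.yourcluster.cloud.couchbase.com
DB_USERNAME=your_database_user
DB_PASSWORD=your_database_password
DB_BUCKET=pdf-chat
DB_SCOPE=_default
DB_COLLECTION=_default
INDEX_NAME=pdf_search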
After starting the Couchbase server, adding the vector index, and installing the dependencies, our application is ready to run.
In the project's root directory, run the following command:
npm run dev
The application will run on your local machine at http://localhost:3000.
On the home page, there will be an option to upload a PDF. Drag and drop the PDF file into the upload box or click to browse. Click Upload and the conversion process starts. This step may take some time depending on the size of the uploaded PDF.
After uploading, a new screen appears that shows the uploaded PDF on the left and the chat interface on the right. Use the chat to ask questions about the PDF and get answers.
The PDF Chat application leverages two powerful concepts: Retrieval-Augmented Generation (RAG) and Vector Search. Together, these techniques enable efficient and context-aware interactions with PDF documents.
RAG is like having two helpers: a retriever that digs through the PDF content and pulls out the passages most relevant to your question, and a generator (the LLM) that uses those passages to compose the answer.
Here's how RAG works: when you ask a question, it is first converted into an embedding, the vector store is searched for the PDF chunks most similar to it, and those chunks are handed to the LLM together with your question. This enriches the LLM's context with content from the PDF, so it can give answers grounded in the document rather than generalized responses.
Couchbase is a NoSQL database that provides a powerful Vector Search capability. It allows you to store and search through high-dimensional vector representations (embeddings) of textual data, such as PDF content.
The PDF Chat app uses LangChain to convert the text from the PDF documents into embeddings. These embeddings are then stored in a Couchbase bucket, along with the corresponding text.
When a user asks a question or provides a prompt, the app converts it into an embedding, uses Couchbase Vector Search to find the stored document chunks most similar to it, and passes the retrieved chunks to the LLM so it can generate a context-aware answer.
LangChain is a powerful library that simplifies the process of building applications with large language models (LLMs) and vector stores like Couchbase.
In the PDF Chat app, LangChain is used for several tasks: loading the PDF and splitting it into chunks, generating embeddings with OpenAI, storing and retrieving those chunks through the Couchbase vector store, and composing the prompts and retrieval chains that produce the final answer.
By combining Vector Search with Couchbase, RAG, and LangChain, the PDF Chat app can efficiently ingest PDF documents, convert their content into searchable embeddings, retrieve relevant information based on user queries and conversation context, and generate context-aware and informative responses using large language models. This approach provides users with a powerful and intuitive way to explore and interact with large PDF files.
To begin this tutorial, clone the repo and open it up in the IDE of your choice. Now you can learn how to create the PDF Chat App.
app
├── api
│   ├── chat
│   │   └── route.ts
│   └── ingestPdf
│       └── route.ts
├── chatPage
│   ├── layout.tsx
│   └── page.tsx
├── layout.tsx
└── page.tsx
components
├── Header.tsx
├── InfoCard.tsx
├── Loader.tsx
├── LoadingDots.tsx
├── PDFUploader.tsx
└── Sidebar.tsx
styles
├── global.css
└── loading-dots.module.css
lib
└── couchbase-connection.tsx
The APIs are written in the /app/api folder. We have two APIs: one for ingesting a newly uploaded PDF into the Couchbase vector store, and one for chat, which takes the user's message and returns the AI assistant's response.
We also have a Couchbase connection function at /lib/couchbase-connection.tsx, which establishes the connection to the Couchbase cluster.
The user interface is written as two pages: the Home Page (/page.tsx) and the Chat Page (/chatPage/page.tsx). The additional components required by the UI are written in /components and provide functionality such as the loader, the sidebar, and the PDF uploader.
The styles are defined in the /styles folder, with global.css as the global CSS file and loading-dots.module.css as the module for showing loading dots in the chat.
The fundamental workflow of the application is as follows: the user initiates the process from the Home Page by uploading a PDF. This action triggers the ingestPdf API route, which uploads the PDF into the Couchbase vector store. Following this, the Chat Page is presented to the user.
On the Chat Page, the user can pose questions. These inquiries are processed by the Chat API, which consults the LLM for responses, aided by the context provided by RAG. The assistant then delivers the answer, and the user has the option to ask additional questions.
The first step will be connecting to Couchbase. Couchbase Vector Search is required for the PDF upload as well as during chat (for retrieval). We will use the Couchbase Node.js SDK to connect to the Couchbase cluster. The connection is established in the couchbase-connection.ts file in the lib folder.
The connection string and credentials are read from the environment variables. We perform some basic checks for environment variables that are not set in the .env file, and then connect to the Couchbase cluster using the connect method.
import { connect, Cluster } from "couchbase";

export async function createCouchbaseCluster(): Promise<Cluster> {
  const connectionString = process.env.DB_CONN_STR;
  const databaseUsername = process.env.DB_USERNAME;
  const databasePassword = process.env.DB_PASSWORD;

  if (!databaseUsername) {
    throw new Error(
      "Please define the DB_USERNAME environment variable inside .env"
    );
  }
  if (!databasePassword) {
    throw new Error(
      "Please define the DB_PASSWORD environment variable inside .env"
    );
  }
  if (!connectionString) {
    throw new Error(
      "Please define the DB_CONN_STR environment variable inside .env"
    );
  }

  const cluster = await connect(connectionString, {
    username: databaseUsername,
    password: databasePassword,
    configProfile: "wanDevelopment",
  });
  return cluster;
}
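As a quick sketch of how this helper can be used elsewhere in the app (the "@/lib" path alias and variable names below are assumptions for illustration, not code from the repo):

import { createCouchbaseCluster } from "@/lib/couchbase-connection";

// Illustrative only: open the bucket, scope, and collection configured in .env
const cluster = await createCouchbaseCluster();
const bucket = cluster.bucket(process.env.DB_BUCKET || "");
const collection = bucket
  .scope(process.env.DB_SCOPE || "_default")
  .collection(process.env.DB_COLLECTION || "_default");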
This API at /app/api/ingestPdf/route.ts allows you to upload a PDF file, split its text into chunks, generate embeddings for those chunks, and ingest the chunks and their embeddings into a Couchbase vector store. Let's go step by step through what it does.
The code extracts the uploaded file from the request's form data and converts it to an ArrayBuffer and then to a Node.js Buffer.
// Get file from form data
const data = await request.formData();
const file: File | null = data.get("file") as unknown as File;
// Read file to a Node.js Buffer
const bytes = await file.arrayBuffer();
const buffer = Buffer.from(bytes);
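The route assumes a file is always attached to the request. As a defensive sketch (not part of the original code), you could return an error response early when it is missing:

// Assumed addition: reject the request if no file was included in the form data
if (!file) {
  return NextResponse.json({ error: "No file was uploaded." }, { status: 400 });
}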
It then creates a PDFLoader instance from the LangChain library and initializes it with the binary data of the uploaded PDF file. We then load the PDF into LangChain Document objects.
import { PDFLoader } from "langchain/document_loaders/fs/pdf";
const loader = new PDFLoader(new Blob([buffer]));
const rawDocs = await loader.load();
This LangChain document array will contain huge individual documents, which defeats the purpose of retrieval, since we want to send only the most relevant context to the LLM. So we split it into smaller chunks or passages using LangChain's RecursiveCharacterTextSplitter. In the code below, chunkSize and chunkOverlap control the splitting, and the splitDocuments method breaks the large documents into smaller LangChain documents based on those parameters.
import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";

const textSplitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const docs = await textSplitter.splitDocuments(rawDocs);
We will now initialize the OpenAI embeddings client, which the CouchbaseVectorStore uses to convert the split documents defined above into vectors (embeddings).
We will also initialize the Couchbase vector store with the Couchbase bucket info. First, we connect to the Couchbase cluster using createCouchbaseCluster.
We will define the bucket, scope, collection, and index names from environment variables. We will also define the search-related parameters: textKey, the key under which the original text is stored in the Couchbase document; embeddingKey, the key under which the embedded text (the vector) is stored; and scopedIndex, which controls whether to use a scoped index or a cluster-level index. We recommend using a scoped index, and we will use it for this tutorial as well.
const embeddings = new OpenAIEmbeddings({
  openAIApiKey: process.env.OPENAI_API_KEY,
});

const bucketName = process.env.DB_BUCKET || "";
const scopeName = process.env.DB_SCOPE || "";
const collectionName = process.env.DB_COLLECTION || "";
const indexName = process.env.INDEX_NAME || "";
const textKey = "text";
const embeddingKey = "embedding";
const scopedIndex = true;

const cluster = await createCouchbaseCluster();

const couchbaseConfig: CouchbaseVectorStoreArgs = {
  cluster,
  bucketName,
  scopeName,
  collectionName,
  indexName,
  textKey,
  embeddingKey,
  scopedIndex,
};
With everything ready to initialize the vector store, we create it using the CouchbaseVectorStore.fromDocuments function in LangChain. This function requires the documents the user wants to upload, the details of the Couchbase vector store, and an embeddings client that converts text into vectors (embeddings).
await CouchbaseVectorStore.fromDocuments(docs, embeddings, couchbaseConfig);
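fromDocuments also returns the vector store instance, so if you want to sanity-check the ingestion you could capture it and run a quick similarity search (a sketch, not part of the tutorial code):

// Illustrative check: query the freshly created store directly
const store = await CouchbaseVectorStore.fromDocuments(docs, embeddings, couchbaseConfig);
const similar = await store.similaritySearch("What is this document about?", 2);
console.log(similar.map((doc) => doc.pageContent.slice(0, 80)));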
It also saves the PDF file to the public/assets/ directory so that the PDF can be displayed later on the chat page.
await writeFile(path.join(process.cwd(), "public/assets/" + file.name), buffer);
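This line assumes the Node.js file-system and path helpers are imported at the top of the route:

// Node.js helpers used by the writeFile call above
import { writeFile } from "fs/promises";
import path from "path";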
After all the above steps, the PDF uploading part is complete, and we can now use this PDF data during our chats. We return a NextResponse to the UI so that it can switch to the Chat Page.
return NextResponse.json({
  text: "Successfully embedded pdf",
  fileName: file.name,
});
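On the client side, the PDFUploader component posts the file to this route as multipart form data. A minimal sketch of such a call (the function name and error handling here are illustrative, not taken from the repo):

// Illustrative client-side upload to the ingest route
async function uploadPdf(file: File) {
  const formData = new FormData();
  formData.append("file", file);

  const res = await fetch("/api/ingestPdf", {
    method: "POST",
    body: formData,
  });
  if (!res.ok) {
    throw new Error("Failed to ingest PDF");
  }
  return res.json(); // { text, fileName }
}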
This API at /app/api/chat/route.ts is used when the user asks the assistant a question. The assistant (LLM) is called here with the RAG context, and its response is sent back to the user.
The user types a message, and the message is sent to the chat API. The current message is separated from the earlier messages in the history.
All the message history is formatted using the formatVercelMessages function. This function takes a VercelChatMessage object and converts it into a HumanMessage or AIMessage object, which are classes from the LangChain library used to represent conversation messages. At the end, we have all the previous messages in LangChain chat format, plus the current question (message) from the user.
import { Message as VercelChatMessage } from "ai";
// LangChain message classes (import path may differ slightly by LangChain version)
import { AIMessage, ChatMessage, HumanMessage } from "@langchain/core/messages";

const formatVercelMessages = (message: VercelChatMessage) => {
  if (message.role === "user") {
    return new HumanMessage(message.content);
  } else if (message.role === "assistant") {
    return new AIMessage(message.content);
  } else {
    console.warn(
      `Unknown message type passed: "${message.role}". Falling back to generic message type.`
    );
    return new ChatMessage({ content: message.content, role: message.role });
  }
};

const body = await request.json();
const messages = body.messages ?? [];
if (!messages.length) {
  throw new Error("No messages provided.");
}
const formattedPreviousMessages = messages
  .slice(0, -1)
  .map(formatVercelMessages);
const currentMessageContent = messages[messages.length - 1].content;
We require both OpenAI's embedding model and its chat (LLM) model.
OpenAI Embeddings are vector representations of text generated by OpenAI's language models. In this API, the OpenAIEmbeddings class from the LangChain library is used to generate embeddings for the documents stored in the Couchbase Vector Store.
The ChatOpenAI class from the LangChain library is used as the language model for generating responses. It is an interface to OpenAI's chat models, which are capable of understanding and generating human-like conversations.
const model = new ChatOpenAI({});
const embeddings = new OpenAIEmbeddings({
  openAIApiKey: process.env.OPENAI_API_KEY,
});
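ChatOpenAI is constructed with default settings here. If you want to pin a specific model or make answers more deterministic, LangChain lets you pass options explicitly; the values below are an assumption for illustration rather than what the repo uses:

// Assumed variation: pin the chat model and lower the temperature
const model = new ChatOpenAI({
  modelName: "gpt-3.5-turbo",
  temperature: 0,
});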
Just like in the PDF ingest API, we initialize the Couchbase vector store here, using the same Couchbase collection and index details. This time the store is initialized with the CouchbaseVectorStore.initialize() method.
We also create a retriever from the Couchbase vector store. This retriever will be used to fetch the previously ingested documents that are similar to the current query. We also define a callback to capture the retrieved documents' details.
const bucketName = process.env.DB_BUCKET || "";
const scopeName = process.env.DB_SCOPE || "";
const collectionName = process.env.DB_COLLECTION || "";
const indexName = process.env.INDEX_NAME || "";
const textKey = "text";
const embeddingKey = "embedding";
const scopedIndex = true;

const cluster = await createCouchbaseCluster();

const couchbaseConfig: CouchbaseVectorStoreArgs = {
  cluster,
  bucketName,
  scopeName,
  collectionName,
  indexName,
  textKey,
  embeddingKey,
  scopedIndex,
};

const couchbaseVectorStore = await CouchbaseVectorStore.initialize(
  embeddings,
  couchbaseConfig
);

// Promise that resolves with the documents returned by the retriever,
// so the sources can be sent back to the client alongside the answer.
// "Document" is LangChain's Document type.
let resolveWithDocuments: (value: Document[]) => void;
const documentPromise = new Promise<Document[]>((resolve) => {
  resolveWithDocuments = resolve;
});

const retriever = couchbaseVectorStore.asRetriever({
  callbacks: [
    {
      handleRetrieverEnd(documents) {
        resolveWithDocuments(documents);
      },
    },
  ],
});
We will now utilize the power of LangChain Chains using the LangChain Expression Language (LCEL). LCEL makes it easy to build complex chains from basic components, and supports out of the box functionality such as streaming, parallelism, and logging.
LCEL is a domain-specific language that provides several key advantages when working with LangChain: chains built with it are composable, readable, reusable, extensible, and type-safe, and it abstracts away much of the orchestration detail.
In summary, LCEL streamlines the process of building sophisticated natural language processing applications by providing a composable, readable, reusable, extensible, type-safe, and abstracted way to define and orchestrate LangChain components into complex workflows.
We will be using LCEL chains in the next few sections and will see how LCEL streamlines our whole workflow.
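As a tiny illustration of the idea, separate from the app's actual chains, LCEL lets you pipe a prompt into the ChatOpenAI model defined above and then into an output parser with the pipe method (the prompt and variable names below are made up for this example; import paths may differ slightly by LangChain version):

import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

// Minimal LCEL chain: prompt -> chat model -> plain string output
const greetPrompt = ChatPromptTemplate.fromTemplate("Say hello to {name}.");
const greetChain = greetPrompt.pipe(model).pipe(new StringOutputParser());
const greeting = await greetChain.invoke({ name: "Couchbase" });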
The historyAwarePrompt is used to generate a query for the vector store based on the conversation history and the current user message.
The historyAwarePrompt is a ChatPromptTemplate from the LangChain library. It is defined to include the conversation history, the current user message, and a prompt asking for a concise vector store search query.
const historyAwarePrompt = ChatPromptTemplate.fromMessages([
  new MessagesPlaceholder("chat_history"),
  ["user", "{input}"],
  [
    "user",
    "Given the above conversation, generate a concise vector store search query to look up in order to get information relevant to the conversation.",
  ],
]);
The historyAwareRetrieverChain is created using the createHistoryAwareRetriever function from the LangChain library. It takes the historyAwarePrompt, the language model (ChatOpenAI instance), and the vector store retriever as input.
The historyAwareRetrieverChain is responsible for generating a query based on the conversation history and retrieving relevant documents from the vector store.
const historyAwareRetrieverChain = await createHistoryAwareRetriever({
  llm: model,
  retriever,
  rephrasePrompt: historyAwarePrompt,
});
The documentChain is created using the createStuffDocumentsChain function from the LangChain library. It takes the language model (ChatOpenAI instance) and a prompt (answerPrompt) as input.
The answerPrompt is a ChatPromptTemplate that includes instructions for the language model to generate an answer based on the provided context (retrieved documents) and the user's question.
const ANSWER_SYSTEM_TEMPLATE = `You are a helpful AI assistant. Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say you don't know. DO NOT try to make up an answer.
If the question is not related to the context, politely respond that you are tuned to only answer questions that are related to the context.
<context>
{context}
</context>
Please return your answer in markdown with clear headings and lists.`;
const answerPrompt = ChatPromptTemplate.fromMessages([
  ["system", ANSWER_SYSTEM_TEMPLATE],
  new MessagesPlaceholder("chat_history"),
  ["user", "{input}"],
]);

const documentChain = await createStuffDocumentsChain({
  llm: model,
  prompt: answerPrompt,
});
The conversationalRetrievalChain combines the historyAwareRetrieverChain and the documentChain. The outputChain then picks the answer field from the chain's output and parses it into a plain-text HTTP response that can be streamed back to the client.
const conversationalRetrievalChain = await createRetrievalChain({
  retriever: historyAwareRetrieverChain,
  combineDocsChain: documentChain,
});

const outputChain = RunnableSequence.from([
  conversationalRetrievalChain,
  new RunnablePick({ keys: "answer" }),
  new HttpResponseOutputParser({ contentType: "text/plain" }),
]);
The response is streamed back to the client, along with metadata about the message index and the retrieved document sources.
const stream = await outputChain.stream({
  chat_history: formattedPreviousMessages,
  input: currentMessageContent,
});

const documents = await documentPromise;
const serializedSources = Buffer.from(
  JSON.stringify(
    documents.map((doc) => {
      return {
        pageContent: doc.pageContent.slice(0, 50) + "...",
        metadata: doc.metadata,
      };
    })
  )
).toString("base64");

return new StreamingTextResponse(stream, {
  headers: {
    "x-message-index": (formattedPreviousMessages.length + 1).toString(),
    "x-sources": serializedSources,
  },
});
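On the client, the Vercel AI SDK's useChat hook can consume this streamed response, and the base64-encoded sources can be decoded from the x-sources header. A rough sketch of such a component (the actual Chat Page in the repo may differ):

"use client";
import { useChat } from "ai/react";

export function ChatSketch() {
  // Stream the assistant's answer and decode the retrieved sources header
  const { messages, input, handleInputChange, handleSubmit } = useChat({
    api: "/api/chat",
    onResponse(response) {
      const header = response.headers.get("x-sources");
      const sources = header ? JSON.parse(atob(header)) : [];
      console.log("Retrieved sources:", sources);
    },
  });

  return (
    <form onSubmit={handleSubmit}>
      <input value={input} onChange={handleInputChange} placeholder="Ask about the PDF" />
      {messages.map((m) => (
        <p key={m.id}>
          {m.role}: {m.content}
        </p>
      ))}
    </form>
  );
}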