In this guide, we will walk you through building a powerful semantic search engine using Couchbase as the backend database, OpenAI as the embedding and LLM provider, and Hugging Face smolagents as an agent framework. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch.
This tutorial is available as a Jupyter Notebook (.ipynb file) that you can run interactively.
You can either download the notebook file and run it on Google Colab or run it on your system by setting up the Python environment.
Please follow the instructions to generate your OpenAI credentials.
To get started with Couchbase Capella, create an account and use it to deploy a forever free tier operational cluster. This account provides you with an environment where you can explore and learn about Capella with no time constraint.
To learn more, please follow the instructions.
When running Couchbase using Capella, make sure the cluster prerequisites are met: you need database access credentials for the cluster and network access from the machine running this tutorial.
To build our semantic search engine, we need a robust set of tools. The libraries we install handle everything from connecting to databases to performing complex machine learning tasks. Each library has a specific role: Couchbase libraries manage database operations, LangChain handles AI model integrations, and OpenAI provides advanced AI models for generating embeddings and understanding natural language. By setting up these libraries, we ensure our environment is equipped to handle the data-intensive and computationally complex tasks required for semantic search.
%pip install --quiet -U datasets langchain-couchbase langchain-openai python-dotenv smolagents ipywidgets
The script starts by importing a series of libraries required for various tasks, including handling JSON, logging, time tracking, Couchbase connections, embedding generation, and dataset loading. These libraries provide essential functions for working with data, managing database connections, and processing machine learning models.
import getpass
import json
import logging
import os
import time
from datetime import timedelta
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.exceptions import (InternalServerFailureException,
                                  ServiceUnavailableException,
                                  QueryIndexAlreadyExistsException)
from couchbase.management.buckets import CreateBucketSettings
from couchbase.management.search import SearchIndex
from couchbase.options import ClusterOptions
from datasets import load_dataset
from dotenv import load_dotenv
from langchain_couchbase.vectorstores import CouchbaseVectorStore
from langchain_openai import OpenAIEmbeddings
from smolagents import Tool, OpenAIServerModel, ToolCallingAgent
Logging is configured to track the progress of the script and capture any errors or warnings. This is crucial for debugging and understanding the flow of execution. The logging output includes timestamps, log levels (e.g., INFO, ERROR), and messages that describe what is happening in the script.
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s', force=True)
In this section, we prompt the user to input the essential configuration settings. These settings include sensitive information like API keys, database credentials, and specific configuration names. Instead of hardcoding these details into the script, we request them at runtime, ensuring flexibility and security.
The script also validates that all required inputs are provided, raising an error if any crucial information is missing. This keeps the integration correctly configured without hardcoding sensitive information, improving both the security and the maintainability of your code.
load_dotenv()
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY') or getpass.getpass('Enter your OpenAI API Key: ')
CB_HOST = os.getenv('CB_HOST') or input('Enter your Couchbase host (default: couchbase://localhost): ') or 'couchbase://localhost'
CB_USERNAME = os.getenv('CB_USERNAME') or input('Enter your Couchbase username (default: Administrator): ') or 'Administrator'
CB_PASSWORD = os.getenv('CB_PASSWORD') or getpass.getpass('Enter your Couchbase password (default: password): ') or 'password'
CB_BUCKET_NAME = os.getenv('CB_BUCKET_NAME') or input('Enter your Couchbase bucket name (default: vector-search-testing): ') or 'vector-search-testing'
INDEX_NAME = os.getenv('INDEX_NAME') or input('Enter your index name (default: vector_search_smolagents): ') or 'vector_search_smolagents'
SCOPE_NAME = os.getenv('SCOPE_NAME') or input('Enter your scope name (default: shared): ') or 'shared'
COLLECTION_NAME = os.getenv('COLLECTION_NAME') or input('Enter your collection name (default: smolagents): ') or 'smolagents'
# Check if the variables are correctly loaded
if not OPENAI_API_KEY:
    raise ValueError("Missing OpenAI API Key")

# Make the key available to any library that reads it from the environment
if 'OPENAI_API_KEY' not in os.environ:
    os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
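If you prefer not to enter these values interactively each time, you can put them in a .env file next to the notebook; load_dotenv() will pick them up automatically. A hypothetical example with placeholder values:

# Example .env file (placeholder values; adapt to your environment)
OPENAI_API_KEY=sk-your-key-here
CB_HOST=couchbase://localhost
CB_USERNAME=Administrator
CB_PASSWORD=password
CB_BUCKET_NAME=vector-search-testing
INDEX_NAME=vector_search_smolagents
SCOPE_NAME=shared
COLLECTION_NAME=smolagents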
Connecting to a Couchbase cluster is the foundation of our project. Couchbase will serve as our primary data store, handling all the storage and retrieval operations required for our semantic search engine. By establishing this connection, we enable our application to interact with the database, allowing us to perform operations such as storing embeddings, querying data, and managing collections. This connection is the gateway through which all data will flow, so ensuring it's set up correctly is paramount.
try:
    auth = PasswordAuthenticator(CB_USERNAME, CB_PASSWORD)
    options = ClusterOptions(auth)
    cluster = Cluster(CB_HOST, options)
    cluster.wait_until_ready(timedelta(seconds=5))
    logging.info("Successfully connected to Couchbase")
except Exception as e:
    raise ConnectionError(f"Failed to connect to Couchbase: {str(e)}")
2025-02-28 10:30:17,515 - INFO - Successfully connected to Couchbase
The setup_collection() function handles creating and configuring the hierarchical data organization in Couchbase: it checks whether the bucket, scope, and collection exist and creates any that are missing. It also performs two additional tasks: it ensures a primary index exists on the collection so that queries can run, and it clears any existing documents so that each run starts from a clean state. The function is then called below to set up the collection that will hold our document embeddings.
def setup_collection(cluster, bucket_name, scope_name, collection_name):
    try:
        # Check if bucket exists, create if it doesn't
        try:
            bucket = cluster.bucket(bucket_name)
            logging.info(f"Bucket '{bucket_name}' exists.")
        except Exception as e:
            logging.info(f"Bucket '{bucket_name}' does not exist. Creating it...")
            bucket_settings = CreateBucketSettings(
                name=bucket_name,
                bucket_type='couchbase',
                ram_quota_mb=1024,
                flush_enabled=True,
                num_replicas=0
            )
            cluster.buckets().create_bucket(bucket_settings)
            bucket = cluster.bucket(bucket_name)
            logging.info(f"Bucket '{bucket_name}' created successfully.")

        bucket_manager = bucket.collections()

        # Check if scope exists, create if it doesn't
        scopes = bucket_manager.get_all_scopes()
        scope_exists = any(scope.name == scope_name for scope in scopes)
        if not scope_exists and scope_name != "_default":
            logging.info(f"Scope '{scope_name}' does not exist. Creating it...")
            bucket_manager.create_scope(scope_name)
            logging.info(f"Scope '{scope_name}' created successfully.")

        # Check if collection exists, create if it doesn't
        collections = bucket_manager.get_all_scopes()
        collection_exists = any(
            scope.name == scope_name and collection_name in [col.name for col in scope.collections]
            for scope in collections
        )
        if not collection_exists:
            logging.info(f"Collection '{collection_name}' does not exist. Creating it...")
            bucket_manager.create_collection(scope_name, collection_name)
            logging.info(f"Collection '{collection_name}' created successfully.")
        else:
            logging.info(f"Collection '{collection_name}' already exists. Skipping creation.")

        # Wait for collection to be ready
        collection = bucket.scope(scope_name).collection(collection_name)
        time.sleep(2)  # Give the collection time to be ready for queries

        # Ensure primary index exists
        try:
            cluster.query(f"CREATE PRIMARY INDEX IF NOT EXISTS ON `{bucket_name}`.`{scope_name}`.`{collection_name}`").execute()
            logging.info("Primary index present or created successfully.")
        except Exception as e:
            logging.warning(f"Error creating primary index: {str(e)}")

        # Clear all documents in the collection
        try:
            query = f"DELETE FROM `{bucket_name}`.`{scope_name}`.`{collection_name}`"
            cluster.query(query).execute()
            logging.info("All documents cleared from the collection.")
        except Exception as e:
            logging.warning(f"Error while clearing documents: {str(e)}. The collection might be empty.")

        return collection
    except Exception as e:
        raise RuntimeError(f"Error setting up collection: {str(e)}")
setup_collection(cluster, CB_BUCKET_NAME, SCOPE_NAME, COLLECTION_NAME)
2025-02-28 10:30:20,855 - INFO - Bucket 'vector-search-testing' exists.
2025-02-28 10:30:21,350 - INFO - Collection 'smolagents' does not exist. Creating it...
2025-02-28 10:30:21,619 - INFO - Collection 'smolagents' created successfully.
2025-02-28 10:30:26,886 - INFO - Primary index present or created successfully.
2025-02-28 10:30:26,938 - INFO - All documents cleared from the collection.
<couchbase.collection.Collection at 0x10714e610>
Semantic search requires an efficient way to retrieve relevant documents based on a user's query. This is where the Couchbase Vector Search Index comes into play. In this step, we load the Vector Search Index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed, the dimensions of the vectors, and other parameters that determine how the search engine processes queries based on vector similarity.
This vector search index configuration requires specific default settings to function properly. This tutorial uses the bucket named vector-search-testing, with the scope shared and collection smolagents. The configuration is set up for vectors with exactly 1536 dimensions, using dot product similarity and optimized for recall. If you want to use a different bucket, scope, or collection, you will need to modify the index configuration accordingly.
For more information on creating a vector search index, please follow the instructions.
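If you do not already have an index definition file, the sketch below shows the general shape of a vector search index definition matching this tutorial's settings. Treat it as an illustrative assumption rather than a canonical definition: the embedding and text field names follow the defaults used by the LangChain Couchbase integration, and you should adapt the definition (or export one from the Couchbase UI) for your own cluster. You could save it as smolagents_index.json, or assign it directly to index_definition as a Python dict:

# Illustrative index definition (assumed field names; adapt to your cluster)
index_definition = {
    "name": "vector_search_smolagents",
    "type": "fulltext-index",
    "sourceType": "gocbcore",
    "sourceName": "vector-search-testing",  # the bucket to index
    "params": {
        "doc_config": {"mode": "scope.collection.type_field"},
        "mapping": {
            "default_mapping": {"enabled": False},
            "types": {
                "shared.smolagents": {  # scope.collection
                    "enabled": True,
                    "properties": {
                        "embedding": {  # the vector field written by the vector store
                            "fields": [{
                                "name": "embedding",
                                "type": "vector",
                                "dims": 1536,
                                "similarity": "dot_product",
                                "index": True,
                            }]
                        },
                        "text": {  # the raw article text
                            "fields": [{
                                "name": "text",
                                "type": "text",
                                "index": True,
                                "store": True,
                            }]
                        },
                    },
                }
            },
        },
    },
}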
# If you are running this script locally (not in Google Colab), uncomment the following line
# and provide the path to your index definition file.
# index_definition_path = '/path_to_your_index_file/smolagents_index.json' # Local setup: specify your file path here
# # Version for Google Colab
# def load_index_definition_colab():
#     from google.colab import files
#     print("Upload your index definition file")
#     uploaded = files.upload()
#     index_definition_path = list(uploaded.keys())[0]
#     try:
#         with open(index_definition_path, 'r') as file:
#             index_definition = json.load(file)
#         return index_definition
#     except Exception as e:
#         raise ValueError(f"Error loading index definition from {index_definition_path}: {str(e)}")

# Version for Local Environment
def load_index_definition_local(index_definition_path):
    try:
        with open(index_definition_path, 'r') as file:
            index_definition = json.load(file)
        return index_definition
    except Exception as e:
        raise ValueError(f"Error loading index definition from {index_definition_path}: {str(e)}")
# Usage
# Uncomment the appropriate line based on your environment
# index_definition = load_index_definition_colab()
index_definition = load_index_definition_local('smolagents_index.json')
With the index definition loaded, the next step is to create or update the Vector Search Index in Couchbase. This step is crucial because it optimizes our database for vector similarity search operations, allowing us to perform searches based on the semantic content of documents rather than just keywords. By creating or updating a Vector Search Index, we enable our search engine to handle complex queries that involve finding semantically similar documents using vector embeddings, which is essential for a robust semantic search engine.
try:
    scope_index_manager = cluster.bucket(CB_BUCKET_NAME).scope(SCOPE_NAME).search_indexes()

    # Check if index already exists
    existing_indexes = scope_index_manager.get_all_indexes()
    index_name = index_definition["name"]

    if index_name in [index.name for index in existing_indexes]:
        logging.info(f"Index '{index_name}' found")
    else:
        logging.info(f"Creating new index '{index_name}'...")

    # Create SearchIndex object from JSON definition
    search_index = SearchIndex.from_json(index_definition)

    # Upsert the index (create if not exists, update if exists)
    scope_index_manager.upsert_index(search_index)
    logging.info(f"Index '{index_name}' successfully created/updated.")
except QueryIndexAlreadyExistsException:
    logging.info(f"Index '{index_name}' already exists. Skipping creation/update.")
except ServiceUnavailableException:
    raise RuntimeError("Search service is not available. Please ensure the Search service is enabled in your Couchbase cluster.")
except InternalServerFailureException as e:
    logging.error(f"Internal server error: {str(e)}")
    raise
2025-02-28 10:30:32,890 - INFO - Creating new index 'vector-search-testing.shared.vector_search_smolagents'...
2025-02-28 10:30:33,058 - INFO - Index 'vector-search-testing.shared.vector_search_smolagents' successfully created/updated.
Embeddings are at the heart of semantic search. They are numerical representations of text that capture the semantic meaning of the words and phrases. Unlike traditional keyword-based search, which looks for exact matches, embeddings allow our search engine to understand the context and nuances of language, enabling it to retrieve documents that are semantically similar to the query, even if they don't contain the exact keywords. By creating embeddings using OpenAI, we equip our search engine with the ability to understand and process natural language in a way that's much closer to how humans understand language. This step transforms our raw text data into a format that the search engine can use to find and rank relevant documents.
try:
    embeddings = OpenAIEmbeddings(
        model="text-embedding-3-small",
        api_key=OPENAI_API_KEY,
    )
    logging.info("Successfully created OpenAIEmbeddings")
except Exception as e:
    raise ValueError(f"Error creating OpenAIEmbeddings: {str(e)}")
2025-02-28 10:30:36,983 - INFO - Successfully created OpenAIEmbeddings
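As an optional sanity check, you can embed a short string and confirm that the vector has the 1536 dimensions the index definition expects:

# Optional: confirm the embedding dimensionality matches the index (1536)
sample_vector = embeddings.embed_query("Hello, world!")
print(len(sample_vector))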
A vector store is where we'll keep our embeddings. Unlike the FTS index, which is used for text-based search, the vector store is specifically designed to handle embeddings and perform similarity searches. When a user inputs a query, the search engine converts the query into an embedding and compares it against the embeddings stored in the vector store. This allows the engine to find documents that are semantically similar to the query, even if they don't contain the exact same words. By setting up the vector store in Couchbase, we create a powerful tool that enables our search engine to understand and retrieve information based on the meaning and context of the query, rather than just the specific words used.
try:
    vector_store = CouchbaseVectorStore(
        cluster=cluster,
        bucket_name=CB_BUCKET_NAME,
        scope_name=SCOPE_NAME,
        collection_name=COLLECTION_NAME,
        embedding=embeddings,
        index_name=INDEX_NAME,
    )
    logging.info("Successfully created vector store")
except Exception as e:
    raise ValueError(f"Failed to create vector store: {str(e)}")
2025-02-28 10:30:40,503 - INFO - Successfully created vector store
To build a search engine, we need data to search through. We use the BBC News dataset from RealTimeData, which provides real-world news articles. This dataset contains news articles from BBC covering various topics and time periods. Loading the dataset is a crucial step because it provides the raw material that our search engine will work with. The quality and diversity of the news articles make it an excellent choice for testing and refining our search engine, ensuring it can handle real-world news content effectively.
The BBC News dataset allows us to work with authentic news articles, enabling us to build and test a search engine that can effectively process and retrieve relevant news content. The dataset is loaded using the Hugging Face datasets library, specifically accessing the "RealTimeData/bbc_news_alltime" dataset with the "2024-12" version.
try:
    news_dataset = load_dataset(
        "RealTimeData/bbc_news_alltime", "2024-12", split="train"
    )
    print(f"Loaded the BBC News dataset with {len(news_dataset)} rows")
    logging.info(f"Successfully loaded the BBC News dataset with {len(news_dataset)} rows.")
except Exception as e:
    raise ValueError(f"Error loading the BBC News dataset: {str(e)}")
2025-02-28 10:30:51,981 - INFO - Successfully loaded the BBC News dataset with 2687 rows.
Loaded the BBC News dataset with 2687 rows
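Each row of the dataset is a dictionary of article fields. To peek at the available fields and a sample article (an optional exploration step):

# Inspect one record: available fields and the start of its content
print(news_dataset[0].keys())
print(news_dataset[0]["content"][:300])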
We will use the content of the news articles for our RAG system.
The dataset contains a few duplicate records. We are removing them to avoid duplicate results in the retrieval stage of our RAG system.
news_articles = news_dataset["content"]
unique_articles = set()
for article in news_articles:
    if article:
        unique_articles.add(article)
unique_news_articles = list(unique_articles)

print(f"We have {len(unique_news_articles)} unique articles in our database.")
We have 1749 unique articles in our database.
To efficiently handle the large number of articles, we process them in batches of 100 articles at a time. This batch processing approach helps manage memory usage and provides better control over the ingestion process.
We first filter out any articles that exceed 50,000 characters to avoid potential issues with token limits. Then, using the vector store's add_texts method, we add the filtered articles to our vector database. The batch_size parameter controls how many articles are processed in each iteration.
We use a conservative batch size of 100 to ensure reliable operation. The optimal batch size depends on many factors, including document size, embedding throughput, and available memory, so consider measuring performance with your specific workload before adjusting.
# Save the current logging level
current_logging_level = logging.getLogger().getEffectiveLevel()

# Set logging level to CRITICAL to suppress lower level logs
logging.getLogger().setLevel(logging.CRITICAL)

articles = [article for article in unique_news_articles if article and len(article) <= 50000]

try:
    vector_store.add_texts(
        texts=articles,
        batch_size=100
    )
except Exception as e:
    raise ValueError(f"Failed to save documents to vector store: {str(e)}")

# Restore the original logging level
logging.getLogger().setLevel(current_logging_level)
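With the articles ingested, you can sanity-check retrieval directly before wrapping it in an agent. A minimal example (the query string here is arbitrary):

# Retrieve the two most similar articles to an example query
results = vector_store.similarity_search("Manchester City's recent form", k=2)
for doc in results:
    print(doc.page_content[:200], "\n---")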
smolagents is an agentic framework by Hugging Face for easily creating agents in a few lines of code.
Some of the features of smolagents are:
✨ Simplicity: the logic for agents fits in ~1,000 lines of code (see agents.py). We kept abstractions to their minimal shape above raw code!
🧑‍💻 First-class support for Code Agents. Our CodeAgent writes its actions in code (as opposed to "agents being used to write code"). To make it secure, we support executing in sandboxed environments via E2B.
🤗 Hub integrations: you can share/pull tools to/from the Hub, and more is to come!
🌐 Model-agnostic: smolagents supports any LLM. It can be a local transformers or ollama model, one of many providers on the Hub, or any model from OpenAI, Anthropic and many others via our LiteLLM integration.
👁️ Modality-agnostic: Agents support text, vision, video, even audio inputs! Cf this tutorial for vision.
🛠️ Tool-agnostic: you can use tools from LangChain, Anthropic's MCP, you can even use a Hub Space as a tool.
smolagents allows users to define their own tools for the agent to use. These tools can be of two types:
1. A class that subclasses the Tool class and overrides the forward method, which is called when the tool is used.
2. A function wrapped with the @tool decorator.
In our case, we will use the first method, and we define our RetrieverTool below. We define a name, a description and a dictionary of inputs that the tool accepts. This helps the LLM properly identify and use the tool.
The RetrieverTool is simple: it takes a query generated by the user and uses Couchbase's performant vector search service under the hood to search for documents semantically similar to the query. The LLM can then use this context to answer the user's question.
class RetrieverTool(Tool):
    name = "retriever"
    description = "Uses semantic search to retrieve the BBC news articles that could be most relevant to answer your query."
    inputs = {
        "query": {
            "type": "string",
            "description": "The query to perform. This should be semantically close to your target documents. Use the affirmative form rather than a question.",
        }
    }
    output_type = "string"

    def __init__(self, vector_store: CouchbaseVectorStore, **kwargs):
        super().__init__(**kwargs)
        self.vector_store = vector_store

    def forward(self, query: str) -> str:
        assert isinstance(query, str), "Query must be a string"
        # Retrieve the 5 most semantically similar documents from Couchbase
        docs = self.vector_store.similarity_search_with_score(query, k=5)
        return "\n\n".join(
            f"# Documents:\n{doc.page_content}"
            for doc, score in docs
        )
retriever_tool = RetrieverTool(vector_store)
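For comparison, here is a sketch of the same retriever written with the @tool decorator (the second method mentioned above) instead of subclassing Tool. It is illustrative only; the tutorial continues with the class-based retriever_tool:

from smolagents import tool

@tool
def retrieve_documents(query: str) -> str:
    """Uses semantic search to retrieve BBC news articles relevant to the query.

    Args:
        query: The query to perform. This should be semantically close to your target documents.
    """
    # Same retrieval logic as RetrieverTool.forward above
    docs = vector_store.similarity_search_with_score(query, k=5)
    return "\n\n".join(f"# Documents:\n{doc.page_content}" for doc, score in docs)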
smolagents has predefined agent configurations that we can use. We use the ToolCallingAgent, which writes its tool calls in JSON format. Alternatively, there is also a CodeAgent, in which the LLM writes its actions as code.
The CodeAgent offers benefits in certain challenging scenarios: it can lead to higher performance on difficult benchmarks and use 30% fewer steps to solve problems. However, since our use case is just a simple RAG tool, a ToolCallingAgent will suffice.
agent = ToolCallingAgent(
    tools=[retriever_tool],
    model=OpenAIServerModel(
        model_id="gpt-4o-2024-08-06",
        api_key=OPENAI_API_KEY,
    ),
    max_steps=4,
    verbosity_level=2
)
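If you would like to experiment with the CodeAgent mentioned above, the swap is small. A sketch under the same settings (not run in this tutorial):

from smolagents import CodeAgent

# Same tools and model as above; the agent writes its actions as Python code
code_agent = CodeAgent(
    tools=[retriever_tool],
    model=OpenAIServerModel(
        model_id="gpt-4o-2024-08-06",
        api_key=OPENAI_API_KEY,
    ),
    max_steps=4,
)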
We have now finished setting up our vector store and agent! The system is now ready to accept queries.
query = "What was manchester city manager pep guardiola's reaction to the team's current form?"
agent_output = agent.run(query)
╭──────────────────────────────────────────────── New run ────────────────────────────────────────────────
│ What was manchester city manager pep guardiola's reaction to the team's current form?
╰─ OpenAIServerModel - gpt-4o-2024-08-06 ──────────────────────────────────────────────────────────────────
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
2025-02-28 10:32:28,032 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────
│ Calling tool: 'retriever' with arguments: {'query': "Pep Guardiola's reaction to Manchester City's
│ current form"}
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────
2025-02-28 10:32:28,466 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
Observations: # Documents: Manchester City boss Pep Guardiola has won 18 trophies since he arrived at the club in 2016 Manchester City boss Pep Guardiola says he is "fine" despite admitting his sleep and diet are being affected by the worst run of results in his entire managerial career. In an interview with former Italy international Luca Toni for Amazon Prime Sport before Wednesday's Champions League defeat by Juventus, Guardiola touched on the personal impact City's sudden downturn in form has had. Guardiola said his state of mind was "ugly", that his sleep was "worse" and he was eating lighter as his digestion had suffered. City go into Sunday's derby against Manchester United at Etihad Stadium having won just one of their past 10 games. The Juventus loss means there is a chance they may not even secure a play-off spot in the Champions League. Asked to elaborate on his comments to Toni, Guardiola said: "I'm fine. "In our jobs we always want to do our best or the best as possible. When that doesn't happen you are more uncomfortable than when the situation is going well, always that happened. "In good moments I am happier but when I get to the next game I am still concerned about what I have to do. There is no human being that makes an activity and it doesn't matter how they do." Guardiola said City have to defend better and "avoid making mistakes at both ends". To emphasise his point, Guardiola referred back to the third game of City's current run, against a Sporting side managed by Ruben Amorim, who will be in the United dugout at the weekend. City dominated the first half in Lisbon, led thanks to Phil Foden's early effort and looked to be cruising. Instead, they conceded three times in 11 minutes either side of half-time as Sporting eventually ran out 4-1 winners. ... ... Afterwards Guardiola was calm, so much so it was difficult to hear him in the news conference, a contrast to the frustrated figure he cut on the touchline. He said: "It depends on us. The solution is bring the players back. We have just one central defender fit, that is difficult. We are going to try next game - another opportunity and we don't think much further than that. "Of course there are more reasons. We concede the goals we don't concede in the past, we [don't] score the goals we score in the past. Football is not just one reason. There are a lot of little factors. "Last season we won the Premier League, but we came here and lost. We have to think positive and I have incredible trust in the guys. Some of them have incredible pride and desire to do it. We have to find a way, step by step, sooner or later to find a way back." Villa boss Unai Emery highlighted City's frailties, saying he felt Villa could seize on the visitors' lack of belief. "Manchester City are a little bit under the confidence they have normally," he said. "The second half was different, we dominated and we scored. Through those circumstances they were feeling worse than even in the first half." Erling Haaland had one touch in the Villa box There are chinks in the armour never seen before at City under Guardiola and Erling Haaland conceded belief within the squad is low. He told TNT after the game: "Of course, [confidence levels are] not the best. We know how important confidence is and you can see that it affects every human being. That is how it is, we have to continue and stay positive even though it is difficult."
Haaland, with 76 goals in 83 Premier League appearances since joining City from Borussia Dortmund in 2022, had one shot and one touch in the Villa box. His 18 touches in the whole game were the lowest of all starting players and he has been self critical, despite scoring 13 goals in the top flight this season. Over City's last eight games he has netted just twice though, but Guardiola refused to criticise his star striker. He said: "Without him we will be even worse but I like the players feeling that way. I don't agree with Erling. He needs to have the balls delivered in the right spots but he will fight for the next one."
[Step 0: Duration 2.25 seconds| Input tokens: 1,010 | Output tokens: 23]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
2025-02-28 10:32:31,724 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
╭──────────────────────────────────────────────────────────────────────────────────────────────────────────
│ Calling tool: 'final_answer' with arguments: {'answer': 'Manchester City manager Pep Guardiola has
│ expressed a mix of concern and determination regarding the team\'s current form. Guardiola admitted that
│ this is the worst run of results in his managerial career and that it has affected his sleep and diet. He
│ described his state of mind as "ugly" and acknowledged that City needs to defend better and avoid making
│ mistakes. Despite his personal challenges, Guardiola stated that he is "fine" and focused on finding
│ solutions.\n\nGuardiola also took responsibility for the team\'s struggles, stating he is "not good
│ enough" and has to find solutions. He expressed self-doubt but is striving to improve the team\'s
│ situation step by step. Guardiola has faced criticism due to the team\'s poor form, which has seen them
│ lose several matches and fall behind in the title race.\n\nHe emphasized the need to restore their
│ defensive strength and regain confidence in their play. Guardiola is planning a significant rebuild of
│ the squad to address these challenges, aiming to replace several regular starters and emphasize
│ improvements in the team\'s intensity and defensive concepts.'}
╰──────────────────────────────────────────────────────────────────────────────────────────────────────────
Final answer: Manchester City manager Pep Guardiola has expressed a mix of concern and determination regarding the team's current form. Guardiola admitted that this is the worst run of results in his managerial career and that it has affected his sleep and diet. He described his state of mind as "ugly" and acknowledged that City needs to defend better and avoid making mistakes. Despite his personal challenges, Guardiola stated that he is "fine" and focused on finding solutions. Guardiola also took responsibility for the team's struggles, stating he is "not good enough" and has to find solutions. He expressed self-doubt but is striving to improve the team's situation step by step. Guardiola has faced criticism due to the team's poor form, which has seen them lose several matches and fall behind in the title race. He emphasized the need to restore their defensive strength and regain confidence in their play. Guardiola is planning a significant rebuild of the squad to address these challenges, aiming to replace several regular starters and emphasize improvements in the team's intensity and defensive concepts.
[Step 1: Duration 2.74 seconds| Input tokens: 7,162 | Output tokens: 241]
When the agent runs, smolagents prints out the steps that the agent takes along with the tools called in each step. In the run above, two steps occur:
Step 1: First, the agent determines that a tool needs to be used, and the retriever tool is called. The agent also specifies the query parameter for the tool (a string). The tool returns documents from Couchbase's vector store that are semantically similar to the query.
Step 2: Next, the agent determines that the context retrieved from the tool is sufficient to answer the question. It then calls the final_answer tool, which is predefined for every agent: this tool is invoked when the agent returns the final answer to the user. In this step, the LLM answers the user's query from the context retrieved in Step 1 and passes the result to the final_answer tool, at which point the agent's execution ends.
By following these steps, you'll have a fully functional agentic RAG system that leverages the strengths of Couchbase and smolagents, along with OpenAI. This guide is designed not just to show you how to build the system, but also to explain why each step is necessary, giving you a deeper understanding of the principles behind semantic search and how to implement it effectively. Whether you're a newcomer to software development or an experienced developer looking to expand your skills, this guide will provide you with the knowledge and tools you need to create a powerful, RAG-driven chat system.