In this guide, we will walk you through building a powerful semantic search engine using Couchbase as the backend database and CrewAI for agent-based RAG operations. CrewAI allows us to create specialized agents that can work together to handle different aspects of the RAG workflow, from document retrieval to response generation. This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch.
This tutorial is available as a Jupyter Notebook (.ipynb file) that you can run interactively. You can access the original notebook here.
You can either:
Create and Deploy Your Free Tier Operational cluster on Capella
Couchbase Capella Configuration When running Couchbase using Capella, the following prerequisites need to be met:
We'll install the following key libraries:
- datasets: For loading and managing our training data
- langchain-couchbase: To integrate Couchbase with LangChain for vector storage and caching
- langchain-openai: For accessing OpenAI's embedding and chat models
- crewai: To create and orchestrate our AI agents for RAG operations
- python-dotenv: For securely managing environment variables and API keys
These libraries provide the foundation for building a semantic search engine with vector embeddings, database integration, and agent-based RAG capabilities.
%pip install --quiet datasets==3.5.0 langchain-couchbase==0.3.0 langchain-openai==0.3.13 crewai==0.114.0 python-dotenv==1.1.0
Note: you may need to restart the kernel to use updated packages.
The script starts by importing a series of libraries required for various tasks, including handling JSON, logging, time tracking, Couchbase connections, embedding generation, and dataset loading.
import getpass
import json
import logging
import os
import time
from datetime import timedelta
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.diagnostics import PingState, ServiceType
from couchbase.exceptions import (InternalServerFailureException,
                                  QueryIndexAlreadyExistsException,
                                  ServiceUnavailableException)
from couchbase.management.buckets import CreateBucketSettings
from couchbase.management.search import SearchIndex
from couchbase.options import ClusterOptions
from datasets import load_dataset
from dotenv import load_dotenv
from crewai.tools import tool
from langchain_couchbase.vectorstores import CouchbaseSearchVectorStore
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from crewai import Agent, Crew, Process, Task
Logging is configured to track the progress of the script and capture any errors or warnings.
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s [%(levelname)s] %(message)s',
datefmt='%Y-%m-%d %H:%M:%S'
)
# Suppress httpx logging
logging.getLogger('httpx').setLevel(logging.CRITICAL)
In this section, we load the configuration settings the script needs, including sensitive values such as database credentials and the OpenAI API key. Each setting is read from an environment variable first (optionally loaded from a .env file). If the variable is not set, the script prompts for the value at runtime and falls back to a default where one exists.
Keeping these values out of the source code avoids hardcoded credentials and makes the script easier to reuse across environments.
# Load environment variables
load_dotenv()
# Configuration
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY') or input("Enter your OpenAI API key: ")
if not OPENAI_API_KEY:
    raise ValueError("OPENAI_API_KEY is not set")
CB_HOST = os.getenv('CB_HOST') or input("Enter Couchbase host (default: couchbase://localhost): ") or 'couchbase://localhost'
CB_USERNAME = os.getenv('CB_USERNAME') or input("Enter Couchbase username (default: Administrator): ") or 'Administrator'
CB_PASSWORD = os.getenv('CB_PASSWORD') or getpass.getpass("Enter Couchbase password (default: password): ") or 'password'
CB_BUCKET_NAME = os.getenv('CB_BUCKET_NAME') or input("Enter bucket name (default: vector-search-testing): ") or 'vector-search-testing'
INDEX_NAME = os.getenv('INDEX_NAME') or input("Enter index name (default: vector_search_crew): ") or 'vector_search_crew'
SCOPE_NAME = os.getenv('SCOPE_NAME') or input("Enter scope name (default: shared): ") or 'shared'
COLLECTION_NAME = os.getenv('COLLECTION_NAME') or input("Enter collection name (default: crew): ") or 'crew'
print("Configuration loaded successfully")
Configuration loaded successfully
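The lookup order used above (environment variable, then prompt, then default) can be factored into one small helper. The sketch below is only an illustration: `get_setting`, its `env` and `prompt` parameters, and the example host value are hypothetical names introduced here, not part of the tutorial's code.

```python
import os


def get_setting(name, default=None, env=None, prompt=None):
    """Return a setting from the environment, then an optional prompt, then a default."""
    env = os.environ if env is None else env
    value = env.get(name)
    if value:
        return value
    if prompt is not None:
        # prompt is any callable that takes a message and returns a string,
        # for example input or getpass.getpass
        value = prompt(f"Enter {name} (default: {default}): ")
        if value:
            return value
    if default is None:
        raise ValueError(f"{name} is not set")
    return default


# Usage with a fake environment and no interactive prompt
fake_env = {"CB_HOST": "couchbases://example.invalid"}
host = get_setting("CB_HOST", "couchbase://localhost", env=fake_env)
bucket = get_setting("CB_BUCKET_NAME", "vector-search-testing", env=fake_env)
print(host, bucket)
```

Passing `prompt=getpass.getpass` for secrets and `prompt=input` for everything else would mirror the behavior of the configuration cell.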
Connecting to a Couchbase cluster is the foundation of our project. Couchbase will serve as our primary data store, handling all the storage and retrieval operations required for our semantic search engine. By establishing this connection, we enable our application to interact with the database, allowing us to perform operations such as storing embeddings, querying data, and managing collections. This connection is the gateway through which all data will flow, so ensuring it's set up correctly is paramount.
# Connect to Couchbase
try:
    auth = PasswordAuthenticator(CB_USERNAME, CB_PASSWORD)
    options = ClusterOptions(auth)
    cluster = Cluster(CB_HOST, options)
    cluster.wait_until_ready(timedelta(seconds=5))
    print("Successfully connected to Couchbase")
except Exception as e:
    print(f"Failed to connect to Couchbase: {str(e)}")
    raise
Successfully connected to Couchbase
In this section, we verify that the Couchbase Search (FTS) service is available and responding correctly. This is a crucial check because our vector search functionality depends on it. If any issues are detected with the Search service, the function will raise an exception, allowing us to catch and handle problems early before attempting vector operations.
def check_search_service(cluster):
    """Verify search service availability using ping"""
    try:
        # Get ping result
        ping_result = cluster.ping()
        search_available = False
        # Check if search service is responding
        for service_type, endpoints in ping_result.endpoints.items():
            if service_type == ServiceType.Search:
                for endpoint in endpoints:
                    if endpoint.state == PingState.OK:
                        search_available = True
                        print(f"Search service is responding at: {endpoint.remote}")
                        break
                break
        if not search_available:
            raise RuntimeError("Search/FTS service not found or not responding")
        print("Search service check passed successfully")
    except Exception as e:
        print(f"Health check failed: {str(e)}")
        raise

try:
    check_search_service(cluster)
except Exception as e:
    print(f"Failed to check search service: {str(e)}")
    raise
Search service is responding at: 3.235.224.172:18094
Search service check passed successfully
The setup_collection() function handles creating and configuring the hierarchical data organization in Couchbase:
Bucket Creation:
Scope Management:
Collection Setup:
Additional Tasks:
The function is called twice to set up:
def setup_collection(cluster, bucket_name, scope_name, collection_name):
    try:
        # Check if bucket exists, create if it doesn't
        try:
            bucket = cluster.bucket(bucket_name)
            logging.info(f"Bucket '{bucket_name}' exists.")
        except Exception as e:
            logging.info(f"Bucket '{bucket_name}' does not exist. Creating it...")
            bucket_settings = CreateBucketSettings(
                name=bucket_name,
                bucket_type='couchbase',
                ram_quota_mb=1024,
                flush_enabled=True,
                num_replicas=0
            )
            cluster.buckets().create_bucket(bucket_settings)
            time.sleep(2)  # Wait for bucket creation to complete and become available
            bucket = cluster.bucket(bucket_name)
            logging.info(f"Bucket '{bucket_name}' created successfully.")

        bucket_manager = bucket.collections()

        # Check if scope exists, create if it doesn't
        scopes = bucket_manager.get_all_scopes()
        scope_exists = any(scope.name == scope_name for scope in scopes)
        if not scope_exists and scope_name != "_default":
            logging.info(f"Scope '{scope_name}' does not exist. Creating it...")
            bucket_manager.create_scope(scope_name)
            logging.info(f"Scope '{scope_name}' created successfully.")

        # Check if collection exists, create if it doesn't
        collections = bucket_manager.get_all_scopes()
        collection_exists = any(
            scope.name == scope_name and collection_name in [col.name for col in scope.collections]
            for scope in collections
        )
        if not collection_exists:
            logging.info(f"Collection '{collection_name}' does not exist. Creating it...")
            bucket_manager.create_collection(scope_name, collection_name)
            logging.info(f"Collection '{collection_name}' created successfully.")
        else:
            logging.info(f"Collection '{collection_name}' already exists. Skipping creation.")

        # Wait for collection to be ready
        collection = bucket.scope(scope_name).collection(collection_name)
        time.sleep(2)  # Give the collection time to be ready for queries

        # Ensure primary index exists
        try:
            cluster.query(f"CREATE PRIMARY INDEX IF NOT EXISTS ON `{bucket_name}`.`{scope_name}`.`{collection_name}`").execute()
            logging.info("Primary index present or created successfully.")
        except Exception as e:
            logging.warning(f"Error creating primary index: {str(e)}")

        # Clear all documents in the collection
        try:
            query = f"DELETE FROM `{bucket_name}`.`{scope_name}`.`{collection_name}`"
            cluster.query(query).execute()
            logging.info("All documents cleared from the collection.")
        except Exception as e:
            logging.warning(f"Error while clearing documents: {str(e)}. The collection might be empty.")

        return collection
    except Exception as e:
        raise RuntimeError(f"Error setting up collection: {str(e)}")

setup_collection(cluster, CB_BUCKET_NAME, SCOPE_NAME, COLLECTION_NAME)
2025-05-25 02:56:12 [INFO] Bucket 'vector-search-testing' exists.
2025-05-25 02:56:14 [INFO] Collection 'crew' already exists. Skipping creation.
2025-05-25 02:56:17 [INFO] Primary index present or created successfully.
2025-05-25 02:56:17 [INFO] All documents cleared from the collection.
<couchbase.collection.Collection at 0x3131a9950>
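The function above waits a fixed two seconds after creating a bucket or collection. On a slow cluster that may not be long enough, and on a fast one it is wasted time. One possible alternative is to poll until a readiness condition holds. The helper below is a generic sketch: `wait_until` and the stand-in predicate are hypothetical, and in practice the predicate might wrap a call such as `bucket_manager.get_all_scopes()` and check that the collection is listed.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.5):
    """Poll predicate() until it returns True or the timeout elapses.

    Returns True if the condition was met, False if the timeout was reached.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for a real readiness check: becomes true on the third call
calls = {"n": 0}


def ready_after_three_checks():
    calls["n"] += 1
    return calls["n"] >= 3


ready = wait_until(ready_after_three_checks, timeout=2.0, interval=0.01)
print(ready, calls["n"])
```

The predicate is always checked at least once, so a resource that is already available returns immediately without sleeping.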
Semantic search requires an efficient way to retrieve relevant documents based on a user's query. This is where the Couchbase Vector Search Index comes into play. In this step, we load the Vector Search Index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed, the dimensions of the vectors, and other parameters that determine how the search engine processes queries based on vector similarity.
This CrewAI vector search index configuration requires specific default settings to function properly. This tutorial uses the bucket named vector-search-testing with the scope shared and collection crew. The configuration is set up for vectors with exactly 1536 dimensions, using dot product similarity and optimized for recall. If you want to use a different bucket, scope, or collection, you will need to modify the index configuration accordingly.
For more information on creating a vector search index, please follow the instructions at Couchbase Vector Search Documentation.
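Because the index must agree with the embedding model on dimensions and similarity metric, it can help to sanity-check the definition before creating the index. The sketch below walks a definition and checks every vector field it finds. Note the assumptions: crew_index.json is not shown in this tutorial, so the key names (`type`, `dims`, `similarity`, `vector_index_optimized_for`), the nested layout of `sample_definition`, and the helper names are illustrative guesses rather than the file's actual contents.

```python
def find_vector_fields(node):
    """Recursively collect dicts that look like vector field definitions."""
    found = []
    if isinstance(node, dict):
        if node.get("type") == "vector":
            found.append(node)
        for value in node.values():
            found.extend(find_vector_fields(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_vector_fields(item))
    return found


def check_index_definition(definition, expected_dims=1536,
                           expected_similarity="dot_product"):
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = []
    fields = find_vector_fields(definition)
    if not fields:
        problems.append("no vector field found")
    for field in fields:
        name = field.get("name", "<unnamed>")
        if field.get("dims") != expected_dims:
            problems.append(
                f"field '{name}' has dims={field.get('dims')}, expected {expected_dims}"
            )
        if field.get("similarity") != expected_similarity:
            problems.append(
                f"field '{name}' uses similarity={field.get('similarity')}, "
                f"expected {expected_similarity}"
            )
    return problems


# Abbreviated, assumed shape of an index definition (not the real file)
sample_definition = {
    "name": "vector_search_crew",
    "params": {"mapping": {"types": {"shared.crew": {"properties": {
        "embedding": {"fields": [{
            "name": "embedding",
            "type": "vector",
            "dims": 1536,
            "similarity": "dot_product",
            "vector_index_optimized_for": "recall",
        }]}
    }}}}},
}

print(check_index_definition(sample_definition))
```

Running the same check on the dictionary loaded from crew_index.json would flag a mismatch early, for example after switching to an embedding model with a different output size.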
# Load index definition
try:
    with open('crew_index.json', 'r') as file:
        index_definition = json.load(file)
except FileNotFoundError as e:
    print(f"Error: crew_index.json file not found: {str(e)}")
    raise
except json.JSONDecodeError as e:
    print(f"Error: Invalid JSON in crew_index.json: {str(e)}")
    raise
except Exception as e:
    print(f"Error loading index definition: {str(e)}")
    raise
With the index definition loaded, the next step is to create or update the Vector Search Index in Couchbase. This step is crucial because it optimizes our database for vector similarity search operations, allowing us to perform searches based on the semantic content of documents rather than just keywords. By creating or updating a Vector Search Index, we enable our search engine to handle complex queries that involve finding semantically similar documents using vector embeddings, which is essential for a robust semantic search engine.
try:
    scope_index_manager = cluster.bucket(CB_BUCKET_NAME).scope(SCOPE_NAME).search_indexes()

    # Check if index already exists
    existing_indexes = scope_index_manager.get_all_indexes()
    index_name = index_definition["name"]
    if index_name in [index.name for index in existing_indexes]:
        logging.info(f"Index '{index_name}' found")
    else:
        logging.info(f"Creating new index '{index_name}'...")

    # Create SearchIndex object from JSON definition
    search_index = SearchIndex.from_json(index_definition)

    # Upsert the index (create if not exists, update if exists)
    scope_index_manager.upsert_index(search_index)
    logging.info(f"Index '{index_name}' successfully created/updated.")
except QueryIndexAlreadyExistsException:
    logging.info(f"Index '{index_name}' already exists. Skipping creation/update.")
except ServiceUnavailableException:
    raise RuntimeError("Search service is not available. Please ensure the Search service is enabled in your Couchbase cluster.")
except InternalServerFailureException as e:
    logging.error(f"Internal server error: {str(e)}")
    raise
2025-05-25 02:56:18 [INFO] Index 'vector_search_crew' found
2025-05-25 02:56:19 [INFO] Index 'vector_search_crew' already exists. Skipping creation/update.
This section initializes two key OpenAI components needed for our RAG system:
OpenAI Embeddings:
ChatOpenAI Language Model:
Both components require a valid OpenAI API key (OPENAI_API_KEY) for authentication. In the CrewAI framework, the LLM acts as the "brain" for each agent, allowing them to interpret tasks, retrieve relevant information via the RAG system, and generate appropriate outputs based on their specialized roles and expertise.
# Initialize OpenAI components
embeddings = OpenAIEmbeddings(
openai_api_key=OPENAI_API_KEY,
model="text-embedding-3-small"
)
llm = ChatOpenAI(
openai_api_key=OPENAI_API_KEY,
model="gpt-4o",
temperature=0.2
)
print("OpenAI components initialized")
OpenAI components initialized
A vector store is where we'll keep our embeddings. Unlike the FTS index, which is used for text-based search, the vector store is specifically designed to handle embeddings and perform similarity searches. When a user inputs a query, the search engine converts the query into an embedding and compares it against the embeddings stored in the vector store. This allows the engine to find documents that are semantically similar to the query, even if they don't contain the exact same words. By setting up the vector store in Couchbase, we create a powerful tool that enables our search engine to understand and retrieve information based on the meaning and context of the query, rather than just the specific words used.
# Setup vector store
vector_store = CouchbaseSearchVectorStore(
cluster=cluster,
bucket_name=CB_BUCKET_NAME,
scope_name=SCOPE_NAME,
collection_name=COLLECTION_NAME,
embedding=embeddings,
index_name=INDEX_NAME,
)
print("Vector store initialized")
Vector store initialized
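To make the idea of comparing a query embedding against stored embeddings concrete, here is a toy sketch of ranking by dot product, the similarity metric this tutorial's index is configured for. The three-dimensional vectors and document names are invented for illustration; real embeddings from text-embedding-3-small have 1536 dimensions, and Couchbase performs this search inside the index rather than in Python.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_by_dot_product(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents with the highest dot product to the query."""
    scored = [(dot(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Toy embeddings: made-up values, not real model output
toy_docs = {
    "fa_cup_draw": [0.9, 0.1, 0.0],
    "scottish_cup_draw": [0.7, 0.3, 0.1],
    "weather_report": [0.0, 0.1, 0.9],
}
toy_query = [1.0, 0.2, 0.0]

top = rank_by_dot_product(toy_query, toy_docs, k=2)
print(top)
```

The two football documents score highest because their vectors point in nearly the same direction as the query, which is the property that lets semantic search match meaning rather than exact words.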
To build a search engine, we need data to search through. We use the BBC News dataset from RealTimeData, which provides real-world news articles. This dataset contains news articles from BBC covering various topics and time periods. Loading the dataset is a crucial step because it provides the raw material that our search engine will work with. The quality and diversity of the news articles make it an excellent choice for testing and refining our search engine, ensuring it can handle real-world news content effectively.
The BBC News dataset allows us to work with authentic news articles, enabling us to build and test a search engine that can effectively process and retrieve relevant news content. The dataset is loaded using the Hugging Face datasets library, specifically accessing the "RealTimeData/bbc_news_alltime" dataset with the "2024-12" version.
try:
    news_dataset = load_dataset(
        "RealTimeData/bbc_news_alltime", "2024-12", split="train"
    )
    print(f"Loaded the BBC News dataset with {len(news_dataset)} rows")
    logging.info(f"Successfully loaded the BBC News dataset with {len(news_dataset)} rows.")
except Exception as e:
    raise ValueError(f"Error loading the BBC News dataset: {str(e)}")
2025-05-25 02:56:29 [INFO] Successfully loaded the BBC News dataset with 2687 rows.
Loaded the BBC News dataset with 2687 rows
We will use the content of the news articles for our RAG system.
The dataset contains a few duplicate records. We are removing them to avoid duplicate results in the retrieval stage of our RAG system.
news_articles = news_dataset["content"]
unique_articles = set()
for article in news_articles:
    if article:
        unique_articles.add(article)
unique_news_articles = list(unique_articles)
print(f"We have {len(unique_news_articles)} unique articles in our database.")
We have 1749 unique articles in our database.
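Using a set removes duplicates but does not keep the articles in their original order, so the order can differ between runs. If a stable order matters (for example, to make ingestion reproducible), an order-preserving variant is a small change. This is a sketch of an alternative, not the tutorial's method; `dedupe_preserving_order` and the sample strings are hypothetical.

```python
def dedupe_preserving_order(articles):
    """Drop empty entries and duplicates while keeping first-seen order."""
    # dict keys are unique and remember insertion order
    return list(dict.fromkeys(article for article in articles if article))


sample = ["story A", "", "story B", "story A", None, "story C", "story B"]
unique = dedupe_preserving_order(sample)
print(unique)
```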
To efficiently handle the large number of articles, we process them in batches of 50 articles at a time. This batch processing approach helps manage memory usage and provides better control over the ingestion process.
We first filter out any articles that exceed 50,000 characters to avoid potential issues with token limits. Then, using the vector store's add_texts method, we add the filtered articles to our vector database. The batch_size parameter controls how many articles are processed in each iteration.
This approach offers several benefits:
We use a conservative batch size of 50 to ensure reliable operation. The optimal batch size depends on many factors including:
Consider measuring performance with your specific workload before adjusting.
batch_size = 50
# Automatic Batch Processing
articles = [article for article in unique_news_articles if article and len(article) <= 50000]
try:
    vector_store.add_texts(
        texts=articles,
        batch_size=batch_size
    )
    logging.info("Document ingestion completed successfully.")
except Exception as e:
    raise ValueError(f"Failed to save documents to vector store: {str(e)}")
2025-05-25 02:58:12 [INFO] Document ingestion completed successfully.
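The sketch below shows, in plain Python, roughly what batching a list of articles looks like, plus one way to keep going when a single batch fails instead of aborting the whole ingestion. It is an illustration only: how `add_texts` batches internally may differ, and `iter_batches`, `ingest_in_batches`, and `fake_add_batch` are hypothetical names. In real use, `add_batch` might be a small function that calls `vector_store.add_texts(texts=batch)`.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of items, each with at most batch_size elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def ingest_in_batches(texts, add_batch, batch_size=50):
    """Call add_batch once per batch; return (succeeded_count, failed_batches)."""
    succeeded, failed = 0, []
    for index, batch in enumerate(iter_batches(texts, batch_size)):
        try:
            add_batch(batch)
            succeeded += len(batch)
        except Exception as exc:
            # Record the failure and continue with the next batch
            failed.append((index, str(exc)))
    return succeeded, failed


# Stand-in for a real storage call: fails on the second batch
stored = []


def fake_add_batch(batch):
    if batch[0] == 50:
        raise RuntimeError("simulated failure")
    stored.extend(batch)


sizes = [len(batch) for batch in iter_batches(list(range(120)), 50)]
ok, failures = ingest_in_batches(list(range(120)), fake_add_batch, batch_size=50)
print(sizes, ok, failures)
```

Recording the failed batch indexes makes it possible to retry only those batches later, which is one of the practical reasons to ingest in batches at all.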
After loading our data into the vector store, we need to create a tool that can efficiently search through these vector embeddings. This involves two key components:
The vector retriever is configured for semantic similarity search against our vector database: it finds the documents whose vector embeddings are closest to the query's embedding in the vector space.
The search tool wraps the retriever in a user-friendly interface that:
The tool is designed to integrate seamlessly with our AI agents, providing them with reliable access to our knowledge base through vector similarity search. It is defined with CrewAI's @tool decorator and expects a plain text query string; handling for structured query objects is left commented out in the code.
# Create vector retriever
retriever = vector_store.as_retriever(
search_type="similarity",
)
# Define the search tool using the @tool decorator
@tool("vector_search")
def search_tool(query: str) -> str:
    """Search for relevant documents using vector similarity.
    Input should be a simple text query string.
    Returns a list of relevant document contents.
    Use this tool to find detailed information about topics."""
    # Handle potential non-string query input if needed (similar to original lambda)
    # CrewAI usually passes the string directly based on task description
    # but checking doesn't hurt, though the Agent logic might handle this.
    # query_str = query if isinstance(query, str) else str(query.get('query', '')) # Simplified for now

    # Invoke the retriever
    docs = retriever.invoke(query)

    # Format the results
    formatted_docs = "\n\n".join([
        f"Document {i+1}:\n{'-'*40}\n{doc.page_content}"
        for i, doc in enumerate(docs)
    ])
    return formatted_docs
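The formatting step of the tool can be tried in isolation, without Couchbase or OpenAI, by passing in stand-in objects that have a `page_content` attribute. The sketch below reproduces the same "Document N" layout and adds an optional `max_chars` cap, which is a hypothetical extension (not in the tutorial) that may help keep very long articles from crowding the agent's context. The sample texts are placeholders.

```python
from types import SimpleNamespace


def format_docs(docs, max_chars=None):
    """Format retrieved documents the same way the search tool does.

    max_chars is an optional, hypothetical cap on each document's length.
    """
    parts = []
    for i, doc in enumerate(docs, start=1):
        text = doc.page_content
        if max_chars is not None and len(text) > max_chars:
            text = text[:max_chars] + "..."
        parts.append(f"Document {i}:\n{'-' * 40}\n{text}")
    return "\n\n".join(parts)


# Stand-ins for retrieved documents
fake_docs = [
    SimpleNamespace(page_content="first article text"),
    SimpleNamespace(page_content="second article text"),
]

formatted = format_docs(fake_docs, max_chars=5)
print(formatted)
```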
We'll create two specialized AI agents using the CrewAI framework to handle different aspects of our information retrieval and analysis system:
This agent is designed to:
This agent is responsible for:
The agents work together in a coordinated way:
This multi-agent approach allows us to:
# Custom response template
response_template = """
Analysis Results
===============
{%- if .Response %}
{{ .Response }}
{%- endif %}
Sources
=======
{%- for tool in .Tools %}
* {{ tool.name }}
{%- endfor %}
Metadata
========
* Confidence: {{ .Confidence }}
* Analysis Time: {{ .ExecutionTime }}
"""
# Create research agent
researcher = Agent(
role='Research Expert',
goal='Find and analyze the most relevant documents to answer user queries accurately',
backstory="""You are an expert researcher with deep knowledge in information retrieval
and analysis. Your expertise lies in finding, evaluating, and synthesizing information
from various sources. You have a keen eye for detail and can identify key insights
from complex documents. You always verify information across multiple sources and
provide comprehensive, accurate analyses.""",
tools=[search_tool],
llm=llm,
verbose=True,
memory=True,
allow_delegation=False,
response_template=response_template
)
# Create writer agent
writer = Agent(
role='Technical Writer',
goal='Generate clear, accurate, and well-structured responses based on research findings',
backstory="""You are a skilled technical writer with expertise in making complex
information accessible and engaging. You excel at organizing information logically,
explaining technical concepts clearly, and creating well-structured documents. You
ensure all information is properly cited, accurate, and presented in a user-friendly
manner. You have a talent for maintaining the reader's interest while conveying
detailed technical information.""",
llm=llm,
verbose=True,
memory=True,
allow_delegation=False,
response_template=response_template
)
print("Agents created successfully")
Agents created successfully
This system uses a two-agent approach to implement Retrieval-Augmented Generation (RAG):
Research Expert Agent:
Technical Writer Agent:
This multi-agent approach separates concerns (research vs. writing) and leverages specialized expertise for each task, resulting in higher quality responses.
Test the system with some example queries.
def process_query(query, researcher, writer):
    """
    Test the complete RAG system with a user query.

    This function tests both the vector search capability and the agent-based processing:
    1. Vector search: Retrieves relevant documents from Couchbase
    2. Agent processing: Uses CrewAI agents to analyze and format the response

    The function measures performance and displays detailed outputs from each step.
    """
    print(f"\nQuery: {query}")
    print("-" * 80)

    # Create tasks
    research_task = Task(
        description=f"Research and analyze information relevant to: {query}",
        agent=researcher,
        expected_output="A detailed analysis with key findings and supporting evidence"
    )
    writing_task = Task(
        description="Create a comprehensive and well-structured response",
        agent=writer,
        expected_output="A clear, comprehensive response that answers the query",
        context=[research_task]
    )

    # Create and execute crew
    crew = Crew(
        agents=[researcher, writer],
        tasks=[research_task, writing_task],
        process=Process.sequential,
        verbose=True,
        cache=True,
        planning=True
    )

    try:
        start_time = time.time()
        result = crew.kickoff()
        elapsed_time = time.time() - start_time
        print(f"\nQuery completed in {elapsed_time:.2f} seconds")
        print("=" * 80)
        print("RESPONSE")
        print("=" * 80)
        print(result)
        if hasattr(result, 'tasks_output'):
            print("\n" + "=" * 80)
            print("DETAILED TASK OUTPUTS")
            print("=" * 80)
            for task_output in result.tasks_output:
                print(f"\nTask: {task_output.description[:100]}...")
                print("-" * 40)
                print(f"Output: {task_output.raw}")
                print("-" * 40)
    except Exception as e:
        print(f"Error executing crew: {str(e)}")
        logging.error(f"Crew execution failed: {str(e)}", exc_info=True)
# Disable logging before running the query
logging.disable(logging.CRITICAL)
query = "What are the key details about the FA Cup third round draw? Include information about Manchester United vs Arsenal, Tamworth vs Tottenham, and other notable fixtures."
process_query(query, researcher, writer)
Query: What are the key details about the FA Cup third round draw? Include information about Manchester United vs Arsenal, Tamworth vs Tottenham, and other notable fixtures.
--------------------------------------------------------------------------------
Crew Execution Started
  Name: crew
  ID: d8b9aa65-b394-4caf-922d-4a4b5de60c7e
[2025-05-25 02:58:12][INFO]: Planning the crew execution
🚀 Crew: crew
└── 📋 Task: ef4cb87a-fab3-41ca-b2c3-95a4486b9751
    Status: Executing Task...

🚀 Crew: crew
└── 📋 Task: ef4cb87a-fab3-41ca-b2c3-95a4486b9751
    Status: Executing Task...
    └── 🤖 Agent: Task Execution Planner
        Status: In Progress

🤖 Agent: Task Execution Planner
    Status: In Progress
└── 🧠 Thinking...

🤖 Agent: Task Execution Planner
    Status: In Progress

🚀 Crew: crew
└── 📋 Task: ef4cb87a-fab3-41ca-b2c3-95a4486b9751
    Status: Executing Task...
    └── 🤖 Agent: Task Execution Planner
        Status: ✅ Completed

🚀 Crew: crew
└── 📋 Task: ef4cb87a-fab3-41ca-b2c3-95a4486b9751
    Assigned to: Task Execution Planner
    Status: ✅ Completed
    └── 🤖 Agent: Task Execution Planner
        Status: ✅ Completed
Task Completion
  Task Completed
  Name: ef4cb87a-fab3-41ca-b2c3-95a4486b9751
  Agent: Task Execution Planner
🚀 Crew: crew
├── 📋 Task: ef4cb87a-fab3-41ca-b2c3-95a4486b9751
│   Assigned to: Task Execution Planner
│   Status: ✅ Completed
│   └── 🤖 Agent: Task Execution Planner
│       Status: ✅ Completed
└── 📋 Task: 6c97cf00-62f0-4478-8b01-7b677f6277c5
    Status: Executing Task...

🚀 Crew: crew
├── 📋 Task: ef4cb87a-fab3-41ca-b2c3-95a4486b9751
│   Assigned to: Task Execution Planner
│   Status: ✅ Completed
│   └── 🤖 Agent: Task Execution Planner
│       Status: ✅ Completed
└── 📋 Task: 6c97cf00-62f0-4478-8b01-7b677f6277c5
    Status: Executing Task...
    └── 🤖 Agent: Research Expert
        Status: In Progress
# Agent: Research Expert
## Task: Research and analyze information relevant to: What are the key details about the FA Cup third round draw? Include information about Manchester United vs Arsenal, Tamworth vs Tottenham, and other notable fixtures.
1. The Research Expert will begin by defining the scope of research by identifying key aspects related to the FA Cup third round draw, including historical context, significance of the matches, and team statistics.
2. The expert will formulate a specific query string that encapsulates the required information, focusing on Manchester United vs Arsenal, Tamworth vs Tottenham, and other relevant fixtures.
3. Using the 'vector_search' tool, the expert will input the generated query string to initiate a search for relevant documents containing current and historical information regarding the FA Cup third round draw.
4. The expert will analyze the retrieved documents, identifying key details such as match venue, date, historical performances, and recent form of each team involved.
5. Key findings will be noted meticulously, ensuring each match's context is articulated accurately, particularly emphasizing the implications of the fixtures for the teams involved.
6. The research will conclude with a summary of findings for each identified match, ensuring all relevant insights are captured for later stages.
🤖 Agent: Research Expert Status: In Progress
# Agent: Research Expert
## Thought: Thought: I need to gather detailed information about the FA Cup third round draw, focusing on the matches Manchester United vs Arsenal, Tamworth vs Tottenham, and other notable fixtures. I will use the vector_search tool to find relevant documents that provide insights into these matches, including historical context, significance, and team statistics.
## Using tool: vector_search
## Tool Input:
"{\"query\": \"FA Cup third round draw 2023 Manchester United vs Arsenal Tamworth vs Tottenham notable fixtures\"}"
## Tool Output:
Document 1:
----------------------------------------
Holders Manchester United have been drawn away to record 14-time winners Arsenal in the FA Cup third round.
Premier League leaders Liverpool will host League Two Accrington Stanley, while Manchester City welcome 'Class of 92'-owned Salford City.
Tamworth, one of only two non-league clubs remaining in the competition, are at home to Tottenham.
The third-round ties will be played over the weekend of Saturday, 11 January.
The third round is when the 44 Premier League and Championship clubs enter the competition, joining the 20 lower-league and non-league clubs who made it through last weekend's second-round ties.
There were audible groans from the watching supporters inside Old Trafford as Manchester United, who beat rivals Manchester City to lift the trophy for a 13th time in May, were confirmed as Arsenal's opponents.
Tamworth, the lowest-ranked team remaining in the cup, will host Ange Postecoglou's Spurs as reward for their penalty shootout win against League One side Burton Albion, while fellow National League outfit Dagenham & Redbridge will go to Championship Millwall.
Everton full-back Ashley Young, 39, could face his 18-year-old son Tyler after the Toffees drew Peterborough at home.
"Wow...dreams might come true," posted former England international Young on X.
Elsewhere, Chelsea host League Two's bottom club Morecambe, whose fellow fourth-tier strugglers Bromley travel to face Newcastle United at St James' Park.
Document 2:
----------------------------------------
Adam Idah won last season's Scottish Cup for Celtic with a last-gasp goal against Rangers at Hampden in May
Holders Celtic will host Kilmarnock in the fourth round of the Scottish Cup, while Rangers welcome Highland League side Fraserburgh to Ibrox. There will be a Dundee derby at Dens Park, St Johnstone take on Motherwell in Perth and Aberdeen make the short trip to Elgin City. Hibernian will be at home against Clydebank, currently unbeaten in the West of Scotland Football League, and Hearts visit Highland League leaders Brechin City. Championship pace-setters Falkirk won 3-1 at East Kilbride on Monday evening and will meet league rivals Raith Rovers next. Musselburgh Athletic of the East of Scotland Premier Division go to Hamilton Accies and Lowland League representatives Broxburn Athletic have a home tie with Ayr United. The fourth-round matches will be played on the weekend of 18-19 January.
Former Scotland forward Steven Naismith, who won the trophy with Rangers in 2009, conducted the draw at East Kilbride's K-Park
Fraserburgh manager says the prospect of facing Rangers is "a bit surreal". "It's something many, many at our level only dream of," Cowie told BBC Radio Scotland's Good Morning Scotland. "We've got quite a few in the club who are Rangers supporters, so for them it truly is a dream come true, but we're just humbled to have been given this opportunity. Like I said, just to be in the draw, but to pull out a tie such as that one it's going to be incredible. "It'll be rewarding in so many ways, obviously financially, it'll be great for a club and give us a boost during these difficult times, but also in terms of memories, this tie will create for players, supporters and the like. It's truly got to be a great occasion for the club." With almost seven weeks to go until the trip to Ibrox, Cowie says Fraserburgh will "have to try and keep a lid on it" with Highland League and Aberdeenshire Shield games to come first. "The Rangers game is a bit away yet, we've got some league games and cup games that we want to do well in beforehand," he added. "We haven't been together since, so there will be a bit of excitement, but we've got a good group, so hopefully we'll manage to focus on tonight's game [against Keith] and put the Rangers game to the side slightly."
Document 3:
----------------------------------------
Man Utd are better with Rashford - Amorim
This video can not be played To play this video you need to enable JavaScript in your browser. I just want to help Marcus - Amorim on Rashford
Manchester United manager Ruben Amorim says the club are "better" with Marcus Rashford after the forward suggested he could leave Old Trafford. The England international, 27, said on Tuesday that he was "ready for a new challenge and the next steps" in his career. It came two days after Rashford was dropped for United's derby win against Manchester City at Etihad Stadium. Rashford's last Premier League start came in a 4-0 win against Everton on 1 December, when he scored twice. Amorim suggested the club want the striker - who came through United's youth ranks - to stay, saying: "I don't talk about the future, we talk about the present. "This kind of club needs big talent and he's a big talent, so he just needs to perform at the highest level and that is my focus. I just want to help Marcus." Asked about Rashford's desire for a "new challenge", Amorim said: "I think it's right. We have here a new challenge, it's a tough one. "For me it's the biggest challenge in football because we are in a difficult situation and I already said this is one of the biggest clubs in the world. "I really hope all my players are ready for this new challenge." Amorim added the club will "try different things" to help Rashford find the "best levels he has shown in the past". "Nothing has changed - we believe in Marcus," said Amorim. "It's hard to explain to you guys what I am going to do. I'm a little bit emotional so in the moment I will decide what to do. "It's a hard situation to comment [on]. If I give a lot of importance it will have big headlines in the papers and if I say it's not a problem then my standards are getting low." Asked if he, when he was a player, would do an interview or speak privately, Amorim said: "If this was me probably I would speak with the manager."
Manchester United face Tottenham in the quarter-finals of the Carabao Cup on Thursday (20:00 GMT). Amorim refused to confirm whether Rashford or winger Alejandro Garnacho - who was also left out of the squad to face Manchester City - would feature against Spurs. However, Rashford was not pictured travelling with the team when they left for London, but Garnacho was. Asked how Garnacho has reacted to being left out, Amorim said: "Really good - he trained really well. He seems a little bit upset with me and that's perfect. "I was really, really happy because I would do the same. He's ready for this game." One player that will not be available is midfielder Mason Mount, who went off in the 14th minute of Sunday's win. Amorim said his injury was still being assessed and Mount was "really sad" in the dressing room, adding "we need to help him".
• None What happens now with Man Utd and Rashford?
Marcus Rashford has started three of Manchester United's seven matches since Ruben Amorim was appointed
Rashford has scored 138 goals in 426 appearances for United since his debut in 2016. His most prolific season was 2022-23, when he scored 30 times in 56 games in all competitions and was rewarded with a new five-year deal. Rashford's goals in that campaign account for more than one-fifth (21.7%) of his total tally across nine and a half seasons at Old Trafford. However, he has struggled for form in the past 18 months, with 15 goals in his past 67 appearances. The forward's shot conversion rate was 11.9% in the 2021-22 season, when he scored five goals in 32 matches. That rate increased to 18% the following season when he scored 30 times, but fell to just 9.4% last term - the worst rate of his career across a whole campaign - as he netted eight times in 43 matches. Since 2019-20, United have won 52.7% of their matches in all competitions with Rashford in the starting line-up (107 wins from 203 games), compared to 54.2% without (58 from 107).
If Manchester United offered guidance to avoid creating even more turmoil around an already delicate situation, Ruben Amorim has followed it. We already know enough of Amorim to know he will not hold back just for the sake of it, but this is a case where actions will speak louder than words. Amorim says he wants "big talent" Rashford to stay. But also that players have to meet his standards. He says Rashford - and Alejandro Garnacho - will be assessed for selection on their training-ground performances. It is fair to assume therefore that if neither reaches the required standard, they will not travel to London for the EFL Cup tie at Tottenham - even if Rashford has shaken off the illness that prevented him from training on Monday. After Tottenham, United have Premier League games against Bournemouth, Wolves and Newcastle. We will know soon enough where Rashford fits in Amorim's plans. If he fails to reach the standards his new boss demands, the 27-year-old will not feature.
Document 4:
----------------------------------------
Leny Yoro signed for Manchester United from Lille for £52m in the summer
Leny Yoro could make his Manchester United debut against Arsenal on Wednesday, with head coach Ruben Amorim saying he is "excited" to see the "special talent" in action for the club. United signed Yoro from Lille for £52m in the summer, but the Frenchman underwent surgery on a foot injury suffered during pre-season in the United States, coincidentally in a friendly against Arsenal. The 19-year-old has not featured since, but returned to training in October and is in line to make his first competitive appearance in the Premier League at Emirates Stadium (kick-off 20:15 GMT). "Maybe Leny Yoro is going be in the squad," Amorim said in his pre-match news conference. "I feel he is in a good moment now, his fitness is better." Yoro had been linked with Real Madrid, Paris St-Germain and Liverpool before signing for United in July, with the club's sporting director Dan Ashworth describing the centre-back as "one of the most exciting young defenders in world football". Amorim said: "He is a special talent, we have to be careful in the first moment. We didn't have too many training [sessions] together. He has been training with a small group of players. "He is really fast, a modern defender. He will be good when we want to press high and you leave a lot of strikers in this league one against one, he can manage that. He is very good with the ball so I am very excited. "We have to be careful, we have to manage the load and minute in the beginning but I am really excited to see Leny Yoro playing."
United are unbeaten in three games since Amorim took charge, thrashing Everton 4-0 on Sunday, but Amorim said Arsenal are "by far" the biggest challenge he has faced in the job so far. Red Devils supporters seem to already be building a rapport with their new Portuguese boss, serenading him with chants that echoed around Old Trafford. "I don't like my chant," Amorim told BBC Sport when asked about the connection with the fans. "I don't feel embarrassed - but I am the coach, they have to chant for the players. It is not a good thing for me. "I understand and appreciate the connection with the supporters but I want them to support the team and the players because they are on the pitch, I am outside. "I am really honoured and I feel a connection with the supporters but we know we need results to maintain that."
🤖 Agent: Research Expert
    Status: In Progress
└── 🧠 Thinking...

🤖 Agent: Research Expert
    Status: In Progress
# Agent: Research Expert
## Final Answer:
Document 1:
----------------------------------------
Holders Manchester United have been drawn away to record 14-time winners Arsenal in the FA Cup third round.
Premier League leaders Liverpool will host League Two Accrington Stanley, while Manchester City welcome 'Class of 92'-owned Salford City.
Tamworth, one of only two non-league clubs remaining in the competition, are at home to Tottenham.
The third-round ties will be played over the weekend of Saturday, 11 January.
The third round is when the 44 Premier League and Championship clubs enter the competition, joining the 20 lower-league and non-league clubs who made it through last weekend's second-round ties.
There were audible groans from the watching supporters inside Old Trafford as Manchester United, who beat rivals Manchester City to lift the trophy for a 13th time in May, were confirmed as Arsenal's opponents.
Tamworth, the lowest-ranked team remaining in the cup, will host Ange Postecoglou's Spurs as reward for their penalty shootout win against League One side Burton Albion, while fellow National League outfit Dagenham & Redbridge will go to Championship Millwall.
Everton full-back Ashley Young, 39, could face his 18-year-old son Tyler after the Toffees drew Peterborough at home.
"Wow...dreams might come true," posted former England international Young on X.
Elsewhere, Chelsea host League Two's bottom club Morecambe, whose fellow fourth-tier strugglers Bromley travel to face Newcastle United at St James' Park.
🚀 Crew: crew
├── 📋 Task: ef4cb87a-fab3-41ca-b2c3-95a4486b9751
│   Assigned to: Task Execution Planner
│   Status: ✅ Completed
│   └── 🤖 Agent: Task Execution Planner
│       Status: ✅ Completed
└── 📋 Task: 6c97cf00-62f0-4478-8b01-7b677f6277c5
    Status: Executing Task...
    └── 🤖 Agent: Research Expert
        Status: ✅ Completed

🚀 Crew: crew
├── 📋 Task: ef4cb87a-fab3-41ca-b2c3-95a4486b9751
│   Assigned to: Task Execution Planner
│   Status: ✅ Completed
│   └── 🤖 Agent: Task Execution Planner
│       Status: ✅ Completed
└── 📋 Task: 6c97cf00-62f0-4478-8b01-7b677f6277c5
    Assigned to: Research Expert
    Status: ✅ Completed
    └── 🤖 Agent: Research Expert
        Status: ✅ Completed

╭──────────────────────────────────────────────── Task Completion ────────────────────────────────────────────────╮
│                                                                                                                 │
│  Task Completed                                                                                                 │
│  Name: 6c97cf00-62f0-4478-8b01-7b677f6277c5                                                                     │
│  Agent: Research Expert                                                                                         │
│                                                                                                                 │
│                                                                                                                 │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

🚀 Crew: crew
├── 📋 Task: ef4cb87a-fab3-41ca-b2c3-95a4486b9751
│   Assigned to: Task Execution Planner
│   Status: ✅ Completed
│   └── 🤖 Agent: Task Execution Planner
│       Status: ✅ Completed
├── 📋 Task: 6c97cf00-62f0-4478-8b01-7b677f6277c5
│   Assigned to: Research Expert
│   Status: ✅ Completed
│   └── 🤖 Agent: Research Expert
│       Status: ✅ Completed
└── 📋 Task: 9a2f521d-7197-4900-9dd5-3f28e67cba96
    Status: Executing Task...

🚀 Crew: crew
├── 📋 Task: ef4cb87a-fab3-41ca-b2c3-95a4486b9751
│   Assigned to: Task Execution Planner
│   Status: ✅ Completed
│   └── 🤖 Agent: Task Execution Planner
│       Status: ✅ Completed
├── 📋 Task: 6c97cf00-62f0-4478-8b01-7b677f6277c5
│   Assigned to: Research Expert
│   Status: ✅ Completed
│   └── 🤖 Agent: Research Expert
│       Status: ✅ Completed
└── 📋 Task: 9a2f521d-7197-4900-9dd5-3f28e67cba96
    Status: Executing Task...
    └── 🤖 Agent: Technical Writer
        Status: In Progress
# Agent: Technical Writer
## Task: Create a comprehensive and well-structured response
1. The Technical Writer will begin by reviewing the key findings and detailed analysis provided by the Research Expert.
2. The writer will outline the structure of the response, ensuring that it flows logically and covers all necessary topics. This may include sections for an introduction, match summaries, and conclusion.
3. Each section will be filled with clear, concise information while ensuring technical accuracy and relevance to the query.
4. The writer will ensure that the language used is appropriate for the audience, aiming to make complex information accessible.
5. After drafting the response, the writer will review and refine the content for clarity, coherence, and completeness.
6. Finally, the writer will verify that all key points regarding Manchester United vs Arsenal, Tamworth vs Tottenham, and notable fixtures are articulated clearly, providing comprehensive insights within the context of the FA Cup third round draw.
# Agent: Technical Writer
## Final Answer:
**FA Cup Third Round Draw: Comprehensive Overview**
**Introduction**
The FA Cup third round draw has set the stage for an exciting series of matches, with notable fixtures featuring both top-tier and lower-league clubs. This stage marks the entry of the 44 Premier League and Championship clubs into the competition, joining 20 lower-league and non-league clubs that have successfully navigated the second round. The matches are scheduled to take place over the weekend of Saturday, 11 January.
**Key Match Summaries**
1. **Manchester United vs Arsenal**
- Holders Manchester United, who recently celebrated their 13th FA Cup victory, have been drawn away to face Arsenal, the record 14-time winners. This fixture promises to be a thrilling encounter, as both teams have a rich history in the competition. The draw was met with audible groans from supporters at Old Trafford, highlighting the challenge that lies ahead for United.
2. **Tamworth vs Tottenham**
- Tamworth, one of only two non-league clubs remaining in the competition, will host Premier League side Tottenham Hotspur. This match is particularly significant for Tamworth, the lowest-ranked team still in the cup, as they earned their spot by defeating League One side Burton Albion in a dramatic penalty shootout. Hosting a top-tier team like Spurs is a remarkable achievement and a testament to their determination and skill.
3. **Other Notable Fixtures**
- Premier League leaders Liverpool will host League Two's Accrington Stanley, while Manchester City will welcome Salford City, owned by the 'Class of 92'. These matches highlight the diverse range of clubs participating in the third round.
- Everton's draw against Peterborough presents a unique scenario where Everton full-back Ashley Young, 39, could potentially face his 18-year-old son Tyler, adding a personal narrative to the fixture.
- Chelsea will play against Morecambe, the bottom club of League Two, and Newcastle United will face Bromley, another fourth-tier team, at St James' Park.
**Conclusion**
The FA Cup third round draw has set up a series of intriguing matches that blend the excitement of top-tier clashes with the charm of underdog stories. With Manchester United facing Arsenal and Tamworth hosting Tottenham, fans can look forward to a weekend filled with high-stakes football and potential upsets. This stage of the competition continues to embody the spirit of the FA Cup, where dreams can become reality, and every team has a chance to shine on the national stage.
🚀 Crew: crew
├── 📋 Task: ef4cb87a-fab3-41ca-b2c3-95a4486b9751
│   Assigned to: Task Execution Planner
│   Status: ✅ Completed
│   └── 🤖 Agent: Task Execution Planner
│       Status: ✅ Completed
├── 📋 Task: 6c97cf00-62f0-4478-8b01-7b677f6277c5
│   Assigned to: Research Expert
│   Status: ✅ Completed
│   └── 🤖 Agent: Research Expert
│       Status: ✅ Completed
└── 📋 Task: 9a2f521d-7197-4900-9dd5-3f28e67cba96
    Status: Executing Task...
    └── 🤖 Agent: Technical Writer
        Status: ✅ Completed

🚀 Crew: crew
├── 📋 Task: ef4cb87a-fab3-41ca-b2c3-95a4486b9751
│   Assigned to: Task Execution Planner
│   Status: ✅ Completed
│   └── 🤖 Agent: Task Execution Planner
│       Status: ✅ Completed
├── 📋 Task: 6c97cf00-62f0-4478-8b01-7b677f6277c5
│   Assigned to: Research Expert
│   Status: ✅ Completed
│   └── 🤖 Agent: Research Expert
│       Status: ✅ Completed
└── 📋 Task: 9a2f521d-7197-4900-9dd5-3f28e67cba96
    Assigned to: Technical Writer
    Status: ✅ Completed
    └── 🤖 Agent: Technical Writer
        Status: ✅ Completed

╭──────────────────────────────────────────────── Task Completion ────────────────────────────────────────────────╮
│                                                                                                                 │
│  Task Completed                                                                                                 │
│  Name: 9a2f521d-7197-4900-9dd5-3f28e67cba96                                                                     │
│  Agent: Technical Writer                                                                                        │
│                                                                                                                 │
│                                                                                                                 │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯

╭──────────────────────────────────────────────── Crew Completion ────────────────────────────────────────────────╮
│                                                                                                                 │
│  Crew Execution Completed                                                                                       │
│  Name: crew                                                                                                     │
│  ID: d8b9aa65-b394-4caf-922d-4a4b5de60c7e                                                                       │
│                                                                                                                 │
│                                                                                                                 │
╰─────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Query completed in 20.01 seconds
================================================================================
RESPONSE
================================================================================
**FA Cup Third Round Draw: Comprehensive Overview**
**Introduction**
The FA Cup third round draw has set the stage for an exciting series of matches, with notable fixtures featuring both top-tier and lower-league clubs. This stage marks the entry of the 44 Premier League and Championship clubs into the competition, joining 20 lower-league and non-league clubs that have successfully navigated the second round. The matches are scheduled to take place over the weekend of Saturday, 11 January.
**Key Match Summaries**
1. **Manchester United vs Arsenal**
- Holders Manchester United, who recently celebrated their 13th FA Cup victory, have been drawn away to face Arsenal, the record 14-time winners. This fixture promises to be a thrilling encounter, as both teams have a rich history in the competition. The draw was met with audible groans from supporters at Old Trafford, highlighting the challenge that lies ahead for United.
2. **Tamworth vs Tottenham**
- Tamworth, one of only two non-league clubs remaining in the competition, will host Premier League side Tottenham Hotspur. This match is particularly significant for Tamworth, the lowest-ranked team still in the cup, as they earned their spot by defeating League One side Burton Albion in a dramatic penalty shootout. Hosting a top-tier team like Spurs is a remarkable achievement and a testament to their determination and skill.
3. **Other Notable Fixtures**
- Premier League leaders Liverpool will host League Two's Accrington Stanley, while Manchester City will welcome Salford City, owned by the 'Class of 92'. These matches highlight the diverse range of clubs participating in the third round.
- Everton's draw against Peterborough presents a unique scenario where Everton full-back Ashley Young, 39, could potentially face his 18-year-old son Tyler, adding a personal narrative to the fixture.
- Chelsea will play against Morecambe, the bottom club of League Two, and Newcastle United will face Bromley, another fourth-tier team, at St James' Park.
**Conclusion**
The FA Cup third round draw has set up a series of intriguing matches that blend the excitement of top-tier clashes with the charm of underdog stories. With Manchester United facing Arsenal and Tamworth hosting Tottenham, fans can look forward to a weekend filled with high-stakes football and potential upsets. This stage of the competition continues to embody the spirit of the FA Cup, where dreams can become reality, and every team has a chance to shine on the national stage.
================================================================================
DETAILED TASK OUTPUTS
================================================================================
Task: Research and analyze information relevant to: What are the key details about the FA Cup third round ...
----------------------------------------
Output: Document 1:
----------------------------------------
Holders Manchester United have been drawn away to record 14-time winners Arsenal in the FA Cup third round.
Premier League leaders Liverpool will host League Two Accrington Stanley, while Manchester City welcome 'Class of 92'-owned Salford City.
Tamworth, one of only two non-league clubs remaining in the competition, are at home to Tottenham.
The third-round ties will be played over the weekend of Saturday, 11 January.
The third round is when the 44 Premier League and Championship clubs enter the competition, joining the 20 lower-league and non-league clubs who made it through last weekend's second-round ties.
There were audible groans from the watching supporters inside Old Trafford as Manchester United, who beat rivals Manchester City to lift the trophy for a 13th time in May, were confirmed as Arsenal's opponents.
Tamworth, the lowest-ranked team remaining in the cup, will host Ange Postecoglou's Spurs as reward for their penalty shootout win against League One side Burton Albion, while fellow National League outfit Dagenham & Redbridge will go to Championship Millwall.
Everton full-back Ashley Young, 39, could face his 18-year-old son Tyler after the Toffees drew Peterborough at home.
"Wow...dreams might come true," posted former England international Young on X.
Elsewhere, Chelsea host League Two's bottom club Morecambe, whose fellow fourth-tier strugglers Bromley travel to face Newcastle United at St James' Park.
----------------------------------------
Task: Create a comprehensive and well-structured response1. The Technical Writer will begin by reviewing t...
----------------------------------------
Output: **FA Cup Third Round Draw: Comprehensive Overview**
**Introduction**
The FA Cup third round draw has set the stage for an exciting series of matches, with notable fixtures featuring both top-tier and lower-league clubs. This stage marks the entry of the 44 Premier League and Championship clubs into the competition, joining 20 lower-league and non-league clubs that have successfully navigated the second round. The matches are scheduled to take place over the weekend of Saturday, 11 January.
**Key Match Summaries**
1. **Manchester United vs Arsenal**
- Holders Manchester United, who recently celebrated their 13th FA Cup victory, have been drawn away to face Arsenal, the record 14-time winners. This fixture promises to be a thrilling encounter, as both teams have a rich history in the competition. The draw was met with audible groans from supporters at Old Trafford, highlighting the challenge that lies ahead for United.
2. **Tamworth vs Tottenham**
- Tamworth, one of only two non-league clubs remaining in the competition, will host Premier League side Tottenham Hotspur. This match is particularly significant for Tamworth, the lowest-ranked team still in the cup, as they earned their spot by defeating League One side Burton Albion in a dramatic penalty shootout. Hosting a top-tier team like Spurs is a remarkable achievement and a testament to their determination and skill.
3. **Other Notable Fixtures**
- Premier League leaders Liverpool will host League Two's Accrington Stanley, while Manchester City will welcome Salford City, owned by the 'Class of 92'. These matches highlight the diverse range of clubs participating in the third round.
- Everton's draw against Peterborough presents a unique scenario where Everton full-back Ashley Young, 39, could potentially face his 18-year-old son Tyler, adding a personal narrative to the fixture.
- Chelsea will play against Morecambe, the bottom club of League Two, and Newcastle United will face Bromley, another fourth-tier team, at St James' Park.
**Conclusion**
The FA Cup third round draw has set up a series of intriguing matches that blend the excitement of top-tier clashes with the charm of underdog stories. With Manchester United facing Arsenal and Tamworth hosting Tottenham, fans can look forward to a weekend filled with high-stakes football and potential upsets. This stage of the competition continues to embody the spirit of the FA Cup, where dreams can become reality, and every team has a chance to shine on the national stage.
----------------------------------------
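The report layout above (a timing line, the final response, then each task's output under a shortened description) could be produced by a helper along the following lines. This is a minimal sketch, not the notebook's actual code: `TaskResult`, `truncate`, and `build_report` are hypothetical names, and the 100-character cut-off and rule widths are inferred from the printed output. It assumes each per-task result exposes a description and a raw text output.

```python
import time
from dataclasses import dataclass


@dataclass
class TaskResult:
    # Hypothetical stand-in for the per-task result object returned by the crew.
    description: str
    raw: str


def truncate(text: str, limit: int = 100) -> str:
    """Shorten long task descriptions, marking the cut with an ellipsis."""
    return text if len(text) <= limit else text[:limit] + "..."


def build_report(response: str, tasks: list[TaskResult], elapsed: float) -> str:
    """Assemble the timing line, the final response, and the per-task outputs."""
    rule = "=" * 80  # assumed width of the heavy separator
    thin = "-" * 40  # assumed width of the light separator
    lines = [
        f"Query completed in {elapsed:.2f} seconds",
        rule, "RESPONSE", rule,
        response,
        rule, "DETAILED TASK OUTPUTS", rule,
    ]
    for task in tasks:
        lines += [f"Task: {truncate(task.description)}", thin, f"Output: {task.raw}", thin]
    return "\n".join(lines)


start = time.time()
# Placeholder data; a real run would take these values from the crew's result.
tasks = [TaskResult("Research and analyze information relevant to: " + "x" * 80, "Document 1: ...")]
report = build_report("final answer", tasks, time.time() - start)
print(report)
```

Keeping the formatting in one function makes it easy to change the truncation limit or add fields, such as the agent name, without touching the code that runs the crew.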
By following these steps, you've built a RAG system that combines Couchbase's vector storage capabilities with CrewAI's agent-based architecture. The multi-agent approach separates research from writing: the Research Expert retrieves the relevant documents, and the Technical Writer turns them into a structured response, which can lead to higher-quality answers to user queries.
The system demonstrates several key advantages:
Whether you're building a customer support system, a research assistant, or a knowledge management solution, this agent-based RAG approach provides a flexible foundation that can be adapted to various use cases and domains.
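To make the research/writing split concrete, here is a toy, dependency-free sketch of the same two-stage shape. Everything in it is an assumption for illustration: the three-dimensional vectors stand in for real embeddings, `research` stands in for the vector search against Couchbase, and `write` stands in for the LLM call made by the writing agent. None of these names come from CrewAI or Couchbase.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Doc:
    text: str
    vector: list[float]  # toy embedding; a real system would use an embedding model


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def research(query_vector: list[float], corpus: list[Doc], k: int = 2) -> list[Doc]:
    """Research stage: return the k documents most similar to the query."""
    ranked = sorted(corpus, key=lambda d: cosine(query_vector, d.vector), reverse=True)
    return ranked[:k]


def format_documents(docs: list[Doc]) -> str:
    """Render documents in a 'Document N:' layout with a dashed rule under each title."""
    return "\n".join(
        f"Document {i}:\n{'-' * 40}\n{doc.text}" for i, doc in enumerate(docs, start=1)
    )


def write(query: str, context: str) -> str:
    """Writing stage: placeholder for the LLM call that drafts the final answer."""
    return f"Answer to {query!r}, based on:\n{context}"


corpus = [
    Doc("FA Cup third round draw", [0.9, 0.1, 0.0]),
    Doc("Scottish Cup fourth round draw", [0.6, 0.4, 0.0]),
    Doc("Transfer news", [0.0, 0.1, 0.9]),
]
top = research([1.0, 0.0, 0.0], corpus, k=2)
answer = write("FA Cup third round", format_documents(top))
print(answer)
```

Because the two stages only share a block of formatted text, either one can be swapped out (a different retriever, a different model) without changing the other, which is the flexibility the multi-agent design aims for.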