In this guide, we will walk you through building a powerful semantic search engine using Couchbase as the backend database and CrewAI for agent-based RAG operations. CrewAI allows us to create specialized agents that can work together to handle different aspects of the RAG workflow, from document retrieval to response generation. This tutorial uses Couchbase's Hyperscale or Composite Index vector search capabilities, which offer high-performance vector search optimized for large-scale applications. This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch. Alternatively, if you want to perform semantic search using the Search Vector Index, please take a look at this.
This tutorial is available as a Jupyter Notebook (.ipynb file) that you can run interactively. You can access the original notebook here.
You can either run the notebook interactively in your own environment or follow along with the code samples in this guide.
When running Couchbase using Capella, the following prerequisites need to be met:
- A Couchbase Capella account with an operational cluster running the Data, Query, and Index services
- Database access credentials (username and password) with read/write permissions on the bucket
- Your current IP address added to the cluster's allowed IP list
We'll install the following key libraries:
- datasets: For loading and managing our training data
- langchain-couchbase: To integrate Couchbase with LangChain for Hyperscale and Composite vector storage and caching
- langchain-openai: For accessing OpenAI's embedding and chat models
- crewai: To create and orchestrate our AI agents for RAG operations
- python-dotenv: For securely managing environment variables and API keys

These libraries provide the foundation for building a semantic search engine with Hyperscale and Composite vector embeddings, database integration, and agent-based RAG capabilities.
%pip install --quiet datasets==4.1.0 langchain-couchbase==0.5.0 langchain-openai==0.3.33 crewai==0.186.1 python-dotenv==1.1.1

Note: you may need to restart the kernel to use updated packages.

The script starts by importing a series of libraries required for various tasks, including handling JSON, logging, time tracking, Couchbase connections, embedding generation, and dataset loading.
import getpass
import json
import logging
import os
import time
from datetime import timedelta
from uuid import uuid4
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.diagnostics import PingState, ServiceType
from couchbase.exceptions import (InternalServerFailureException,
QueryIndexAlreadyExistsException,
ServiceUnavailableException,
CouchbaseException)
from couchbase.management.buckets import CreateBucketSettings
from couchbase.options import ClusterOptions
from datasets import load_dataset
from dotenv import load_dotenv
from crewai.tools import tool
from langchain_couchbase.vectorstores import CouchbaseQueryVectorStore
from langchain_couchbase.vectorstores import DistanceStrategy, IndexType
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from crewai import Agent, Crew, Process, Task

Logging is configured to track the progress of the script and capture any errors or warnings.
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s [%(levelname)s] %(message)s',
datefmt='%Y-%m-%d %H:%M:%S'
)
# Suppress httpx logging
logging.getLogger('httpx').setLevel(logging.CRITICAL)

In this section, we prompt the user to input essential configuration settings. These include sensitive information such as database credentials and specific configuration names. Instead of hardcoding these details into the script, we request them at runtime, ensuring flexibility and security.
The script uses environment variables to store sensitive information, enhancing the overall security and maintainability of your code by avoiding hardcoded values.
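For reference, a .env file for this tutorial might look like the following. Every value here is a placeholder matching the defaults used below; substitute your own credentials:

```
# .env — placeholder values for this tutorial; replace with your own
OPENAI_API_KEY=sk-...
CB_HOST=couchbase://localhost
CB_USERNAME=Administrator
CB_PASSWORD=password
CB_BUCKET_NAME=vector-search-testing
SCOPE_NAME=shared
COLLECTION_NAME=crew
```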
# Load environment variables
load_dotenv("./.env")
# Configuration
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY') or input("Enter your OpenAI API key: ")
if not OPENAI_API_KEY:
raise ValueError("OPENAI_API_KEY is not set")
CB_HOST = os.getenv('CB_HOST') or 'couchbase://localhost'
CB_USERNAME = os.getenv('CB_USERNAME') or 'Administrator'
CB_PASSWORD = os.getenv('CB_PASSWORD') or 'password'
CB_BUCKET_NAME = os.getenv('CB_BUCKET_NAME') or 'vector-search-testing'
SCOPE_NAME = os.getenv('SCOPE_NAME') or 'shared'
COLLECTION_NAME = os.getenv('COLLECTION_NAME') or 'crew'
print("Configuration loaded successfully")

Configuration loaded successfully

Connecting to a Couchbase cluster is the foundation of our project. Couchbase will serve as our primary data store, handling all the storage and retrieval operations required for our semantic search engine. By establishing this connection, we enable our application to interact with the database, allowing us to perform operations such as storing embeddings, querying data, and managing collections. This connection is the gateway through which all data will flow, so ensuring it's set up correctly is paramount.
# Connect to Couchbase
try:
auth = PasswordAuthenticator(CB_USERNAME, CB_PASSWORD)
options = ClusterOptions(auth)
cluster = Cluster(CB_HOST, options)
cluster.wait_until_ready(timedelta(seconds=5))
print("Successfully connected to Couchbase")
except Exception as e:
print(f"Failed to connect to Couchbase: {str(e)}")
    raise

Successfully connected to Couchbase

Create and configure a Couchbase bucket, scope, and collection for storing our vector data.
The setup_collection function performs the following steps:
- Bucket creation: checks whether the bucket exists and, if not, creates it with a 1024 MB RAM quota, no replicas, and flushing enabled.
- Scope management: creates the scope if it does not exist (skipping the _default scope).
- Collection setup: creates the collection within the scope if it is missing, or skips creation if it already exists.
- Additional tasks: waits briefly for the collection to become ready, then clears any existing documents so each run starts from a clean state.

The function is then called to set up our collection:
def setup_collection(cluster, bucket_name, scope_name, collection_name):
try:
# Check if bucket exists, create if it doesn't
try:
bucket = cluster.bucket(bucket_name)
logging.info(f"Bucket '{bucket_name}' exists.")
except Exception as e:
logging.info(f"Bucket '{bucket_name}' does not exist. Creating it...")
bucket_settings = CreateBucketSettings(
name=bucket_name,
bucket_type='couchbase',
ram_quota_mb=1024,
flush_enabled=True,
num_replicas=0
)
cluster.buckets().create_bucket(bucket_settings)
time.sleep(2) # Wait for bucket creation to complete and become available
bucket = cluster.bucket(bucket_name)
logging.info(f"Bucket '{bucket_name}' created successfully.")
bucket_manager = bucket.collections()
# Check if scope exists, create if it doesn't
scopes = bucket_manager.get_all_scopes()
scope_exists = any(scope.name == scope_name for scope in scopes)
if not scope_exists and scope_name != "_default":
logging.info(f"Scope '{scope_name}' does not exist. Creating it...")
bucket_manager.create_scope(scope_name)
logging.info(f"Scope '{scope_name}' created successfully.")
# Check if collection exists, create if it doesn't
collections = bucket_manager.get_all_scopes()
collection_exists = any(
scope.name == scope_name and collection_name in [col.name for col in scope.collections]
for scope in collections
)
if not collection_exists:
logging.info(f"Collection '{collection_name}' does not exist. Creating it...")
bucket_manager.create_collection(scope_name, collection_name)
logging.info(f"Collection '{collection_name}' created successfully.")
else:
logging.info(f"Collection '{collection_name}' already exists. Skipping creation.")
# Wait for collection to be ready
collection = bucket.scope(scope_name).collection(collection_name)
time.sleep(2) # Give the collection time to be ready for queries
# Clear all documents in the collection
try:
query = f"DELETE FROM `{bucket_name}`.`{scope_name}`.`{collection_name}`"
cluster.query(query).execute()
logging.info("All documents cleared from the collection.")
except Exception as e:
logging.warning(f"Error while clearing documents: {str(e)}. The collection might be empty.")
return collection
except Exception as e:
raise RuntimeError(f"Error setting up collection: {str(e)}")
setup_collection(cluster, CB_BUCKET_NAME, SCOPE_NAME, COLLECTION_NAME)

2025-10-06 10:17:53 [INFO] Bucket 'vector-search-testing' exists.
2025-10-06 10:17:53 [INFO] Collection 'crew' already exists. Skipping creation.
2025-10-06 10:17:55 [INFO] All documents cleared from the collection.
<couchbase.collection.Collection at 0x307407a10>

Semantic search with Hyperscale and Composite Vector Indexes requires creating indexes optimized for vector operations. Unlike Search Vector Index-based vector search, Hyperscale and Composite vector indexes offer two distinct types optimized for different use cases. Learn more about these index types in the Couchbase Vector Index Documentation.
The index_description parameter controls how Couchbase optimizes vector storage and search through centroids and quantization:
Format: 'IVF[<centroids>],{PQ|SQ}<settings>'
- Centroids (IVF, Inverted File): the vectors are partitioned into clusters, and a search probes only the clusters closest to the query. Omitting the count (as in 'IVF,SQ8') lets Couchbase choose an appropriate number of centroids based on the dataset size; 'IVF1024,SQ8' pins it at 1024.
- Quantization options: SQ<bits> applies scalar quantization (SQ8 stores each dimension in 8 bits), while PQ<subquantizers>x<bits> applies product quantization for higher compression at some cost in accuracy.
- Common examples: 'IVF,SQ8' (a sensible default), 'IVF1024,SQ8', and 'IVF,PQ32x8'.
For detailed configuration options, see the Quantization & Centroid Settings.
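As a quick illustration, the sketch below checks a few description strings against that pattern. The regular expression is our own simplification for this tutorial, not something provided by the Couchbase SDK, which performs its own validation:

```python
import re

# Simplified check for the 'IVF[<centroids>],{PQ|SQ}<settings>' format
# (illustrative only; this is not the SDK's validation logic).
DESCRIPTION_RE = re.compile(r"^IVF\d*,(SQ\d+|PQ\d+x\d+)$")

examples = [
    "IVF,SQ8",      # auto-selected centroids, 8-bit scalar quantization
    "IVF1024,SQ8",  # 1024 centroids, 8-bit scalar quantization
    "IVF,PQ32x8",   # auto centroids, product quantization (32 subquantizers x 8 bits)
]

for description in examples:
    assert DESCRIPTION_RE.match(description), f"unexpected format: {description}"
print("All example descriptions are well-formed")
```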
For more information on Hyperscale and Composite vector indexes, see Couchbase Vector Index Documentation.
# Hyperscale and Composite Vector Index Configuration
# Unlike Search Vector Index, Hyperscale and Composite vector indexes are created programmatically through the vector store
# We'll configure the parameters that will be used for index creation
# Vector configuration
DISTANCE_STRATEGY = DistanceStrategy.COSINE # Cosine similarity
INDEX_TYPE = IndexType.HYPERSCALE # Using HYPERSCALE for high-performance vector search
INDEX_DESCRIPTION = "IVF,SQ8" # Auto-selected centroids with 8-bit scalar quantization
# To create a Composite Index instead, use the following:
# INDEX_TYPE = IndexType.COMPOSITE # Combines vector search with scalar filtering
print("Hyperscale and Composite vector index configuration prepared")

Hyperscale and Composite vector index configuration prepared

If your use case requires complex filtering with scalar attributes, you can create a Composite index instead by changing the configuration:
# Alternative configuration for Composite index
INDEX_TYPE = IndexType.COMPOSITE # Instead of IndexType.HYPERSCALE
INDEX_DESCRIPTION = "IVF,SQ8" # Same quantization settings
DISTANCE_STRATEGY = DistanceStrategy.COSINE # Same distance metric
# The rest of the setup remains identical

Use Composite indexes when:
- Your queries combine vector similarity with filters on scalar fields (for example, dates, categories, or authors)
- You need pre-filtering on metadata to run before the vector comparison
Note: The index creation process is identical - just change the INDEX_TYPE. Composite indexes enable pre-filtering with scalar attributes, making them ideal for applications requiring complex query patterns with metadata filtering.
This section initializes two key OpenAI components needed for our RAG system:
- OpenAI Embeddings: the text-embedding-3-small model converts documents and queries into vector embeddings for semantic similarity search.
- ChatOpenAI Language Model: the gpt-4o chat model, with temperature set to 0.2 for focused, consistent responses, generates the final answers.
Both components require a valid OpenAI API key (OPENAI_API_KEY) for authentication. In the CrewAI framework, the LLM acts as the "brain" for each agent, allowing them to interpret tasks, retrieve relevant information via the RAG system, and generate appropriate outputs based on their specialized roles and expertise.
# Initialize OpenAI components
embeddings = OpenAIEmbeddings(
openai_api_key=OPENAI_API_KEY,
model="text-embedding-3-small"
)
llm = ChatOpenAI(
openai_api_key=OPENAI_API_KEY,
model="gpt-4o",
temperature=0.2
)
print("OpenAI components initialized")

OpenAI components initialized

Set up the Hyperscale vector store where we'll store document embeddings for high-performance semantic search.
# Setup Hyperscale vector store with OpenAI embeddings
try:
vector_store = CouchbaseQueryVectorStore(
cluster=cluster,
bucket_name=CB_BUCKET_NAME,
scope_name=SCOPE_NAME,
collection_name=COLLECTION_NAME,
embedding=embeddings,
distance_metric=DISTANCE_STRATEGY
)
print("Hyperscale Vector store initialized successfully")
logging.info("Hyperscale Vector store setup completed")
except Exception as e:
logging.error(f"Failed to initialize Hyperscale vector store: {str(e)}")
    raise RuntimeError(f"Hyperscale Vector store initialization failed: {str(e)}")

2025-10-06 10:18:05 [INFO] Hyperscale Vector store setup completed
Hyperscale Vector store initialized successfully

To build a search engine, we need data to search through. We use the BBC News dataset from RealTimeData, which provides real-world news articles. This dataset contains news articles from BBC covering various topics and time periods. Loading the dataset is a crucial step because it provides the raw material that our search engine will work with. The quality and diversity of the news articles make it an excellent choice for testing and refining our search engine, ensuring it can handle real-world news content effectively.
The BBC News dataset allows us to work with authentic news articles, enabling us to build and test a search engine that can effectively process and retrieve relevant news content. The dataset is loaded using the Hugging Face datasets library, specifically accessing the "RealTimeData/bbc_news_alltime" dataset with the "2024-12" version.
try:
news_dataset = load_dataset(
"RealTimeData/bbc_news_alltime", "2024-12", split="train"
)
print(f"Loaded the BBC News dataset with {len(news_dataset)} rows")
logging.info(f"Successfully loaded the BBC News dataset with {len(news_dataset)} rows.")
except Exception as e:
    raise ValueError(f"Error loading the BBC News dataset: {str(e)}")

2025-10-06 10:18:13 [INFO] Successfully loaded the BBC News dataset with 2687 rows.
Loaded the BBC News dataset with 2687 rows

Remove duplicate articles for cleaner search results.
news_articles = news_dataset["content"]
unique_articles = set()
for article in news_articles:
if article:
unique_articles.add(article)
unique_news_articles = list(unique_articles)
print(f"We have {len(unique_news_articles)} unique articles in our database.")

We have 1749 unique articles in our database.

To efficiently handle the large number of articles, we process them in batches. This batch processing approach helps manage memory usage and provides better control over the ingestion process.
We first filter out any articles that exceed 50,000 characters to avoid potential issues with token limits. Then, using the vector store's add_texts method, we add the filtered articles to our vector database. The batch_size parameter controls how many articles are processed in each iteration.
This approach offers several benefits:
- Memory usage stays bounded, since only one batch of articles is embedded at a time
- Transient failures affect only the current batch rather than the whole ingestion
- Neither the embedding API nor the database is overwhelmed by one very large request
We use a conservative batch size of 50 to ensure reliable operation. The optimal batch size depends on many factors including document sizes, available system resources, network conditions, and concurrent workload.
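Under the hood, batching with batch_size amounts to slicing the list into fixed-size chunks. The loop below is an illustrative sketch of that behavior, not the library's actual implementation:

```python
def batched(items, size):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# With 7 documents and a batch size of 3, we get batches of 3, 3, and 1.
sample_docs = [f"article-{i}" for i in range(7)]
print([len(batch) for batch in batched(sample_docs, 3)])  # → [3, 3, 1]
```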
batch_size = 50
# Automatic Batch Processing
articles = [article for article in unique_news_articles if article and len(article) <= 50000]
try:
vector_store.add_texts(
texts=articles,
batch_size=batch_size
)
logging.info("Document ingestion completed successfully.")
except Exception as e:
    raise ValueError(f"Failed to save documents to vector store: {str(e)}")

2025-10-06 10:19:43 [INFO] Document ingestion completed successfully.

Now let's demonstrate the performance benefits of Hyperscale vector index optimization by testing pure vector search performance. We'll compare three optimization levels:
- Baseline: vector search before the Hyperscale index is created
- Hyperscale-optimized: the same search after creating the IVF,SQ8 index
- Cached: a repeated query that can benefit from caching
Important: This testing focuses on pure vector search performance, isolating the Hyperscale vector index improvements from other workflow overhead.
import time
# Create Hyperscale vector retriever optimized for high-performance searches
retriever = vector_store.as_retriever(
search_type="similarity",
search_kwargs={"k": 4} # Return top 4 most similar documents
)
def test_vector_search_performance(query_text, label="Vector Search"):
"""Test pure vector search performance and return timing metrics"""
print(f"\n[{label}] Testing vector search performance")
print(f"[{label}] Query: '{query_text}'")
start_time = time.time()
try:
# Perform vector search using the retriever
docs = retriever.invoke(query_text)
end_time = time.time()
search_time = end_time - start_time
print(f"[{label}] Vector search completed in {search_time:.4f} seconds")
print(f"[{label}] Found {len(docs)} relevant documents")
# Show a preview of the first result
if docs:
preview = docs[0].page_content[:100] + "..." if len(docs[0].page_content) > 100 else docs[0].page_content
print(f"[{label}] Top result preview: {preview}")
return search_time
except Exception as e:
print(f"[{label}] Vector search failed: {str(e)}")
        return None

Test pure vector search performance without Hyperscale vector index optimization.
# Test baseline vector search performance without Hyperscale vector index
test_query = "What are the latest developments in football transfers?"
print("Testing baseline vector search performance without Hyperscale vector index optimization...")
baseline_time = test_vector_search_performance(test_query, "Baseline Search")
print(f"\nBaseline vector search time (without Hyperscale vector index): {baseline_time:.4f} seconds\n")

Testing baseline vector search performance without Hyperscale vector index optimization...
[Baseline Search] Testing vector search performance
[Baseline Search] Query: 'What are the latest developments in football transfers?'
[Baseline Search] Vector search completed in 1.3999 seconds
[Baseline Search] Found 4 relevant documents
[Baseline Search] Top result preview: The latest updates and analysis from the BBC.
Baseline vector search time (without Hyperscale vector index): 1.3999 seconds

Now let's create a Hyperscale vector index to enable high-performance vector searches. The index creation is done programmatically through the vector store, which will optimize the index settings based on our data and requirements.
# Create Hyperscale Vector Index for high-performance searches
print("Creating Hyperscale vector index...")
try:
# Create a Hyperscale index optimized for pure vector searches
vector_store.create_index(
index_type=INDEX_TYPE, # Hyperscale index type
index_description=INDEX_DESCRIPTION # IVF,SQ8 for optimized performance
)
print(f"Hyperscale Vector index created successfully")
logging.info(f"Hyperscale index created with description '{INDEX_DESCRIPTION}'")
# Wait a moment for index to be available
print("Waiting for index to become available...")
time.sleep(5)
except Exception as e:
# Index might already exist, which is fine
if "already exists" in str(e).lower():
print(f"Hyperscale Vector index already exists, proceeding...")
logging.info(f"Index already exists")
else:
logging.error(f"Failed to create Hyperscale vector index: {str(e)}")
        raise RuntimeError(f"Hyperscale vector index creation failed: {str(e)}")

Creating Hyperscale vector index...
2025-10-06 10:20:15 [INFO] Hyperscale index created with description 'IVF,SQ8'
Hyperscale Vector index created successfully
Waiting for index to become available...

Test the same vector search with Hyperscale vector index optimization.
# Test vector search performance with Hyperscale vector index
print("Testing vector search performance with Hyperscale vector index optimization...")
hyperscale_search_time = test_vector_search_performance(test_query, "Hyperscale-Optimized Search")

Testing vector search performance with Hyperscale vector index optimization...
[Hyperscale-Optimized Search] Testing vector search performance
[Hyperscale-Optimized Search] Query: 'What are the latest developments in football transfers?'
[Hyperscale-Optimized Search] Vector search completed in 0.5885 seconds
[Hyperscale-Optimized Search] Found 4 relevant documents
[Hyperscale-Optimized Search] Top result preview: Four key areas for Everton's new owners to address
Everton fans last saw silverware in 1995 when th...

Now let's demonstrate how caching can improve performance for repeated queries. Note: caching benefits apply to both baseline and Hyperscale-optimized searches.
# Test cache benefits with a different query to avoid interference
cache_test_query = "What happened in the latest Premier League matches?"
print("Testing cache benefits with vector search...")
print("First execution (cache miss):")
cache_time_1 = test_vector_search_performance(cache_test_query, "Cache Test - First Run")
print("\nSecond execution (cache hit - should be faster):")
cache_time_2 = test_vector_search_performance(cache_test_query, "Cache Test - Second Run")

Testing cache benefits with vector search...
First execution (cache miss):
[Cache Test - First Run] Testing vector search performance
[Cache Test - First Run] Query: 'What happened in the latest Premier League matches?'
[Cache Test - First Run] Vector search completed in 0.6450 seconds
[Cache Test - First Run] Found 4 relevant documents
[Cache Test - First Run] Top result preview: Who has made Troy's Premier League team of the week?
After every round of Premier League matches th...
Second execution (cache hit - should be faster):
[Cache Test - Second Run] Testing vector search performance
[Cache Test - Second Run] Query: 'What happened in the latest Premier League matches?'
[Cache Test - Second Run] Vector search completed in 0.4306 seconds
[Cache Test - Second Run] Found 4 relevant documents
[Cache Test - Second Run] Top result preview: Who has made Troy's Premier League team of the week?
After every round of Premier League matches th...

Let's analyze the vector search performance improvements across all optimization levels:
print("\n" + "="*80)
print("VECTOR SEARCH PERFORMANCE OPTIMIZATION SUMMARY")
print("="*80)
print(f"Phase 1 - Baseline Search (No Hyperscale): {baseline_time:.4f} seconds")
print(f"Phase 2 - Hyperscale-Optimized Search: {hyperscale_search_time:.4f} seconds")
if cache_time_1 and cache_time_2:
print(f"Phase 3 - Cache Benefits:")
print(f" First execution (cache miss): {cache_time_1:.4f} seconds")
print(f" Second execution (cache hit): {cache_time_2:.4f} seconds")
print("\n" + "-"*80)
print("VECTOR SEARCH OPTIMIZATION IMPACT:")
print("-"*80)
# Hyperscale improvement analysis
if baseline_time and hyperscale_search_time:
speedup = baseline_time / hyperscale_search_time if hyperscale_search_time > 0 else float('inf')
time_saved = baseline_time - hyperscale_search_time
percent_improvement = (time_saved / baseline_time) * 100
print(f"Hyperscale Index Benefit: {speedup:.2f}x faster ({percent_improvement:.1f}% improvement)")
# Cache improvement analysis
if cache_time_1 and cache_time_2 and cache_time_2 < cache_time_1:
cache_speedup = cache_time_1 / cache_time_2
cache_improvement = ((cache_time_1 - cache_time_2) / cache_time_1) * 100
print(f"Cache Benefit: {cache_speedup:.2f}x faster ({cache_improvement:.1f}% improvement)")
else:
print(f"Cache Benefit: Variable (depends on query complexity and caching mechanism)")
print(f"\nKey Insights for Vector Search Performance:")
print(f"• Hyperscale indexes provide significant performance improvements for vector similarity search")
print(f"• Performance gains are most dramatic for complex semantic queries")
print(f"• Hyperscale optimization is particularly effective for high-dimensional embeddings")
print(f"• Combined with proper quantization (SQ8), Hyperscale vector indexes deliver production-ready performance")
print(f"• These performance improvements directly benefit any application using the vector store")

================================================================================
VECTOR SEARCH PERFORMANCE OPTIMIZATION SUMMARY
================================================================================
Phase 1 - Baseline Search (No Hyperscale): 1.3999 seconds
Phase 2 - Hyperscale-Optimized Search: 0.5885 seconds
Phase 3 - Cache Benefits:
First execution (cache miss): 0.6450 seconds
Second execution (cache hit): 0.4306 seconds
--------------------------------------------------------------------------------
VECTOR SEARCH OPTIMIZATION IMPACT:
--------------------------------------------------------------------------------
Hyperscale Index Benefit: 2.38x faster (58.0% improvement)
Cache Benefit: 1.50x faster (33.2% improvement)
Key Insights for Vector Search Performance:
• Hyperscale indexes provide significant performance improvements for vector similarity search
• Performance gains are most dramatic for complex semantic queries
• Hyperscale optimization is particularly effective for high-dimensional embeddings
• Combined with proper quantization (SQ8), Hyperscale delivers production-ready performance
• These performance improvements directly benefit any application using the vector store

Now that we've optimized our vector search performance, let's build a sophisticated agent-based RAG system using CrewAI. CrewAI enables us to create specialized AI agents that collaborate to handle different aspects of the RAG workflow:
- Research Expert: uses the Hyperscale vector search tool to retrieve and analyze relevant documents
- Technical Writer: turns the research findings into a clear, well-structured response
This multi-agent approach produces higher-quality responses than single-agent systems by separating research and writing expertise, while benefiting from the Hyperscale vector index performance improvements we just demonstrated.
# Define the Hyperscale vector search tool using the @tool decorator
@tool("hyperscale_vector_search")
def search_tool(query: str) -> str:
"""Search for relevant documents using Hyperscale vector similarity.
Input should be a simple text query string.
Returns a list of relevant document contents from Hyperscale vector search.
Use this tool to find detailed information about topics using high-performance Hyperscale indexes."""
# Invoke the Hyperscale vector retriever (now optimized with HYPERSCALE index)
docs = retriever.invoke(query)
# Format the results with distance information
formatted_docs = "\n\n".join([
f"Document {i+1}:\n{'-'*40}\n{doc.page_content}"
for i, doc in enumerate(docs)
])
    return formatted_docs

# Create research agent
researcher = Agent(
role='Research Expert',
goal='Find and analyze the most relevant documents to answer user queries accurately',
backstory="""You are an expert researcher with deep knowledge in information retrieval
and analysis. Your expertise lies in finding, evaluating, and synthesizing information
from various sources. You have a keen eye for detail and can identify key insights
from complex documents. You always verify information across multiple sources and
provide comprehensive, accurate analyses.""",
tools=[search_tool],
llm=llm,
verbose=False,
memory=True,
allow_delegation=False
)
# Create writer agent
writer = Agent(
role='Technical Writer',
goal='Generate clear, accurate, and well-structured responses based on research findings',
backstory="""You are a skilled technical writer with expertise in making complex
information accessible and engaging. You excel at organizing information logically,
explaining technical concepts clearly, and creating well-structured documents. You
ensure all information is properly cited, accurate, and presented in a user-friendly
manner. You have a talent for maintaining the reader's interest while conveying
detailed technical information.""",
llm=llm,
verbose=False,
memory=True,
allow_delegation=False
)
print("CrewAI agents created successfully with optimized Hyperscale vector search")

CrewAI agents created successfully with optimized Hyperscale vector search

The complete optimized RAG process:
1. The user query becomes a research task for the Research Expert agent.
2. The researcher calls the Hyperscale vector search tool to retrieve relevant articles.
3. The Technical Writer receives the research findings as context and composes the final response.
4. The crew executes the two tasks sequentially, with planning and caching enabled.
Key Benefit: The vector search performance improvements we demonstrated directly enhance the agent workflow efficiency.
Now let's demonstrate the complete optimized agent-based RAG system in action, benefiting from the Hyperscale vector index performance improvements we validated earlier.
def process_interactive_query(query, researcher, writer):
"""Run complete RAG workflow with CrewAI agents using optimized Hyperscale vector search"""
print(f"\nProcessing Query: {query}")
print("=" * 80)
# Create tasks
research_task = Task(
description=f"Research and analyze information relevant to: {query}",
agent=researcher,
expected_output="A detailed analysis with key findings"
)
writing_task = Task(
description="Create a comprehensive response",
agent=writer,
expected_output="A clear, well-structured answer",
context=[research_task]
)
# Execute crew
crew = Crew(
agents=[researcher, writer],
tasks=[research_task, writing_task],
process=Process.sequential,
verbose=True,
cache=True,
planning=True
)
try:
start_time = time.time()
result = crew.kickoff()
elapsed_time = time.time() - start_time
print(f"\nCompleted in {elapsed_time:.2f} seconds")
print("=" * 80)
print("RESPONSE")
print("=" * 80)
print(result)
return elapsed_time
except Exception as e:
print(f"Error: {str(e)}")
        return None

# Disable logging for cleaner output
logging.disable(logging.CRITICAL)
# Run demo with a sample query
demo_query = "What are the key details about the FA Cup third round draw?"
final_time = process_interactive_query(demo_query, researcher, writer)
if final_time:
    print(f"\n\n✅ CrewAI agent demo completed successfully in {final_time:.2f} seconds")

You have successfully built a powerful agent-based RAG system that combines Couchbase's high-performance Hyperscale and Composite vector storage capabilities with CrewAI's multi-agent architecture. This tutorial demonstrated the complete pipeline from data ingestion to intelligent response generation, with real performance benchmarks showing the dramatic improvements Hyperscale vector indexing provides.