In this guide, we walk through building a semantic search engine that uses Couchbase as the backend database and Amazon Bedrock as both the embedding and language model provider. Semantic search goes beyond keyword matching by interpreting the context and meaning of a query, which makes it well suited to applications that need intelligent information retrieval. The tutorial is beginner-friendly: step-by-step instructions take you from scratch to a working semantic search system built on Couchbase Hyperscale and Composite Vector Indexes. Alternatively, if you want to perform semantic search using the Search Vector Index, please take a look at this.
This tutorial is available as a Jupyter Notebook (.ipynb file) that you can run interactively. You can access the original notebook here.
You can either download the notebook file and run it on Google Colab or run it on your system by setting up the Python environment.
To get started with Couchbase Capella, create an account and use it to deploy a forever free tier operational cluster. This account provides you with an environment where you can explore and learn about Capella with no time constraint.
To know more, please follow the instructions.
Note: To run this tutorial, you need Capella with Couchbase Server version 8.0 or above, because Hyperscale and Composite Vector Indexes are supported only from version 8.0 onward.
When running Couchbase using Capella, the following prerequisites need to be met.
To build our semantic search engine, we need a robust set of tools. The libraries we install handle everything from connecting to databases to performing complex machine learning tasks.
%pip install --no-user --quiet datasets==3.5.0 langchain-couchbase==1.0.1 langchain-aws boto3 python-dotenv==1.1.0
Note: you may need to restart the kernel to use updated packages.
The script starts by importing a series of libraries required for various tasks, including handling JSON, logging, time tracking, Couchbase connections, embedding generation, and dataset loading.
import getpass
import json
import logging
import os
import time
from datetime import timedelta
import boto3
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.exceptions import (CouchbaseException,
InternalServerFailureException)
from couchbase.management.buckets import CreateBucketSettings
from couchbase.options import ClusterOptions
from datasets import load_dataset
from dotenv import load_dotenv
from langchain_aws import BedrockEmbeddings, ChatBedrock
from langchain_core.globals import set_llm_cache
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts.chat import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_couchbase.cache import CouchbaseCache
from langchain_couchbase.vectorstores import CouchbaseQueryVectorStore
from langchain_couchbase.vectorstores import DistanceStrategy
from tqdm import tqdm
Logging is configured to track the progress of the script and capture any errors or warnings.
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s', force=True)
# Suppress excessive logging from libraries
logging.getLogger('httpx').setLevel(logging.WARNING)
logging.getLogger('httpcore').setLevel(logging.WARNING)
logging.getLogger('botocore').setLevel(logging.WARNING)
logging.getLogger('urllib3').setLevel(logging.WARNING)
logging.getLogger('langchain_aws.llms.bedrock').setLevel(logging.WARNING)
logging.getLogger('langchain_aws.embeddings.bedrock').setLevel(logging.WARNING)
logging.getLogger('langchain_aws').setLevel(logging.WARNING)
In this section, we prompt the user for the configuration settings the notebook needs. These include sensitive information such as AWS and database credentials, along with bucket, scope, and collection names. Instead of hardcoding these details into the script, we ask for them at runtime, which keeps the script flexible and keeps secrets out of the code.
The project includes an .env.sample file that lists all the environment variables. To get started:
Create a .env file in the same directory as this notebook
Copy the contents of .env.sample to your .env file
The script also validates that all required inputs are provided, raising an error if any crucial information is missing. This keeps the integration correctly configured without embedding sensitive information in the code.
# Load environment variables from .env file if it exists
load_dotenv(override=True)
# AWS Credentials
AWS_ACCESS_KEY_ID = os.getenv('AWS_ACCESS_KEY_ID') or input('Enter your AWS Access Key ID: ')
AWS_SECRET_ACCESS_KEY = os.getenv('AWS_SECRET_ACCESS_KEY') or getpass.getpass('Enter your AWS Secret Access Key: ')
AWS_REGION = os.getenv('AWS_REGION') or input('Enter your AWS region (default: us-east-1): ') or 'us-east-1'
# Couchbase Settings
CB_HOST = os.getenv('CB_HOST') or input('Enter your Couchbase host (default: couchbase://localhost): ') or 'couchbase://localhost'
CB_USERNAME = os.getenv('CB_USERNAME') or input('Enter your Couchbase username (default: Administrator): ') or 'Administrator'
CB_PASSWORD = os.getenv('CB_PASSWORD') or getpass.getpass('Enter your Couchbase password (default: password): ') or 'password'
CB_BUCKET_NAME = os.getenv('CB_BUCKET_NAME') or input('Enter your Couchbase bucket name (default: query-vector-search-testing): ') or 'query-vector-search-testing'
SCOPE_NAME = os.getenv('SCOPE_NAME') or input('Enter your scope name (default: shared): ') or 'shared'
COLLECTION_NAME = os.getenv('COLLECTION_NAME') or input('Enter your collection name (default: bedrock): ') or 'bedrock'
CACHE_COLLECTION = os.getenv('CACHE_COLLECTION') or input('Enter your cache collection name (default: cache): ') or 'cache'
# Check if required credentials are set
for cred_name, cred_value in {
    'AWS_ACCESS_KEY_ID': AWS_ACCESS_KEY_ID,
    'AWS_SECRET_ACCESS_KEY': AWS_SECRET_ACCESS_KEY,
    'CB_HOST': CB_HOST,
    'CB_USERNAME': CB_USERNAME,
    'CB_PASSWORD': CB_PASSWORD,
    'CB_BUCKET_NAME': CB_BUCKET_NAME
}.items():
    if not cred_value:
        raise ValueError(f"{cred_name} is not set")
Connecting to a Couchbase cluster is the foundation of our project. Couchbase will serve as our primary data store, handling all the storage and retrieval operations required for our semantic search engine. By establishing this connection, we enable our application to interact with the database, allowing us to perform operations such as storing embeddings, querying data, and managing collections. This connection is the gateway through which all data will flow, so ensuring it's set up correctly is paramount.
try:
    auth = PasswordAuthenticator(CB_USERNAME, CB_PASSWORD)
    options = ClusterOptions(auth)
    cluster = Cluster(CB_HOST, options)
    cluster.wait_until_ready(timedelta(seconds=5))
    logging.info("Successfully connected to Couchbase")
except Exception as e:
    raise ConnectionError(f"Failed to connect to Couchbase: {str(e)}")
2026-02-05 14:26:01,268 - INFO - Successfully connected to Couchbase
The setup_collection() function handles creating and configuring the hierarchical data organization in Couchbase:
Bucket Creation:
Scope Management:
Collection Setup:
Additional Tasks:
The function is called twice to set up:
def setup_collection(cluster, bucket_name, scope_name, collection_name):
    try:
        # Check if bucket exists, create if it doesn't
        try:
            bucket = cluster.bucket(bucket_name)
            logging.info(f"Bucket '{bucket_name}' exists.")
        except Exception as e:
            logging.info(f"Bucket '{bucket_name}' does not exist. Creating it...")
            bucket_settings = CreateBucketSettings(
                name=bucket_name,
                bucket_type='couchbase',
                ram_quota_mb=1024,
                flush_enabled=True,
                num_replicas=0
            )
            cluster.buckets().create_bucket(bucket_settings)
            time.sleep(2)
            bucket = cluster.bucket(bucket_name)
            logging.info(f"Bucket '{bucket_name}' created successfully.")
        bucket_manager = bucket.collections()
        # Check if scope exists, create if it doesn't
        scopes = bucket_manager.get_all_scopes()
        scope_exists = any(scope.name == scope_name for scope in scopes)
        if not scope_exists and scope_name != "_default":
            logging.info(f"Scope '{scope_name}' does not exist. Creating it...")
            bucket_manager.create_scope(scope_name)
            logging.info(f"Scope '{scope_name}' created successfully.")
            scopes = bucket_manager.get_all_scopes()
        # Check if collection exists, create if it doesn't
        collection_exists = any(
            scope.name == scope_name and collection_name in [col.name for col in scope.collections]
            for scope in scopes
        )
        if not collection_exists:
            logging.info(f"Collection '{collection_name}' does not exist. Creating it...")
            bucket_manager.create_collection(scope_name, collection_name)
            logging.info(f"Collection '{collection_name}' created successfully.")
        else:
            logging.info(f"Collection '{collection_name}' already exists.")
        # Wait for collection to be ready
        collection = bucket.scope(scope_name).collection(collection_name)
        time.sleep(2)
        # Create primary index for the collection (required for DELETE operations)
        try:
            index_keyspace = f"`{bucket_name}`.`{scope_name}`.`{collection_name}`"
            cluster.query(f"CREATE PRIMARY INDEX IF NOT EXISTS ON {index_keyspace}").execute()
        except Exception:
            pass  # Index may already exist
        # Clear all documents in the collection
        try:
            query = f"DELETE FROM `{bucket_name}`.`{scope_name}`.`{collection_name}`"
            cluster.query(query).execute()
            logging.info(f"Collection '{collection_name}' cleared.")
        except Exception:
            pass  # Collection might be empty or index not ready
        return collection
    except Exception as e:
        raise RuntimeError(f"Error setting up collection: {str(e)}")
setup_collection(cluster, CB_BUCKET_NAME, SCOPE_NAME, COLLECTION_NAME)
setup_collection(cluster, CB_BUCKET_NAME, SCOPE_NAME, CACHE_COLLECTION)
2026-02-05 14:26:01,282 - INFO - Bucket 'vector-search-testing' exists.
2026-02-05 14:26:01,284 - INFO - Collection 'bedrock' already exists.
2026-02-05 14:26:03,388 - INFO - Collection 'bedrock' cleared.
2026-02-05 14:26:03,389 - INFO - Bucket 'vector-search-testing' exists.
2026-02-05 14:26:03,391 - INFO - Collection 'cache' already exists.
2026-02-05 14:26:05,399 - INFO - Collection 'cache' cleared.
<couchbase.collection.Collection at 0x11e897b30>
Embeddings are at the heart of semantic search. They are numerical representations of text that capture the semantic meaning of the words and phrases. Unlike traditional keyword-based search, which looks for exact matches, embeddings allow our search engine to understand the context and nuances of language, enabling it to retrieve documents that are semantically similar to the query, even if they don't contain the exact keywords. By creating embeddings using Amazon Bedrock's Titan Embedding model, we equip our search engine with the ability to understand and process natural language in a way that's much closer to how humans understand language. This step transforms our raw text data into a format that the search engine can use to find and rank relevant documents.
try:
    bedrock_client = boto3.client(
        service_name='bedrock-runtime',
        region_name=AWS_REGION,
        aws_access_key_id=AWS_ACCESS_KEY_ID,
        aws_secret_access_key=AWS_SECRET_ACCESS_KEY
    )
    embeddings = BedrockEmbeddings(
        client=bedrock_client,
        model_id="amazon.titan-embed-text-v2:0"
    )
    logging.info("Successfully created Bedrock embeddings client")
except Exception as e:
    raise ValueError(f"Error creating Bedrock embeddings client: {str(e)}")
2026-02-05 14:26:05,496 - INFO - Successfully created Bedrock embeddings client
With Couchbase 8.0+, you can leverage the power of query-based vector search, which offers significant performance improvements over traditional Search Vector Index approaches for vector-first workloads. Hyperscale and Composite Vector Index search provides high-performance vector similarity search with advanced filtering capabilities and is designed to scale to billions of vectors.
| Feature | Hyperscale/Composite Vector Index | Search Vector Index |
|---|---|---|
| Best For | Vector-first workloads, complex filtering, high QPS performance | Hybrid search and high recall rates |
| Couchbase Version | 8.0.0+ | 7.6+ |
| Filtering | Pre-filtering with WHERE clauses (Composite) or post-filtering (Hyperscale) | Pre-filtering with flexible ordering |
| Scalability | Up to billions of vectors (Hyperscale) | Up to 10 million vectors |
| Performance | Optimized for concurrent operations with low memory footprint | Good for mixed text and vector queries |
Couchbase offers two distinct query-based vector index types, each optimized for different use cases:
In this tutorial, we'll demonstrate creating a Hyperscale index and running vector similarity queries using Hyperscale and Composite Vector Index. Hyperscale is ideal for semantic search scenarios where you want:
The Hyperscale index will provide optimal performance for our Bedrock embedding-based semantic search implementation.
If your use case requires complex filtering with scalar attributes, you may want to consider using a Composite Vector Index instead:
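# Alternative: Create a Composite index for filtered searches
# (run this only after vector_store has been created, later in this tutorial)
from langchain_couchbase.vectorstores import IndexType
vector_store.create_index(
    index_type=IndexType.COMPOSITE,
    index_description="IVF,SQ8",
    distance_metric=DistanceStrategy.COSINE,
    index_name="bedrock_composite_index",
)
Use Composite indexes when: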
Note: Composite indexes enable pre-filtering with scalar attributes, making them ideal for applications where you need to search within specific categories, date ranges, or user-specific data segments.
Before creating our Hyperscale index, it's important to understand the configuration parameters that optimize vector storage and search performance. The index_description parameter controls how Couchbase optimizes vector storage through centroids and quantization.
Format: 'IVF[<centroids>],{PQ|SQ}<settings>'
Centroids (IVF): if the centroid count is omitted (as in IVF,SQ8), Couchbase auto-selects it based on dataset size
Scalar Quantization (SQ): SQ4, SQ6, SQ8 (4, 6, or 8 bits per dimension)
Product Quantization (PQ): PQ<subquantizers>x<bits> (e.g., PQ32x8)
Common examples:
IVF,SQ8 - Auto centroids, 8-bit scalar quantization (good default)
IVF1000,SQ6 - 1000 centroids, 6-bit scalar quantization
IVF,PQ32x8 - Auto centroids, 32 subquantizers with 8 bits
For detailed configuration options, see the Quantization & Centroid Settings.
For more information on query-based vector indexes, see Couchbase Vector Index Documentation.
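To make the index_description format concrete, here is a minimal sketch of a parser for strings such as IVF,SQ8. It is based only on the format string quoted above; the names IndexDescription and parse_index_description are hypothetical helpers for illustration and are not part of the Couchbase SDK or langchain-couchbase.

```python
import re
from typing import NamedTuple, Optional


class IndexDescription(NamedTuple):
    centroids: Optional[int]  # None means the centroid count was omitted (auto-select)
    quantizer: str            # "SQ" or "PQ"
    settings: str             # e.g. "8" for SQ8, "32x8" for PQ32x8


# Assumed pattern, derived from 'IVF[<centroids>],{PQ|SQ}<settings>'.
_PATTERN = re.compile(r"^IVF(\d*),(SQ|PQ)(\w+)$")


def parse_index_description(description: str) -> IndexDescription:
    """Split an index description into centroid count, quantizer, and settings."""
    match = _PATTERN.match(description)
    if match is None:
        raise ValueError(f"Unrecognized index description: {description!r}")
    centroids, quantizer, settings = match.groups()
    return IndexDescription(int(centroids) if centroids else None, quantizer, settings)


for text in ("IVF,SQ8", "IVF1000,SQ6", "IVF,PQ32x8"):
    print(text, "->", parse_index_description(text))
```

A helper like this can be handy for validating a description string before passing it to index creation, though the server remains the authority on which values are accepted.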
In this tutorial, we use IVF,SQ8 which provides:
A vector store is where we keep our embeddings. The query vector store is designed to store embeddings and run similarity searches over them. When a user submits a query, it is converted into an embedding with the same embedding model and compared against the embeddings in the vector store. This lets the engine find documents that are semantically similar to the query even if they don't contain the same words, so retrieval is based on meaning and context rather than on the specific words used.
The vector store requires a distance metric to determine how similarity between vectors is calculated. This is crucial for accurate semantic search results as different distance metrics can yield different similarity rankings. Some of the supported Distance strategies are dot, l2, euclidean, cosine, l2_squared, euclidean_squared. In our implementation we will use cosine which is particularly effective for text embeddings.
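The following sketch shows how a few of these metrics behave on toy two-dimensional vectors. It uses the common textbook definitions (cosine distance taken as 1 minus cosine similarity); the exact scores Couchbase returns may be computed or scaled differently, and the function names here are illustrative only.

```python
import math


def dot(a, b):
    """Dot product: larger means more aligned (and longer) vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_squared(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def l2(a, b):
    """Euclidean distance."""
    return math.sqrt(l2_squared(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity: depends on direction only, not on vector length."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return 1.0 - dot(a, b) / norm


query = [1.0, 0.0]
same_direction = [3.0, 0.0]   # same meaning "direction", different magnitude
orthogonal = [0.0, 2.0]       # unrelated direction

for name, vec in (("same_direction", same_direction), ("orthogonal", orthogonal)):
    print(name, "cosine:", cosine_distance(query, vec), "l2:", l2(query, vec), "dot:", dot(query, vec))
```

Cosine distance treats same_direction as identical to the query despite its larger magnitude, which is one reason it is a popular choice for text embeddings.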
try:
    vector_store = CouchbaseQueryVectorStore(
        cluster=cluster,
        bucket_name=CB_BUCKET_NAME,
        scope_name=SCOPE_NAME,
        collection_name=COLLECTION_NAME,
        embedding=embeddings,
        distance_metric=DistanceStrategy.COSINE
    )
    logging.info("Successfully created vector store")
except Exception as e:
    raise ValueError(f"Failed to create vector store: {str(e)}")
2026-02-05 14:26:05,507 - INFO - Successfully created vector store
To build a search engine, we need data to search through. We use the BBC News dataset from RealTimeData, which provides real-world news articles. This dataset contains news articles from BBC covering various topics and time periods. Loading the dataset is a crucial step because it provides the raw material that our search engine will work with. The quality and diversity of the news articles make it an excellent choice for testing and refining our search engine, ensuring it can handle real-world news content effectively.
The BBC News dataset allows us to work with authentic news articles, enabling us to build and test a search engine that can effectively process and retrieve relevant news content. The dataset is loaded using the Hugging Face datasets library, specifically accessing the "RealTimeData/bbc_news_alltime" dataset with the "2024-12" version.
try:
    news_dataset = load_dataset(
        "RealTimeData/bbc_news_alltime", "2024-12", split="train"
    )
    print(f"Loaded the BBC News dataset with {len(news_dataset)} rows")
    logging.info(f"Successfully loaded the BBC News dataset with {len(news_dataset)} rows.")
except Exception as e:
    raise ValueError(f"Error loading the BBC News dataset: {str(e)}")
2026-02-05 14:26:10,918 - INFO - Successfully loaded the BBC News dataset with 2687 rows.
Loaded the BBC News dataset with 2687 rows
We will use the content of the news articles for our RAG system.
The dataset contains a few duplicate records. We are removing them to avoid duplicate results in the retrieval stage of our RAG system.
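Deduplicating with a set does not preserve the original article order. If order matters for your workload, an order-preserving variant is a small change; the sketch below is an optional alternative, and dedupe_preserving_order is a hypothetical helper name, not something the tutorial itself uses.

```python
def dedupe_preserving_order(texts):
    """Drop empty entries and duplicates, keeping the first occurrence of each text."""
    # dict keys keep insertion order (Python 3.7+), so the first occurrence wins
    return list(dict.fromkeys(text for text in texts if text))


sample = ["a", "b", "a", "", None, "c", "b"]
print(dedupe_preserving_order(sample))  # prints ['a', 'b', 'c']
```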
news_articles = news_dataset["content"]
unique_articles = set()
for article in news_articles:
    if article:
        unique_articles.add(article)
unique_news_articles = list(unique_articles)
print(f"We have {len(unique_news_articles)} unique articles in our database.")
We have 1749 unique articles in our database.
To efficiently handle the large number of articles, we process them in batches of 50 articles at a time. This batch processing approach helps manage memory usage and provides better control over the ingestion process.
We first filter out any articles that exceed 50,000 characters to avoid potential issues with token limits. Then, using the vector store's add_texts method, we add the filtered articles to our vector database. The batch_size parameter controls how many articles are processed in each iteration.
This approach offers several benefits:
We use a batch size of 50 to ensure reliable operation. The optimal batch size depends on many factors including:
Consider measuring performance with your specific workload before adjusting.
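The filter-then-batch idea described above can be sketched in plain Python. This only illustrates the concept; it is not how add_texts is implemented internally, and iter_batches and filter_by_length are hypothetical helper names.

```python
from typing import Iterator, List, Sequence


def filter_by_length(texts: Sequence[str], max_chars: int = 50000) -> List[str]:
    """Keep non-empty texts that do not exceed max_chars characters."""
    return [t for t in texts if t and len(t) <= max_chars]


def iter_batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


# Toy corpus: one empty and one oversized document get filtered out
docs = ["x" * 10, "", "y" * 60000, "z" * 5] + ["w"] * 118
kept = filter_by_length(docs)
batches = list(iter_batches(kept, 50))
print(len(kept), [len(b) for b in batches])  # prints 120 [50, 50, 20]
```

Processing one batch at a time means a failure affects only the current batch, and progress can be logged per batch.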
batch_size = 50
# Filter articles within size limits
articles = [article for article in unique_news_articles if article and len(article) <= 50000]
try:
    vector_store.add_texts(
        texts=articles,
        batch_size=batch_size
    )
    logging.info("Document ingestion completed successfully.")
except Exception as e:
    raise ValueError(f"Failed to save documents to vector store: {str(e)}")
2026-02-05 14:37:04,351 - INFO - Document ingestion completed successfully.
A cache is set up using Couchbase to store intermediate results and frequently accessed data. Caching is important for improving performance, as it reduces the need to repeatedly calculate or retrieve the same data. The cache is linked to a specific collection in Couchbase, and it is used later in the script to store the results of language model queries.
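As a rough illustration of what caching language model results means, the sketch below uses an in-memory dictionary keyed by prompt and model name. It is a toy stand-in under stated assumptions: InMemoryLLMCache, get_or_compute, and fake_llm are made-up names, and the Couchbase-backed cache used in this tutorial stores its entries in a collection rather than in process memory.

```python
class InMemoryLLMCache:
    """Toy cache: returns a stored answer when the same prompt and model repeat."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, prompt, model_name, compute):
        key = (prompt, model_name)
        if key in self._store:
            self.hits += 1          # answer served from the cache, no model call
            return self._store[key]
        self.misses += 1            # first time we see this prompt: call the model
        result = compute(prompt)
        self._store[key] = result
        return result


def fake_llm(prompt):
    """Stand-in for an expensive model call."""
    return prompt.upper()


demo_cache = InMemoryLLMCache()
first = demo_cache.get_or_compute("hello", "demo-model", fake_llm)
second = demo_cache.get_or_compute("hello", "demo-model", fake_llm)
print(first, second, demo_cache.hits, demo_cache.misses)  # prints HELLO HELLO 1 1
```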
try:
    cache = CouchbaseCache(
        cluster=cluster,
        bucket_name=CB_BUCKET_NAME,
        scope_name=SCOPE_NAME,
        collection_name=CACHE_COLLECTION,
    )
    logging.info("Successfully created cache")
    set_llm_cache(cache)
except Exception as e:
    raise ValueError(f"Failed to create cache: {str(e)}")
2026-02-05 14:37:04,361 - INFO - Successfully created cache
Amazon Nova is the next-generation foundation model family from Amazon, replacing the Titan Text models. Nova Pro is the most capable model in the Nova family, designed for complex tasks including:
Key features of Nova Pro:
The model uses a temperature parameter (0-1) to control randomness in responses:
We'll use Nova Pro through Amazon Bedrock's API to process user queries and generate contextually relevant responses based on our vector database content.
Note: The Titan Text models (Premier, Express, Lite) reached Legacy status on January 31, 2025 and will EOL on August 15, 2025. Nova Pro is the recommended replacement for Titan Premier.
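To make the effect of a temperature setting concrete, here is a generic sketch of temperature-scaled softmax over next-token scores. It illustrates the general technique only and does not describe how Nova Pro works internally; softmax_with_temperature and the example logits are made up for illustration.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        # Treat temperature 0 as greedy decoding: all probability on the largest score
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]
for t in (0, 0.5, 1.0):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

With temperature 0, as used in this tutorial, the most likely option is always chosen, which keeps answers as repeatable as possible.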
try:
    llm = ChatBedrock(
        client=bedrock_client,
        model_id="amazon.nova-pro-v1:0",
        model_kwargs={"temperature": 0}
    )
    logging.info("Successfully created Bedrock LLM client with Nova Pro")
except Exception as e:
    logging.error(f"Error creating Bedrock LLM client: {str(e)}. Please check your AWS credentials and Bedrock access.")
    raise
2026-02-05 14:37:04,469 - INFO - Successfully created Bedrock LLM client with Nova Pro
Semantic search goes beyond traditional keyword matching by understanding the meaning and context behind queries. Here's how it works in Couchbase:
Vector Embeddings: Documents and queries are converted into high-dimensional vectors using an embeddings model (in our case, Amazon Bedrock's Titan Embedding model)
Similarity Calculation: When a query is made, Couchbase compares the query vector against stored document vectors using the COSINE distance metric
Result Ranking: Documents are ranked by their vector distance (lower distance = more similar meaning)
Flexible Configuration: Different distance metrics (cosine, euclidean, dot product) and embedding models can be used based on your needs
The similarity_search_with_score method performs this entire process, returning documents along with their similarity scores. This enables you to find semantically related content even when exact keywords don't match.
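The ranking step can be sketched as a brute-force scan over a handful of hand-made vectors. This is a conceptual illustration, assuming cosine distance defined as 1 minus cosine similarity; a real vector index avoids scanning every document, and the vectors, texts, and the top_k helper below are invented for the example.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def top_k(query_vec, docs, k):
    """Return the k (text, distance) pairs closest to the query, nearest first."""
    scored = [(text, cosine_distance(query_vec, vec)) for text, vec in docs]
    scored.sort(key=lambda pair: pair[1])  # lower distance = more similar
    return scored[:k]


# Toy 3-dimensional "embeddings"; real embeddings have hundreds of dimensions or more
docs = [
    ("darts championship report", [0.9, 0.1, 0.0]),
    ("football transfer news", [0.1, 0.9, 0.0]),
    ("weather forecast", [0.0, 0.1, 0.9]),
]
query_vec = [1.0, 0.2, 0.0]

for text, distance in top_k(query_vec, docs, k=2):
    print(f"Distance: {distance:.4f}, Text: {text}")
```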
Now let's see semantic search in action and measure its performance with different optimization strategies.
Now let's measure and compare the performance benefits of different optimization strategies. We'll conduct a comprehensive performance analysis across two phases:
Performance Testing Phases:
Important Context:
query = "What were Luke Littler's key achievements and records in his recent PDC World Championship match?"
try:
    # Perform the semantic search
    start_time = time.time()
    search_results = vector_store.similarity_search_with_score(query, k=10)
    baseline_time = time.time() - start_time
    logging.info(f"Baseline search completed in {baseline_time:.2f} seconds")
    # Display search results
    print(f"\nBaseline Semantic Search Results (completed in {baseline_time:.2f} seconds):")
    print("-" * 80)
    for doc, score in search_results:
        print(f"Distance: {score:.4f}, Text: {doc.page_content[:200]}...")
        print("-" * 80)
except CouchbaseException as e:
    raise RuntimeError(f"Error performing semantic search: {str(e)}")
except Exception as e:
    raise RuntimeError(f"Unexpected error: {str(e)}")
2026-02-05 14:37:05,397 - INFO - Baseline search completed in 0.92 seconds
Baseline Semantic Search Results (completed in 0.92 seconds):
--------------------------------------------------------------------------------
Distance: 0.3512, Text: Luke Littler has risen from 164th to fourth in the rankings in a year
A tearful Luke Littler hit a tournament record 140.91 set average as he started his bid for the PDC World Championship title with...
--------------------------------------------------------------------------------
Distance: 0.4124, Text: The Littler effect - how darts hit the bullseye
Teenager Luke Littler began his bid to win the 2025 PDC World Darts Championship with a second-round win against Ryan Meikle. Here we assess Littler's ...
--------------------------------------------------------------------------------
Distance: 0.4317, Text: Luke Littler is one of six contenders for the 2024 BBC Sports Personality of the Year award.
Here BBC Sport takes a look at the darts player's year in five photos....
--------------------------------------------------------------------------------
Distance: 0.4817, Text: Littler is Young Sports Personality of the Year
This video can not be played To play this video you need to enable JavaScript in your browser.
Darts player Luke Littler has been named BBC Young Spor...
--------------------------------------------------------------------------------
Distance: 0.4823, Text: Wright is the 17th seed at the World Championship
Two-time champion Peter Wright won his opening game at the PDC World Championship, while Ryan Meikle edged out Fallon Sherrock to set up a match agai...
--------------------------------------------------------------------------------
Distance: 0.5302, Text: Luke Littler trends higher than PM on Google in 2024
Luke Littler shot to fame when he became the youngest player to reach the World Darts Championship final in January
Dart sensation Luke Littler h...
--------------------------------------------------------------------------------
Distance: 0.6582, Text: Cross loses as record number of seeds out of Worlds
Rob Cross has suffered three second-round exits in his eight World Championships
Former champion Rob Cross became the latest high-profile casualty...
--------------------------------------------------------------------------------
Distance: 0.6872, Text: Michael van Gerwen has made just one major ranking event final in 2024
Michael van Gerwen enjoyed a comfortable 3-0 victory over English debutant James Hurrell in his opening match of the PDC World D...
--------------------------------------------------------------------------------
Distance: 0.7012, Text: Christian Kist was sealing his first televised nine-darter
Christian Kist hit a nine-darter but lost his PDC World Championship first-round match to Madars Razma. The Dutchman became the first player...
--------------------------------------------------------------------------------
Distance: 0.7059, Text: Gary Anderson was the fifth seed to be beaten on Sunday
Two-time champion Gary Anderson has been dumped out of the PDC World Championship on his 54th birthday by Jeffrey de Graaf. The Scot, winner in...
--------------------------------------------------------------------------------
Now that we understand the different index types and configuration options (covered in the "Understanding Hyperscale and Composite Vector Search" section above), let's create a Hyperscale index for our vector store. The create_index method takes an index type (HYPERSCALE or COMPOSITE) and an index_description parameter for the optimization settings.
from langchain_couchbase.vectorstores import IndexType
try:
    vector_store.create_index(index_type=IndexType.HYPERSCALE, index_name="bedrock_hyperscale_index", index_description="IVF,SQ8")
    logging.info("Hyperscale index created successfully")
except Exception as e:
    if "already exists" in str(e):
        logging.info("Hyperscale index already exists, continuing...")
    else:
        raise
2026-02-05 14:37:10,102 - INFO - Hyperscale index created successfully
Note: To create a COMPOSITE index, the below code can be used. Choose based on your specific use case and query patterns. For this tutorial's news search scenario, either index type would work, but Hyperscale is more efficient for pure semantic search across news articles.
vector_store.create_index(index_type=IndexType.COMPOSITE, index_name="bedrock_composite_index", index_description="IVF,SQ8")
query = "What were Luke Littler's key achievements and records in his recent PDC World Championship match?"
try:
    # Perform the semantic search with Hyperscale index
    start_time = time.time()
    search_results = vector_store.similarity_search_with_score(query, k=10)
    hyperscale_time = time.time() - start_time
    logging.info(f"Hyperscale search completed in {hyperscale_time:.2f} seconds")
    # Display search results
    print(f"\nHyperscale Semantic Search Results (completed in {hyperscale_time:.2f} seconds):")
    print("-" * 80)
    for doc, score in search_results:
        print(f"Distance: {score:.4f}, Text: {doc.page_content[:200]}...")
        print("-" * 80)
except CouchbaseException as e:
    raise RuntimeError(f"Error performing semantic search: {str(e)}")
except Exception as e:
    raise RuntimeError(f"Unexpected error: {str(e)}")
2026-02-05 14:37:10,453 - INFO - Hyperscale search completed in 0.35 seconds
Hyperscale Semantic Search Results (completed in 0.35 seconds):
--------------------------------------------------------------------------------
Distance: 0.3512, Text: Luke Littler has risen from 164th to fourth in the rankings in a year
A tearful Luke Littler hit a tournament record 140.91 set average as he started his bid for the PDC World Championship title with...
--------------------------------------------------------------------------------
Distance: 0.4124, Text: The Littler effect - how darts hit the bullseye
Teenager Luke Littler began his bid to win the 2025 PDC World Darts Championship with a second-round win against Ryan Meikle. Here we assess Littler's ...
--------------------------------------------------------------------------------
Distance: 0.4317, Text: Luke Littler is one of six contenders for the 2024 BBC Sports Personality of the Year award.
Here BBC Sport takes a look at the darts player's year in five photos....
--------------------------------------------------------------------------------
Distance: 0.4817, Text: Littler is Young Sports Personality of the Year
This video can not be played To play this video you need to enable JavaScript in your browser.
Darts player Luke Littler has been named BBC Young Spor...
--------------------------------------------------------------------------------
Distance: 0.4823, Text: Wright is the 17th seed at the World Championship
Two-time champion Peter Wright won his opening game at the PDC World Championship, while Ryan Meikle edged out Fallon Sherrock to set up a match agai...
--------------------------------------------------------------------------------
Distance: 0.5302, Text: Luke Littler trends higher than PM on Google in 2024
Luke Littler shot to fame when he became the youngest player to reach the World Darts Championship final in January
Dart sensation Luke Littler h...
--------------------------------------------------------------------------------
Distance: 0.6582, Text: Cross loses as record number of seeds out of Worlds
Rob Cross has suffered three second-round exits in his eight World Championships
Former champion Rob Cross became the latest high-profile casualty...
--------------------------------------------------------------------------------
Distance: 0.6872, Text: Michael van Gerwen has made just one major ranking event final in 2024
Michael van Gerwen enjoyed a comfortable 3-0 victory over English debutant James Hurrell in his opening match of the PDC World D...
--------------------------------------------------------------------------------
Distance: 0.7012, Text: Christian Kist was sealing his first televised nine-darter
Christian Kist hit a nine-darter but lost his PDC World Championship first-round match to Madars Razma. The Dutchman became the first player...
--------------------------------------------------------------------------------
Distance: 0.7059, Text: Gary Anderson was the fifth seed to be beaten on Sunday
Two-time champion Gary Anderson has been dumped out of the PDC World Championship on his 54th birthday by Jeffrey de Graaf. The Scot, winner in...
--------------------------------------------------------------------------------
Let's analyze the performance improvements we've achieved through different optimization strategies:
print("\n" + "="*60)
print("PERFORMANCE SUMMARY")
print("="*60)
print(f"Baseline Search Time: {baseline_time:.4f} seconds")
if baseline_time and hyperscale_time:
    speedup = baseline_time / hyperscale_time if hyperscale_time > 0 else float('inf')
    percent_improvement = ((baseline_time - hyperscale_time) / baseline_time) * 100 if baseline_time > 0 else 0
    print(f"Hyperscale Search Time: {hyperscale_time:.4f} seconds ({speedup:.2f}x faster, {percent_improvement:.1f}% improvement)")
print("\n" + "-"*60)
print("Index Recommendation:")
print("-"*60)
print("- Hyperscale: Best for pure vector searches, scales to billions of vectors")
print("- Composite: Best for filtered searches combining vector + scalar filters")

============================================================
PERFORMANCE SUMMARY
============================================================
Baseline Search Time: 0.9241 seconds
Hyperscale Search Time: 0.3465 seconds (2.67x faster, 62.5% improvement)
------------------------------------------------------------
Index Recommendation:
------------------------------------------------------------
- Hyperscale: Best for pure vector searches, scales to billions of vectors
- Composite: Best for filtered searches combining vector + scalar filters

Couchbase and LangChain can be seamlessly integrated to create RAG (Retrieval-Augmented Generation) chains, enhancing the process of generating contextually relevant responses. In this setup, Couchbase serves as the vector store, where embeddings of documents are stored. When a query is made, LangChain retrieves the most relevant documents from Couchbase by comparing the query's embedding with the stored document embeddings. These documents, which provide contextual information, are then passed to a generative language model within LangChain.
The language model, equipped with the context from the retrieved documents, generates a response that is both informed and contextually accurate. This integration allows the RAG chain to leverage Couchbase's efficient storage and retrieval capabilities, while LangChain handles the generation of responses based on the context provided by the retrieved documents. Together, they create a powerful system that can deliver highly relevant and accurate answers by combining the strengths of both retrieval and generation.
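To make the retrieval step concrete, here is a minimal, self-contained sketch of what "comparing the query's embedding with the stored document embeddings" amounts to. It is an illustration only, not how Couchbase or LangChain implement it: the three-dimensional vectors, the document texts, and the helper names (`cosine_distance`, `retrieve`, `build_prompt`) are all made up for the example, and cosine distance is assumed as the metric. Real embeddings come from the embedding model and have many more dimensions.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; a smaller value means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def retrieve(query_vec, docs, k=2):
    """Rank (text, vector) pairs by distance to the query and keep the top k."""
    scored = [(cosine_distance(query_vec, vec), text) for text, vec in docs]
    return sorted(scored)[:k]


def build_prompt(question, hits):
    """Join the retrieved texts into the context slot of a RAG-style prompt."""
    context = "\n\n".join(text for _, text in hits)
    return f"Context: {context}\n\nQuestion: {question}"


# Toy data (assumption): hand-written 3-d vectors standing in for embeddings.
toy_docs = [
    ("Littler reaches the darts final", [0.9, 0.1, 0.0]),
    ("Fulham draw at Anfield", [0.1, 0.9, 0.0]),
    ("Van Gerwen wins opener", [0.7, 0.2, 0.1]),
]
toy_query_vec = [1.0, 0.0, 0.0]

hits = retrieve(toy_query_vec, toy_docs, k=2)
prompt = build_prompt("Who reached the final?", hits)
print(prompt)
```

The vector store performs the ranking step against an index rather than by scanning every document, but the shape of the result is the same: a short list of texts ordered by distance, which is then placed into the prompt as context.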
# Create RAG prompt template
rag_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that answers questions based on the provided context."),
    ("human", "Context: {context}\n\nQuestion: {question}")
])
# Create RAG chain
rag_chain = (
    {"context": vector_store.as_retriever(), "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)
logging.info("Successfully created RAG chain")

2026-02-05 14:37:10,463 - INFO - Successfully created RAG chain

start_time = time.time()
# Turn off excessive Logging
logging.basicConfig(level=logging.WARNING, format='%(asctime)s - %(levelname)s - %(message)s', force=True)
try:
    rag_response = rag_chain.invoke(query)
    rag_elapsed_time = time.time() - start_time
    print(f"RAG Response: {rag_response}")
    print(f"RAG response generated in {rag_elapsed_time:.2f} seconds")
except InternalServerFailureException as e:
    if "query request rejected" in str(e):
        print("Error: Search request was rejected due to rate limiting. Please try again later.")
    else:
        print(f"Internal server error occurred: {str(e)}")
except Exception as e:
    print(f"Unexpected error occurred: {str(e)}")

RAG Response: In his recent PDC World Championship match, Luke Littler achieved several key milestones:
1. **Tournament Record Average**: Littler set a new tournament record with an average of 140.91 in one set, and an overall average of 100.85.
2. **Dramatic Win**: He secured a dramatic 3-1 victory over Ryan Meikle, overcoming a challenging start and intense nerves.
3. **Nerve-wracking Performance**: Littler was on the verge of achieving a nine-darter but missed double 12, which would have been a perfect score.
4. **Emotional Reaction**: The 17-year-old was overcome with emotion after the match, cutting short his on-stage interview due to tears.
5. **Rankings and Titles**: Littler has risen from 164th to fourth in the world rankings and has won 10 titles in his debut professional year, including the Premier League and Grand Slam of Darts.
For more detailed information, please refer to the provided context documents.
RAG response generated in 2.39 seconds

Couchbase can be effectively used as a caching mechanism for RAG (Retrieval-Augmented Generation) responses by storing and retrieving precomputed results for specific queries. This approach enhances the system's efficiency and speed, particularly when dealing with repeated or similar queries. When a query is first processed, the RAG chain retrieves relevant documents, generates a response using the language model, and then stores this response in Couchbase, with the query serving as the key.
For subsequent requests with the same query, the system checks Couchbase first. If a cached response is found, it is retrieved directly from Couchbase, bypassing the need to re-run the entire RAG process. This significantly reduces response time because the computationally expensive steps of document retrieval and response generation are skipped. Couchbase's role in this setup is to provide a fast and scalable storage solution for caching these responses, ensuring that frequently asked queries can be answered more quickly and efficiently.
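The check-then-store flow described above can be sketched in a few lines. This is a hedged illustration rather than the tutorial's actual cache: `QueryCache` is a hypothetical in-memory stand-in for a Couchbase-backed cache, `slow_rag_pipeline` is a placeholder for the real retrieval-plus-generation chain, and the raw query string is used as the key, as the description above suggests.

```python
import time


class QueryCache:
    """Hypothetical in-memory stand-in for a Couchbase-backed response cache."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value


def slow_rag_pipeline(query):
    """Placeholder for retrieval + generation; the sleep simulates the cost."""
    time.sleep(0.01)
    return f"answer to: {query}"


def cached_answer(query, cache, pipeline, stats):
    """Return a cached response if present; otherwise compute and store it."""
    cached = cache.get(query)
    if cached is not None:
        stats["hits"] += 1
        return cached
    stats["misses"] += 1
    response = pipeline(query)
    cache.set(query, response)
    return response


cache = QueryCache()
stats = {"hits": 0, "misses": 0}
demo_queries = ["q-a", "q-b", "q-a"]  # the third query repeats the first

answers = [cached_answer(q, cache, slow_rag_pipeline, stats) for q in demo_queries]
print(stats)
```

Only the repeated query skips the pipeline. Keying on the exact query string means that even a small change in wording is a cache miss, which is worth keeping in mind when reading the timings that follow.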
try:
    queries = [
        "What happened in the match between Fullham and Liverpool?",
        "What were Luke Littler's key achievements and records in his recent PDC World Championship match?",
        "What happened in the match between Fullham and Liverpool?",  # Repeated query
    ]
    for i, query in enumerate(queries, 1):
        print(f"\nQuery {i}: {query}")
        start_time = time.time()
        response = rag_chain.invoke(query)
        elapsed_time = time.time() - start_time
        print(f"Response: {response}")
        print(f"Time taken: {elapsed_time:.2f} seconds")
except InternalServerFailureException as e:
    if "query request rejected" in str(e):
        print("Error: Search request was rejected due to rate limiting. Please try again later.")
    else:
        print(f"Internal server error occurred: {str(e)}")
except Exception as e:
    print(f"Unexpected error occurred: {str(e)}")

Query 1: What happened in the match between Fullham and Liverpool?
Response: In the match between Fulham and Liverpool, the two teams delivered strong performances, resulting in a 2-2 draw at Anfield. Both teams were praised for their bravery and fighting spirit. Liverpool played the majority of the game with ten men after a red card, yet still managed to secure a draw, which was described as "impressive" by Liverpool's head coach, Arne Slot. The match was part of a series of strong performances by Fulham under manager Marco Silva, who has been rebuilding his reputation and establishing Fulham as a Premier League force.
Time taken: 1.69 seconds
Query 2: What were Luke Littler's key achievements and records in his recent PDC World Championship match?
Response: In his recent PDC World Championship match, Luke Littler achieved several key milestones:
1. **Tournament Record Average**: Littler set a new tournament record with an average of 140.91 in one set, and an overall average of 100.85.
2. **Dramatic Win**: He secured a dramatic 3-1 victory over Ryan Meikle, overcoming a challenging start and intense nerves.
3. **Nerve-wracking Performance**: Littler was on the verge of achieving a nine-darter but missed double 12, which would have been a perfect score.
4. **Emotional Reaction**: The 17-year-old was overcome with emotion after the match, cutting short his on-stage interview due to tears.
5. **Rankings and Titles**: Littler has risen from 164th to fourth in the world rankings and has won 10 titles in his debut professional year, including the Premier League and Grand Slam of Darts.
For more detailed information, please refer to the provided context documents.
Time taken: 0.35 seconds
Query 3: What happened in the match between Fullham and Liverpool?
Response: In the match between Fulham and Liverpool, the two teams delivered strong performances, resulting in a 2-2 draw at Anfield. Both teams were praised for their bravery and fighting spirit. Liverpool played the majority of the game with ten men after a red card, yet still managed to secure a draw, which was described as "impressive" by Liverpool's head coach, Arne Slot. The match was part of a series of strong performances by Fulham under manager Marco Silva, who has been rebuilding his reputation and establishing Fulham as a Premier League force.
Time taken: 0.40 seconds

You've built a high-performance semantic search engine using Couchbase Hyperscale/Composite indexes with Amazon Bedrock and LangChain. For the Search Vector Index alternative, see the search_based tutorial.