In this guide, we will walk you through building a powerful semantic search engine using Couchbase as the backend database, OpenAI as the embedding and LLM provider, and PydanticAI as an agent orchestrator. Semantic search goes beyond simple keyword matching by understanding the context and meaning behind the words in a query, making it an essential tool for applications that require intelligent information retrieval. This tutorial is designed to be beginner-friendly, with clear, step-by-step instructions that will equip you with the knowledge to create a fully functional semantic search system from scratch.
This tutorial is available as a Jupyter Notebook (.ipynb file) that you can run interactively.
You can either download the notebook file and run it on Google Colab or run it on your system by setting up the Python environment.
To get started with Couchbase Capella, create an account and use it to deploy a forever free tier operational cluster. This account provides you with an environment where you can explore and learn about Capella with no time constraint.
To know more, please follow the instructions.
When running Couchbase using Capella, the following prerequisites need to be met.
To build our semantic search engine, we need a robust set of tools. The libraries we install handle everything from connecting to databases to performing complex machine learning tasks. Each library has a specific role: Couchbase libraries manage database operations, LangChain handles AI model integrations, and OpenAI provides advanced AI models for generating embeddings and understanding natural language. By setting up these libraries, we ensure our environment is equipped to handle the data-intensive and computationally complex tasks required for semantic search.
%pip install --quiet -U datasets==3.5.0 langchain-couchbase==0.3.0 langchain-openai==0.3.13 python-dotenv==1.1.0 pydantic-ai==0.1.1 ipywidgets==8.1.6
Note: you may need to restart the kernel to use updated packages.
The script starts by importing a series of libraries required for various tasks, including handling JSON, logging, time tracking, Couchbase connections, embedding generation, and dataset loading. These libraries provide essential functions for working with data, managing database connections, and processing machine learning models.
import getpass
import json
import logging
import os
import time
from uuid import uuid4
from datetime import timedelta
from couchbase.auth import PasswordAuthenticator
from couchbase.cluster import Cluster
from couchbase.exceptions import (InternalServerFailureException,
QueryIndexAlreadyExistsException)
from couchbase.management.buckets import CreateBucketSettings
from couchbase.management.search import SearchIndex
from couchbase.options import ClusterOptions
from datasets import load_dataset
from dotenv import load_dotenv
from langchain_couchbase.vectorstores import CouchbaseSearchVectorStore
from langchain_openai import OpenAIEmbeddings
from tqdm import tqdm
from dataclasses import dataclass
from pydantic_ai import Agent, RunContext
Logging is configured to track the progress of the script and capture any errors or warnings. This is crucial for debugging and understanding the flow of execution. The logging output includes timestamps, log levels (e.g., INFO, ERROR), and messages that describe what is happening in the script.
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s', force=True)
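To see what this format string produces, the sketch below sends two records through the same format into an in-memory buffer. The logger name format_demo and the buffer are illustrative assumptions and are not part of the notebook.

```python
import io
import logging

# Send records through the same format string into an in-memory buffer,
# so the layout can be inspected without touching the root logger.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter('%(asctime)s - %(levelname)s - %(message)s'))

demo_logger = logging.getLogger("format_demo")  # hypothetical logger name
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the demo out of the notebook's root logger
demo_logger.addHandler(handler)

demo_logger.debug("hidden: below the INFO threshold")
demo_logger.info("Successfully connected to Couchbase")

formatted = buffer.getvalue()
print(formatted, end="")
```

Only the INFO record appears, because DEBUG sits below the configured level.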
In this section, we prompt the user to input essential configuration settings needed. These settings include sensitive information like API keys, database credentials, and specific configuration names. Instead of hardcoding these details into the script, we request the user to provide them at runtime, ensuring flexibility and security.
The script also checks that the OpenAI API key is provided and raises an error if it is missing; the Couchbase settings fall back to their defaults when left blank. Reading these values at runtime rather than hardcoding them keeps credentials out of the code and makes the configuration easier to maintain.
load_dotenv()
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY') or getpass.getpass('Enter your OpenAI API Key: ')
CB_HOST = os.getenv('CB_HOST') or input('Enter your Couchbase host (default: couchbase://localhost): ') or 'couchbase://localhost'
CB_USERNAME = os.getenv('CB_USERNAME') or input('Enter your Couchbase username (default: Administrator): ') or 'Administrator'
CB_PASSWORD = os.getenv('CB_PASSWORD') or getpass.getpass('Enter your Couchbase password (default: password): ') or 'password'
CB_BUCKET_NAME = os.getenv('CB_BUCKET_NAME') or input('Enter your Couchbase bucket name (default: vector-search-testing): ') or 'vector-search-testing'
INDEX_NAME = os.getenv('INDEX_NAME') or input('Enter your index name (default: vector_search_pydantic_ai): ') or 'vector_search_pydantic_ai'
SCOPE_NAME = os.getenv('SCOPE_NAME') or input('Enter your scope name (default: shared): ') or 'shared'
COLLECTION_NAME = os.getenv('COLLECTION_NAME') or input('Enter your collection name (default: pydantic_ai): ') or 'pydantic_ai'
# Check if the variables are correctly loaded
if not OPENAI_API_KEY:
    raise ValueError("Missing OpenAI API Key")
if 'OPENAI_API_KEY' not in os.environ:
    os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
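The same "environment variable, else default, else error" pattern can be factored into a small helper when a non-interactive setup is preferred. This is a minimal sketch: read_setting and demo_env are hypothetical names, and the helper skips the interactive prompts used above.

```python
import os

def read_setting(name, default=None, env=None):
    """Return a non-empty setting from env, else the default; raise if neither exists."""
    env = os.environ if env is None else env
    value = (env.get(name) or "").strip()
    if value:
        return value
    if default is not None:
        return default
    raise ValueError(f"Missing required setting: {name}")

# A plain dict stands in for os.environ so the example is repeatable.
demo_env = {"CB_USERNAME": "Administrator", "CB_BUCKET_NAME": "  "}
username = read_setting("CB_USERNAME", env=demo_env)
bucket = read_setting("CB_BUCKET_NAME", default="vector-search-testing", env=demo_env)
print(username, bucket)
```

A blank or whitespace-only value is treated as missing, so the default applies in the same way as the `or` chains above.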
Connecting to a Couchbase cluster is the foundation of our project. Couchbase will serve as our primary data store, handling all the storage and retrieval operations required for our semantic search engine. By establishing this connection, we enable our application to interact with the database, allowing us to perform operations such as storing embeddings, querying data, and managing collections. This connection is the gateway through which all data will flow, so ensuring it's set up correctly is paramount.
try:
    auth = PasswordAuthenticator(CB_USERNAME, CB_PASSWORD)
    options = ClusterOptions(auth)
    cluster = Cluster(CB_HOST, options)
    cluster.wait_until_ready(timedelta(seconds=5))
    logging.info("Successfully connected to Couchbase")
except Exception as e:
    raise ConnectionError(f"Failed to connect to Couchbase: {str(e)}")
2025-04-11 13:54:19,537 - INFO - Successfully connected to Couchbase
The setup_collection() function handles creating and configuring the hierarchical data organization in Couchbase:
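Bucket Creation: Checks whether the bucket exists and, if not, creates it with a 1024 MB RAM quota, flush enabled, and no replicas.
Scope Management: Creates the scope if it does not exist, unless it is the _default scope.
Collection Setup: Creates the collection inside the scope if it does not exist, then pauses briefly so it can become available.
Additional Tasks: Ensures a primary index exists, deletes all existing documents in the collection, and returns the collection object.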
def setup_collection(cluster, bucket_name, scope_name, collection_name):
    try:
        # Check if bucket exists, create if it doesn't
        try:
            bucket = cluster.bucket(bucket_name)
            logging.info(f"Bucket '{bucket_name}' exists.")
        except Exception as e:
            logging.info(f"Bucket '{bucket_name}' does not exist. Creating it...")
            bucket_settings = CreateBucketSettings(
                name=bucket_name,
                bucket_type='couchbase',
                ram_quota_mb=1024,
                flush_enabled=True,
                num_replicas=0
            )
            cluster.buckets().create_bucket(bucket_settings)
            time.sleep(2)  # Wait for bucket creation to complete and become available
            bucket = cluster.bucket(bucket_name)
            logging.info(f"Bucket '{bucket_name}' created successfully.")
        bucket_manager = bucket.collections()
        # Check if scope exists, create if it doesn't
        scopes = bucket_manager.get_all_scopes()
        scope_exists = any(scope.name == scope_name for scope in scopes)
        if not scope_exists and scope_name != "_default":
            logging.info(f"Scope '{scope_name}' does not exist. Creating it...")
            bucket_manager.create_scope(scope_name)
            logging.info(f"Scope '{scope_name}' created successfully.")
        # Check if collection exists, create if it doesn't
        collections = bucket_manager.get_all_scopes()
        collection_exists = any(
            scope.name == scope_name and collection_name in [col.name for col in scope.collections]
            for scope in collections
        )
        if not collection_exists:
            logging.info(f"Collection '{collection_name}' does not exist. Creating it...")
            bucket_manager.create_collection(scope_name, collection_name)
            time.sleep(2)
            logging.info(f"Collection '{collection_name}' created successfully.")
        else:
            logging.info(f"Collection '{collection_name}' already exists. Skipping creation.")
        collection = bucket.scope(scope_name).collection(collection_name)
        time.sleep(2)  # Give the collection time to be ready for queries
        # Ensure primary index exists
        try:
            cluster.query(f"CREATE PRIMARY INDEX IF NOT EXISTS ON `{bucket_name}`.`{scope_name}`.`{collection_name}`").execute()
            logging.info("Primary index present or created successfully.")
        except Exception as e:
            logging.warning(f"Error creating primary index: {str(e)}")
        # Clear all documents in the collection
        try:
            query = f"DELETE FROM `{bucket_name}`.`{scope_name}`.`{collection_name}`"
            cluster.query(query).execute()
            logging.info("All documents cleared from the collection.")
        except Exception as e:
            logging.warning(f"Error while clearing documents: {str(e)}. The collection might be empty.")
        return collection
    except Exception as e:
        raise RuntimeError(f"Error setting up collection: {str(e)}")
setup_collection(cluster, CB_BUCKET_NAME, SCOPE_NAME, COLLECTION_NAME)
2025-04-11 13:54:23,668 - INFO - Bucket 'vector-search-testing' does not exist. Creating it...
2025-04-11 13:54:25,721 - INFO - Bucket 'vector-search-testing' created successfully.
2025-04-11 13:54:25,728 - INFO - Scope 'shared' does not exist. Creating it...
2025-04-11 13:54:25,777 - INFO - Scope 'shared' created successfully.
2025-04-11 13:54:25,796 - INFO - Collection 'pydantic_ai' does not exist. Creating it...
2025-04-11 13:54:27,843 - INFO - Collection 'pydantic_ai' created successfully.
2025-04-11 13:54:28,120 - INFO - Primary index present or created successfully.
2025-04-11 13:54:28,133 - INFO - All documents cleared from the collection.
<couchbase.collection.Collection at 0x16febe640>
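The function above relies on fixed time.sleep(2) pauses after creating the bucket and the collection. One possible alternative, sketched here under the assumption that a cheap readiness check is available, is to poll until the resource appears. The helper wait_until and the stand-in predicate are hypothetical and are not part of the Couchbase SDK.

```python
import time

def wait_until(predicate, timeout=10.0, interval=0.5):
    """Poll predicate() until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)

# Stand-in for a real readiness check: reports "ready" on the third call.
attempts = {"count": 0}

def ready_on_third_try():
    attempts["count"] += 1
    return attempts["count"] >= 3

became_ready = wait_until(ready_on_third_try, timeout=2.0, interval=0.01)
print(became_ready, attempts["count"])
```

In setup_collection, the predicate could repeat the existing get_all_scopes() membership check, so the wait ends as soon as the new collection is listed instead of after a fixed delay.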
Semantic search requires an efficient way to retrieve relevant documents based on a user's query. This is where the Couchbase Vector Search Index comes into play. In this step, we load the Vector Search Index definition from a JSON file, which specifies how the index should be structured. This includes the fields to be indexed, the dimensions of the vectors, and other parameters that determine how the search engine processes queries based on vector similarity.
This vector search index configuration requires specific default settings to function properly. This tutorial uses the bucket named vector-search-testing with the scope shared and collection pydantic_ai. The configuration is set up for vectors with exactly 1536 dimensions, using dot product similarity and optimized for recall. If you want to use a different bucket, scope, or collection, you will need to modify the index configuration accordingly.
For more information on creating a vector search index, please follow the instructions.
# If you are running this script locally (not in Google Colab), uncomment the following line
# and provide the path to your index definition file.
# index_definition_path = '/path_to_your_index_file/pydantic_ai_index.json' # Local setup: specify your file path here
# # Version for Google Colab
# def load_index_definition_colab():
#     from google.colab import files
#     print("Upload your index definition file")
#     uploaded = files.upload()
#     index_definition_path = list(uploaded.keys())[0]
#     try:
#         with open(index_definition_path, 'r') as file:
#             index_definition = json.load(file)
#         return index_definition
#     except Exception as e:
#         raise ValueError(f"Error loading index definition from {index_definition_path}: {str(e)}")
# Version for Local Environment
def load_index_definition_local(index_definition_path):
    try:
        with open(index_definition_path, 'r') as file:
            index_definition = json.load(file)
        return index_definition
    except Exception as e:
        raise ValueError(f"Error loading index definition from {index_definition_path}: {str(e)}")
# Usage
# Uncomment the appropriate line based on your environment
# index_definition = load_index_definition_colab()
index_definition = load_index_definition_local('pydantic_ai_index.json')
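Because the index only works with 1536-dimensional vectors, it can help to sanity-check the loaded definition before creating the index. The sketch below assumes that vector fields appear in the JSON as objects with "type": "vector" and a "dims" key; the trimmed sample_definition is a hypothetical stand-in, not the tutorial's actual file.

```python
def find_vector_fields(node):
    """Recursively collect dicts that look like vector field definitions."""
    found = []
    if isinstance(node, dict):
        if node.get("type") == "vector":
            found.append(node)
        for value in node.values():
            found.extend(find_vector_fields(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_vector_fields(item))
    return found

def check_vector_dims(definition, expected_dims=1536):
    """Raise ValueError unless every vector field has the expected dimension."""
    fields = find_vector_fields(definition)
    if not fields:
        raise ValueError("No vector field found in index definition")
    for field in fields:
        if field.get("dims") != expected_dims:
            raise ValueError(f"Expected {expected_dims} dims, found {field.get('dims')}")
    return fields

# Hypothetical, heavily trimmed definition used only for illustration.
sample_definition = {
    "name": "vector_search_pydantic_ai",
    "params": {"mapping": {"types": {"shared.pydantic_ai": {"properties": {
        "embedding": {"fields": [
            {"name": "embedding", "type": "vector", "dims": 1536, "similarity": "dot_product"}
        ]}
    }}}}},
}
vector_fields = check_vector_dims(sample_definition)
print(len(vector_fields), "vector field(s) with matching dimensions")
```

Calling check_vector_dims(index_definition) on the real file would surface a mismatch between the index and the embedding model early, if the file follows the assumed layout.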
With the index definition loaded, the next step is to create or update the Vector Search Index in Couchbase. This step is crucial because it optimizes our database for vector similarity search operations, allowing us to perform searches based on the semantic content of documents rather than just keywords. By creating or updating a Vector Search Index, we enable our search engine to handle complex queries that involve finding semantically similar documents using vector embeddings, which is essential for a robust semantic search engine.
try:
    scope_index_manager = cluster.bucket(CB_BUCKET_NAME).scope(SCOPE_NAME).search_indexes()
    # Check if index already exists
    existing_indexes = scope_index_manager.get_all_indexes()
    index_name = index_definition["name"]
    if index_name in [index.name for index in existing_indexes]:
        logging.info(f"Index '{index_name}' found")
    else:
        logging.info(f"Creating new index '{index_name}'...")
    # Create SearchIndex object from JSON definition
    search_index = SearchIndex.from_json(index_definition)
    # Upsert the index (create if not exists, update if exists)
    scope_index_manager.upsert_index(search_index)
    logging.info(f"Index '{index_name}' successfully created/updated.")
except QueryIndexAlreadyExistsException:
    logging.info(f"Index '{index_name}' already exists. Skipping creation/update.")
except InternalServerFailureException as e:
    error_message = str(e)
    logging.error(f"InternalServerFailureException raised: {error_message}")
    try:
        # Accessing the response_body attribute from the context
        error_context = e.context
        response_body = error_context.response_body
        if response_body:
            error_details = json.loads(response_body)
            error_message = error_details.get('error', '')
            if "collection: 'pydantic_ai' doesn't belong to scope: 'shared'" in error_message:
                raise ValueError("Collection 'pydantic_ai' does not belong to scope 'shared'. Please check the collection and scope names.")
    except ValueError as ve:
        logging.error(str(ve))
        raise
    except Exception as json_error:
        logging.error(f"Failed to parse the error message: {json_error}")
    raise RuntimeError(f"Internal server error while creating/updating search index: {error_message}")
2025-04-11 13:54:41,157 - INFO - Creating new index 'vector-search-testing.shared.vector_search_pydantic_ai'...
2025-04-11 13:54:41,316 - INFO - Index 'vector-search-testing.shared.vector_search_pydantic_ai' successfully created/updated.
Embeddings are at the heart of semantic search. They are numerical representations of text that capture the semantic meaning of the words and phrases. Unlike traditional keyword-based search, which looks for exact matches, embeddings allow our search engine to understand the context and nuances of language, enabling it to retrieve documents that are semantically similar to the query, even if they don't contain the exact keywords. By creating embeddings using OpenAI, we equip our search engine with the ability to understand and process natural language in a way that's much closer to how humans understand language. This step transforms our raw text data into a format that the search engine can use to find and rank relevant documents.
try:
    embeddings = OpenAIEmbeddings(
        model="text-embedding-3-small",
        api_key=OPENAI_API_KEY,
    )
    logging.info("Successfully created OpenAIEmbeddings")
except Exception as e:
    raise ValueError(f"Error creating OpenAIEmbeddings: {str(e)}")
2025-04-11 13:55:10,426 - INFO - Successfully created OpenAIEmbeddings
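To make the idea of similarity between embeddings concrete, here is a toy sketch that ranks documents by dot product. The three-dimensional vectors and document titles are made up for illustration; real embeddings from the model configured above have 1536 dimensions.

```python
import math

def normalize(vector):
    """Scale a vector to unit length so that dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Made-up "embeddings": the first axis loosely means football, the last finance.
documents = {
    "football manager interview": normalize([0.9, 0.1, 0.0]),
    "stock market report": normalize([0.0, 0.2, 0.9]),
    "match report": normalize([0.8, 0.3, 0.1]),
}
query_vector = normalize([1.0, 0.2, 0.0])

# Highest dot product first: the closest meaning ranks at the top.
ranked = sorted(documents, key=lambda title: dot(query_vector, documents[title]), reverse=True)
for title in ranked:
    print(f"{dot(query_vector, documents[title]):.3f}  {title}")
```

The two football-related entries rank above the finance one even though no keywords are compared, which is the behavior the vector index provides at scale.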
The vector store is set up to manage the embeddings created in the previous step. The vector store is essentially a database optimized for storing and retrieving high-dimensional vectors. In this case, the vector store is built on top of Couchbase, allowing the script to store the embeddings in a way that can be efficiently searched.
try:
    vector_store = CouchbaseSearchVectorStore(
        cluster=cluster,
        bucket_name=CB_BUCKET_NAME,
        scope_name=SCOPE_NAME,
        collection_name=COLLECTION_NAME,
        embedding=embeddings,
        index_name=INDEX_NAME,
    )
    logging.info("Successfully created vector store")
except Exception as e:
    raise ValueError(f"Failed to create vector store: {str(e)}")
2025-04-11 13:55:12,849 - INFO - Successfully created vector store
To build a search engine, we need data to search through. We use the BBC News dataset from RealTimeData, which provides real-world news articles. This dataset contains news articles from BBC covering various topics and time periods. Loading the dataset is a crucial step because it provides the raw material that our search engine will work with. The quality and diversity of the news articles make it an excellent choice for testing and refining our search engine, ensuring it can handle real-world news content effectively.
The BBC News dataset allows us to work with authentic news articles, enabling us to build and test a search engine that can effectively process and retrieve relevant news content. The dataset is loaded using the Hugging Face datasets library, specifically accessing the "RealTimeData/bbc_news_alltime" dataset with the "2024-12" version.
try:
    news_dataset = load_dataset(
        "RealTimeData/bbc_news_alltime", "2024-12", split="train"
    )
    print(f"Loaded the BBC News dataset with {len(news_dataset)} rows")
    logging.info(f"Successfully loaded the BBC News dataset with {len(news_dataset)} rows.")
except Exception as e:
    raise ValueError(f"Error loading the BBC News dataset: {str(e)}")
2025-04-11 13:55:22,967 - INFO - Successfully loaded the BBC News dataset with 2687 rows.
Loaded the BBC News dataset with 2687 rows
We will use the content of the news articles for our RAG system.
The dataset contains a few duplicate records. We are removing them to avoid duplicate results in the retrieval stage of our RAG system.
news_articles = news_dataset["content"]
unique_articles = set()
for article in news_articles:
    if article:
        unique_articles.add(article)
unique_news_articles = list(unique_articles)
print(f"We have {len(unique_news_articles)} unique articles in our database.")
We have 1749 unique articles in our database.
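A set removes duplicates but does not preserve the original article order. If a stable order matters, for example to make ingestion repeatable between runs, an order-preserving variant can be sketched with dict.fromkeys. The sample_articles list is invented for illustration.

```python
# Invented sample data: duplicates plus empty and missing entries.
sample_articles = ["story A", "", "story B", "story A", None, "story C", "story B"]

# dict keys are unique and keep insertion order, so the first occurrence wins.
unique_in_order = list(dict.fromkeys(a for a in sample_articles if a))
print(unique_in_order)
```

This yields the same set of articles as the loop above, with each one kept at the position of its first occurrence.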
With the vector store set up, the next step is to populate it with data. We save the BBC articles dataset to the vector store. For each document, we generate an embedding of the article through LangChain for use in semantic search. One of the articles is larger than the maximum number of tokens that our embedding model accepts. To ingest that document, we could split it and ingest it in parts. However, since it is only a single document, for simplicity we exclude it from the ingestion process by skipping articles longer than 50,000 characters.
# Save the current logging level
current_logging_level = logging.getLogger().getEffectiveLevel()
# Set logging level to CRITICAL to suppress lower level logs
logging.getLogger().setLevel(logging.CRITICAL)
articles = [article for article in unique_news_articles if article and len(article) <= 50000]
try:
    vector_store.add_texts(
        texts=articles
    )
except Exception as e:
    raise ValueError(f"Failed to save documents to vector store: {str(e)}")
finally:
    # Restore the original logging level, even if ingestion fails
    logging.getLogger().setLevel(current_logging_level)
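If the oversized article should be ingested rather than skipped, it could be split into overlapping parts first. The sketch below is one simple character-based approach; split_text, its parameters, and the sample text are hypothetical. A token-based splitter would match the embedding model's limit more precisely.

```python
def split_text(text, max_chars=50000, overlap=200):
    """Split text into chunks of at most max_chars, overlapping by `overlap` chars."""
    if max_chars <= overlap:
        raise ValueError("max_chars must be larger than overlap")
    if len(text) <= max_chars:
        return [text]
    chunks = []
    start = 0
    step = max_chars - overlap
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks

# Small numbers keep the demonstration readable.
long_article = "".join(chr(97 + i % 26) for i in range(120))
chunks = split_text(long_article, max_chars=50, overlap=10)
print([len(chunk) for chunk in chunks])
```

The overlap keeps a sentence that straddles a boundary searchable from either chunk; the resulting chunks could then be added to the article list before calling add_texts.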
From PydanticAI's website:
PydanticAI is a Python agent framework designed to make it less painful to build production grade applications with Generative AI.
PydanticAI allows us to define agents and tools easily to create Gen-AI apps in an innovative and painless manner. Some of its features are:
Built by the Pydantic Team: Built by the team behind Pydantic (the validation layer of the OpenAI SDK, the Anthropic SDK, LangChain, LlamaIndex, AutoGPT, Transformers, CrewAI, Instructor and many more).
Model-agnostic: Supports OpenAI, Anthropic, Gemini, Deepseek, Ollama, Groq, Cohere, and Mistral, and there is a simple interface to implement support for other models.
Type-safe: Designed to make type checking as powerful and informative as possible for you.
Python-centric Design: Leverages Python's familiar control flow and agent composition to build your AI-driven projects, making it easy to apply standard Python best practices you'd use in any other (non-AI) project.
Structured Responses: Harnesses the power of Pydantic to validate and structure model outputs, ensuring responses are consistent across runs.
Dependency Injection System: Offers an optional dependency injection system to provide data and services to your agent's system prompts, tools and result validators. This is useful for testing and eval-driven iterative development.
Streamed Responses: Provides the ability to stream LLM outputs continuously, with immediate validation, ensuring rapid and accurate results.
Graph Support: Pydantic Graph provides a powerful way to define graphs using typing hints, this is useful in complex applications where standard control flow can degrade to spaghetti code.
PydanticAI makes heavy use of dependency injection to provide data and services to your agent's system prompts and tools. We define dependencies using a dataclass, which serves as a container for our dependencies.
In our case, the only dependency our agent needs is the CouchbaseSearchVectorStore instance. However, we still use a dataclass, as it is good practice: if we later want to add more dependencies, we can simply add more fields to the Deps dataclass.
We also initialize an agent backed by the GPT-4o model. PydanticAI supports many other LLM providers, including Anthropic, Google, and Cohere, which can be used instead. While initializing the agent, we also pass the type of the dependencies. This is mainly used for type checking and is not actually used at runtime.
@dataclass
class Deps:
    vector_store: CouchbaseSearchVectorStore
agent = Agent("openai:gpt-4o", deps_type=Deps)
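One practical benefit of passing dependencies through a dataclass is that a lightweight stand-in can replace the real vector store in tests. The following sketch uses only the standard library; FakeVectorStore, FakeDocument, DemoDeps, and search_texts are hypothetical names that mirror the shape of Deps without requiring Couchbase or PydanticAI.

```python
from dataclasses import dataclass
from typing import Protocol

class SupportsSearch(Protocol):
    """Anything exposing the one method the retrieval code relies on."""
    def similarity_search_with_score(self, query: str, k: int = 4): ...

@dataclass
class FakeDocument:
    page_content: str

class FakeVectorStore:
    """In-memory stand-in that returns the first k texts with a dummy score."""
    def __init__(self, texts):
        self.texts = texts

    def similarity_search_with_score(self, query, k=4):
        return [(FakeDocument(text), 1.0) for text in self.texts[:k]]

@dataclass
class DemoDeps:
    vector_store: SupportsSearch

def search_texts(deps, query, k=2):
    """Code under test receives its store through deps instead of a global."""
    results = deps.vector_store.similarity_search_with_score(query, k=k)
    return [doc.page_content for doc, _score in results]

demo_deps = DemoDeps(vector_store=FakeVectorStore(["article one", "article two", "article three"]))
texts = search_texts(demo_deps, "anything")
print(texts)
```

Swapping the fake for the real store only changes how the dataclass is constructed, not the code that consumes it.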
PydanticAI has the concept of function tools, which are functions that can be called by LLMs to retrieve extra information that can help form a better response.
We can perform RAG by creating a tool that retrieves documents semantically similar to the query, and allowing the agent to call the tool when required. We can add the function as a tool using the @agent.tool decorator.
Notice that we also add the context parameter, which contains the dependencies that are passed to the tool (in this case, the only dependency is the vector store).
@agent.tool
async def retrieve(context: RunContext[Deps], search_query: str) -> str:
    """Retrieve news data based on a search query.
    Args:
        context: The call context
        search_query: The search query
    """
    search_results = context.deps.vector_store.similarity_search_with_score(search_query, k=5)
    return "\n\n".join(
        f"# Documents:\n{doc.page_content}"
        for doc, score in search_results
    )
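The tool above returns the full text of each retrieved article and ignores the score. As a possible variation, the formatting step could number the documents, include the score, and cap the length of each one to limit the context sent to the model. This is a sketch only: format_results, Doc, and the sample data are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    """Minimal stand-in for a retrieved document with a page_content field."""
    page_content: str

def format_results(results, max_chars=500):
    """Render (document, score) pairs as numbered sections with capped length."""
    sections = []
    for rank, (doc, score) in enumerate(results, start=1):
        text = doc.page_content
        if len(text) > max_chars:
            text = text[:max_chars].rstrip() + " [truncated]"
        sections.append(f"# Document {rank} (score: {score:.3f})\n{text}")
    return "\n\n".join(sections)

sample_results = [
    (Doc("Guardiola says his sleep has suffered."), 0.912),
    (Doc("a" * 600), 0.875),
]
formatted_results = format_results(sample_results)
print(formatted_results[:120])
```

Whether truncation helps depends on the articles: cutting too aggressively can remove the passage that answers the question.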
Finally, we create a function that allows us to define our dependencies and run our agent.
async def run_agent(question: str):
    deps = Deps(
        vector_store=vector_store,
    )
    answer = await agent.run(question, deps=deps)
    return answer
We have now finished setting up our vector store and agent! The system is now ready to accept queries.
query = "What was manchester city manager pep guardiola's reaction to the team's current form?"
output = await run_agent(query)
print("=" * 20, "Agent Output", "=" * 20)
print(output.data)
2025-04-11 13:56:53,839 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2025-04-11 13:56:54,485 - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2025-04-11 13:57:01,928 - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
==================== Agent Output ====================
Pep Guardiola has expressed a mix of determination and concern regarding Manchester City's current form. He acknowledged the personal impact of the team's downturn, admitting that the situation has affected his sleep and diet due to the worst run of results he has ever faced in his managerial career. Guardiola described his state of mind as "ugly," noting the team's precarious position in competitions and the need to defend better and avoid mistakes.
Despite these challenges, Guardiola remains committed to finding solutions, emphasizing the need to improve defensive concepts and restore the team's intensity and form. He acknowledged the errors from some of the best players in the world and expressed a need for the team to stay positive and for players to have the necessary support to overcome their current struggles.
Moreover, Guardiola expressed a pragmatic view of the situation, accepting that the team must "survive" the season and acknowledging a potential need for a significant rebuild to address the challenges they're facing. As a testament to his commitment, he noted his intention to continue shaping the club during his newly extended contract period. Throughout, he reiterated his belief in the team and emphasized the need to find a way forward.
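The top-level await used to call run_agent works in a notebook but is a syntax error in a plain Python script. There, the coroutine can be driven with asyncio.run. In this sketch, fake_run_agent is a hypothetical stand-in so the example runs without Couchbase or OpenAI access.

```python
import asyncio

async def fake_run_agent(question: str) -> str:
    """Stand-in for the notebook's run_agent; returns a canned answer."""
    await asyncio.sleep(0)  # yield to the event loop as a real call would
    return f"answer to: {question}"

async def main() -> str:
    return await fake_run_agent("What was Pep Guardiola's reaction?")

# asyncio.run creates an event loop, runs main() to completion, and closes the loop.
result = asyncio.run(main())
print(result)
```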
We can use the all_messages() method of the output object to observe how the agent and tools work.
In the cell below, we see a detailed list of all the model's messages and tool calls, step by step:
The run starts with a UserPromptPart, which consists of the query the user sends to the agent.
The agent calls the retrieve tool in the ToolCallPart message. This includes the search_query argument. Couchbase uses this search_query to perform semantic search over all the ingested news articles.
The retrieve tool returns a ToolReturnPart object with all the context required for the model to answer the user's query. The retrieved documents are truncated in the output below, because a large amount of context was retrieved.
from pprint import pprint
for idx, message in enumerate(output.all_messages(), start=1):
    print(f"Step {idx}:")
    pprint(message.__repr__())
    print("=" * 50)
Step 1:
('ModelRequest(parts=[UserPromptPart(content="What was manchester city manager '
'pep guardiola\'s reaction to the team\'s current form?", '
'timestamp=datetime.datetime(2025, 4, 11, 8, 26, 52, 836357, '
"tzinfo=datetime.timezone.utc), part_kind='user-prompt')], kind='request')")
==================================================
Step 2:
("ModelResponse(parts=[ToolCallPart(tool_name='retrieve', "
'args=\'{"search_query":"Pep Guardiola reaction to Manchester City current '
'form"}\', tool_call_id=\'call_oo4Jjn93VkRJ3q9PnAwkt3xm\', '
"part_kind='tool-call')], model_name='gpt-4o-2024-08-06', "
'timestamp=datetime.datetime(2025, 4, 11, 8, 26, 53, '
"tzinfo=datetime.timezone.utc), kind='response')")
==================================================
Step 3:
("ModelRequest(parts=[ToolReturnPart(tool_name='retrieve', content='# "
'Documents:\\nManchester City boss Pep Guardiola has won 18 trophies since he '
'arrived at the club in 2016\\n\\nManchester City boss Pep Guardiola says he '
'is "fine" despite admitting his sleep and diet are being affected by the '
'worst run of results in his entire managerial career. In an interview with '
'former Italy international Luca Toni for Amazon Prime Sport before '
"Wednesday\\'s Champions League defeat by Juventus, Guardiola touched on the "
"personal impact City\\'s sudden downturn in form has had. Guardiola said his "
'state of mind was "ugly", that his sleep was "worse" and he was eating '
"lighter as his digestion had suffered. City go into Sunday\\'s derby against "
'Manchester United at Etihad Stadium having won just one of their past 10 '
'games. The Juventus loss means there is a chance they may not even secure a '
'play-off spot in the Champions League. Asked to elaborate on his comments to '
'Toni, Guardiola said: "I\\\'m fine. "In our jobs we always want to do our '
"best or the best as possible. When that doesn\\'t happen you are more "
'uncomfortable than when the situation is going well, always that happened. '
'"In good moments I am happier but when I get to the next game I am still '
'concerned about what I have to do. There is no human being that makes an '
'activity and it doesn\\\'t matter how they do." Guardiola said City have to '
'defend better and "avoid making mistakes at both ends". To emphasise his '
"point, Guardiola referred back to the third game of City\\'s current run, "
'against a Sporting side managed by Ruben Amorim, who will be in the United '
'dugout at the weekend. City dominated the first half in Lisbon, led thanks '
"to Phil Foden\\'s early effort and looked to be cruising. Instead, they "
'conceded three times in 11 minutes either side of half-time as Sporting '
'eventually ran out 4-1 winners. "I would like to play the game like we '
'played in Lisbon on Sunday, believe me," said Guardiola, who is facing the '
'prospect of only having three fit defenders for the derby as Nathan Ake and '
'Manuel Akanji try to overcome injury concerns. If there is solace for City, '
'it comes from the knowledge United are not exactly flying. Their comeback '
'Europa League victory against Viktoria Plzen on Thursday was their third win '
"of Amorim\\'s short reign so far but only one of those successes has come in "
'the Premier League, where United have lost their past two games against '
'Arsenal and Nottingham Forest. Nevertheless, Guardiola can see improvements '
'already on the red side of the city. "It\\\'s already there," he said. "You '
'see all the patterns, the movements, the runners and the pace. He will do a '
'good job at United, I\\\'m pretty sure of that."\\n\\nGuardiola says skipper '
'Kyle Walker has been offered support by the club after the City defender '
'highlighted the racial abuse he had received on social media in the wake of '
'the Juventus trip. "It\\\'s unacceptable," he said. "Not because it\\\'s '
'Kyle - for any human being. "Unfortunately it happens many times in the real '
'world. It is not necessary to say he has the support of the entire club. It '
'is completely unacceptable and we give our support to him."\\n\\n# '
'Documents:\\nPep Guardiola has said Manchester City will be his final '
'managerial job in club football before he "maybe" coaches a national '
'team.\\n\\nThe former Barcelona and Bayern Munich boss has won 15 major '
'trophies since taking charge of City in 2016.\\n\\nThe 53-year-old Spaniard '
'was approached in the summer about the possibility of becoming England '
'manager, but last month signed a two-year contract extension with City until '
'2027.\\n\\nSpeaking to celebrity chef Dani Garcia on YouTube, Guardiola did '
'not indicate when he intends to step down at City but said he would not '
'return to club football - in the Premier League or overseas.\\n\\n"I\\\'m '
'not going to manage another team," he said.\\n\\n"I\\\'m not talking about '
"the long-term future, but what I\\'m not going to do is leave Manchester "
'City, go to another country, and do the same thing as now.\\n\\n"I '
"wouldn\\'t have the energy. The thought of starting somewhere else, all the "
'process of training and so on. No, no, no. Maybe a national team, but '
'that\\\'s different.\\n\\n"I want to leave it and go and play golf, but I '
"can\\'t [if he takes a club job]. I think stopping would do me "
'good."\\n\\nCity have won just once since Guardiola extended his contract - '
'and once in nine games since beating Southampton on 26 October.\\n\\nThat '
'victory came at home to Nottingham Forest last Wednesday, but was followed '
'by a 2-2 draw at Crystal Palace at the weekend.\\n\\nThe Blues visit '
'Juventus next in the Champions League on Wednesday (20:00 GMT), before '
'hosting Manchester United in the Premier League on Sunday '
'(16:30).\\n\\n"Right now we are not in the position - when we have had the '
'results of the last seven, eight games - to talk about winning games in '
'plural," said Guardiola at his pre-match news conference.\\n\\n"We have to '
'win the game and not look at what happens in the next one yet."\\n\\n# '
"Documents:\\n\\'I am not good enough\\' - Guardiola faces daunting and major "
'rebuild\\n\\nThis video can not be played To play this video you need to '
"enable JavaScript in your browser. \\'I am not good enough\\' - Guardiola "
"says he must find a \\'solution\\' after derby loss\\n\\nPep Guardiola says "
"his sleep has suffered during Manchester City\\'s deepening crisis, so he "
'will not be helped by a nightmarish conclusion to one of the most stunning '
'defeats of his long reign. Guardiola looked agitated, animated and on edge '
"even after City led the Manchester derby through Josko Gvardiol\\'s "
'36th-minute header, his reaction to the goal one of almost disdain that it '
'came via a deflected cross as opposed to in his purist style. He sat alone '
'with his eyes closed sipping from a water bottle before the resumption of '
'the second half, then was denied even the respite of victory when Manchester '
'United gave this largely dismal derby a dramatic conclusion it barely '
'deserved with a remarkable late comeback. First, with 88 minutes on the '
'clock, Matheus Nunes presented Amad Diallo with the ball before compounding '
'his error by flattening the forward as he made an attempt to recover his '
'mistake. Bruno Fernandes completed the formalities from the penalty spot. '
"Worse was to come two minutes later when Lisandro Martinez\\'s routine long "
"ball caught City\\'s defence inexplicably statuesque. Goalkeeper Ederson\\'s "
'positioning was awry, allowing the lively Diallo to pounce from an acute '
'angle to leave Guardiola and his players stunned. It was the latest into any '
'game, 88 minutes, that reigning Premier League champions had led then lost. '
'It was also the first time City had lost a game they were leading so late '
"on. And in a sign of City\\'s previous excellence that is now being "
'challenged, they have only lost four of 105 Premier League home games under '
'Guardiola in which they have been ahead at half-time, winning 94 and drawing '
'seven. Guardiola delivered a brutal self-analysis as he told Match of the '
'Day: "I am not good enough. I am the boss. I am the manager. I have to find '
'solutions and so far I haven\\\'t. That\\\'s the reality. "Not much else to '
'say. No defence. Manchester United were incredibly persistent. We have not '
'lost eight games in two seasons. We can\\\'t defend that."\\n\\nManchester '
'City manager Pep Guardiola in despair during the derby defeat to Manchester '
'United\\n\\nGuardiola suggested the serious renewal will wait until the '
'summer but the red flags have been appearing for weeks in the sudden and '
'shocking decline of a team that has lost the aura of invincibility that left '
'many opponents beaten before kick-off in previous years. He has had stated '
'City must "survive" this season - whatever qualifies as survival for a club '
'of such rich ambition - but the quest for a record fifth successive Premier '
'League title is surely over as they lie nine points behind leaders Liverpool '
'having played a game more. Their Champions League aspirations are also in '
"jeopardy after another loss, this time against Juventus in Turin. City\\'s "
'squad has been allowed to grow too old together. The insatiable thirst for '
'success seems to have gone, the scales of superiority have fallen away and '
'opponents now sense vulnerability right until the final whistle, as United '
'did here. The manner in which United were able, and felt able, to snatch '
'this victory drove right to the heart of how City, and Guardiola, are '
'allowing opponents to prey on their downfall. Guardiola has every reason to '
'cite injuries, most significantly to Rodri and also John Stones as well as '
'others, but this cannot be used an excuse for such a dramatic decline in '
'standards, allied to the appearance of a soft underbelly that is so easily '
"exploited. And City\\'s rebuild will not be a quick fix. With every "
'performance, every defeat, the scale of what lies in front of Guardiola '
"becomes more obvious - and daunting. Manchester City\\'s fans did their best "
'to reassure Guardiola of their faith in him with a giant Barcelona-inspired '
'banner draped from the stands before kick-off emblazoned with his image '
'reading "Més que un entrenador" - "More Than A Coach". And Guardiola will '
'now need to be more than a coach than at any time in his career. He will '
"have the finances but it will be done with City\\'s challengers also "
'strengthening. Kevin de Bruyne, 34 in June, lasted 68 minutes here before he '
'was substituted. Age and injuries are catching up with one of the greatest '
'players of the Premier League era and he is unlikely to be at City next '
'season. Mateo Kovacic, who replaced De Bruyne, is also 31 in May. Kyle '
'Walker, 34, is being increasingly exposed. His most notable contribution '
'here was an embarrassing collapse to the ground after the mildest '
'head-to-head collision with Rasmus Hojlund. Ilkay Gundogan, another '
"34-year-old and a previous pillar of Guardiola\\'s great successes, no "
'longer has the legs or energy to exert influence. This looks increasingly '
'like a season too far following his return from Barcelona. Flaws are also '
'being exposed elsewhere, with previously reliable performers failing to hit '
'previous standards. Phil Foden scored 27 goals and had 12 assists when he '
'was Premier League Player of the Season last term. This year he has just '
'three goals and two assists in 18 appearances in all competitions. He has no '
'goals and just one assist in 11 Premier League games. Jack Grealish, who '
'came on after 77 minutes against United, has not scored in a year for '
'Manchester City, his last goal coming in a 2-2 draw against Crystal Palace '
'on 16 December last year. He has, in the meantime, scored twice for England. '
'Erling Haaland is also struggling as City lack creativity and cutting edge. '
'He has three goals in his past 11 Premier League games after scoring 10 in '
"his first five. And in another indication of City\\'s impotence, and their "
"reliance on Haaland, defender Gvardiol\\'s goal against United was his "
'fourth this season, making him their second highest scorer in all '
'competitions behind the Norwegian striker, who has 18. Goalkeeper Ederson, '
'so reliable for so long, has already been dropped once this season and did '
"not cover himself in glory for United\\'s winner. Guardiola, with that "
'freshly signed two-year contract, insists he "wants it" as he treads on this '
'alien territory of failure. He will be under no illusions about the size of '
'the job in front of him as he placed his head in his hands in anguish after '
'yet another damaging and deeply revealing defeat. City and Guardiola are in '
"new, unforgiving territory.\\n\\n# Documents:\\n\\'Self-doubt, errors & big "
"changes\\' - inside the crisis at Man City\\n\\nPep Guardiola has not been "
'through a moment like this in his managerial career. Manchester City have '
'lost nine matches in their past 12 - as many defeats as they had suffered in '
'their previous 106 fixtures. At the end of October, City were still unbeaten '
'at the top of the Premier League and favourites to win a fifth successive '
'title. Now they are seventh, 12 points behind leaders Liverpool having '
'played a game more. It has been an incredible fall from grace and left '
'people trying to work out what has happened - and whether Guardiola can make '
'it right. After discussing the situation with those who know him best, I '
'have taken a closer look at the future - both short and long term - and how '
"the current crisis at Man City is going to be solved.\\n\\nPep Guardiola\\'s "
'Man City have lost nine of their past 12 matches\\n\\nGuardiola has also '
'been giving it a lot of thought. He has not been sleeping very well, as he '
'has said, and has not been himself at times when talking to the media. He '
'has been talking to a lot of people about what is going on as he tries to '
"work out the reasons for City\\'s demise. Some reasons he knows, others he "
"still doesn\\'t. What people perhaps do not realise is Guardiola hugely "
'doubts himself and always has. He will be thinking "I\\\'m not going to be '
'able to get us out of this" and needs the support of people close to him to '
'push away those insecurities - and he has that. He is protected by his '
'people who are very aware, like he is, that there are a lot of people that '
'want City to fail. It has been a turbulent time for Guardiola. Remember '
'those marks he had on his head after the 3-3 draw with Feyenoord in the '
'Champions League? He always scratches his head, it is a gesture of '
'nervousness. Normally nothing happens but on that day one of his nails was '
'far too sharp so, after talking to the players in the changing room where he '
'scratched his head because of his usual agitated gesturing, he went to the '
'news conference. His right-hand man Manel Estiarte sent him photos in a '
'message saying "what have you got on your head?", but by the time Guardiola '
'returned to the coaching room there was hardly anything there again. He '
'started that day with a cover on his nose after the same thing happened at '
'the training ground the day before. Guardiola was having a footballing '
'debate with Kyle Walker about positional stuff and marked his nose with that '
'same nail. There was also that remarkable news conference after the '
'Manchester derby when he said "I don\\\'t know what to do". That is partly '
'true and partly not true. Ignore the fact Guardiola suggested he was "not '
'good enough". He actually meant he was not good enough to resolve the '
'situation with the group of players he has available and with all the other '
'current difficulties. There are obviously logical explanations for the '
'crisis and the first one has been talked about many times - the absence of '
'injured midfielder Rodri. You know the game Jenga? When you take the wrong '
'piece out, the whole tower collapses. That is what has happened here. It is '
'normal for teams to have an over-reliance on one player if he is the best in '
'the world in his position. And you cannot calculate the consequences of an '
'injury that rules someone like Rodri out for the season. City are a team, '
'like many modern ones, in which the holding midfielder is a key element to '
'the construction. So, when you take Rodri out, it is difficult to hold it '
'together. There were Plan Bs - John Stones, Manuel Akanji, even Nathan Ake - '
'but injuries struck. The big injury list has been out of the ordinary and '
'the busy calendar has also played a part in compounding the issues. However, '
'one factor even Guardiola cannot explain is the big uncharacteristic errors '
'in almost every game from international players. Why did Matheus Nunes make '
'that challenge to give away the penalty against Manchester United? Jack '
'Grealish is sent on at the end to keep the ball and cannot do that. There '
'are errors from Walker and other defenders. These are some of the best '
"players in the world. Of course the players\\' mindset is important, and "
'confidence is diminishing. Wrong decisions get taken so there is almost '
'panic on the pitch instead of calm. There are also players badly out of form '
'who are having to play because of injuries. Walker is now unable to hide '
"behind his pace, I\\'m not sure Kevin de Bruyne is ever getting back to the "
'level he used to be at, Bernardo Silva and Ilkay Gundogan do not have time '
'to rest, Grealish is not playing at his best. Some of these players were '
'only meant to be playing one game a week but, because of injuries, have '
'played 12 games in 40 days. It all has a domino effect. One consequence is '
"that Erling Haaland isn\\'t getting the service to score. But the Norwegian "
"still remains City\\'s top-scorer with 13. Defender Josko Gvardiol is next "
'on the list with just four. The way their form has been analysed inside the '
'City camp is there have only been three games where they deserved to lose '
'(Liverpool, Bournemouth and Aston Villa). But of course it is time to change '
'the dynamic.\\n\\nGuardiola has never protected his players so much. He has '
'not criticised them and is not going to do so. They have won everything with '
'him. Instead of doing more with them, he has tried doing less. He has '
'sometimes given them more days off to clear their heads, so they can reset - '
'two days this week for instance. Perhaps the time to change a team is when '
'you are winning, but no-one was suggesting Man City were about to collapse '
'when they were top and unbeaten after nine league games. Some people have '
'asked how bad it has to get before City make a decision on Guardiola. The '
'answer is that there is no decision to be made. Maybe if this was Real '
'Madrid, Barcelona or Juventus, the pressure from outside would be massive '
'and the argument would be made that Guardiola has to go. At City he has won '
'the lot, so how can anyone say he is failing? Yes, this is a crisis. But '
"given all their problems, City\\'s renewed target is finishing in the top "
'four. That is what is in all their heads now. The idea is to recover their '
'essence by improving defensive concepts that are not there and '
're-establishing the intensity they are known for. Guardiola is planning to '
'use the next two years of his contract, which is expected to be his last as '
'a club manager, to prepare a new Manchester City. When he was at the end of '
'his four years at Barcelona, he asked two managers what to do when you feel '
'people are not responding to your instructions. Do you go or do the players '
'go? Sir Alex Ferguson and Rafael Benitez both told him that the players need '
'to go. Guardiola did not listen because of his emotional attachment to his '
'players back then and he decided to leave the Camp Nou because he felt the '
'cycle was over. He will still protect his players now but there is not the '
'same emotional attachment - so it is the players who are going to leave this '
'time. It is likely City will look to replace five or six regular starters. '
'Guardiola knows it is the end of an era and the start of a new one. Changes '
'will not be immediate and the majority of the work will be done in the '
'summer. But they are open to any opportunities in January - and a holding '
'midfielder is one thing they need. In the summer City might want to get '
"Spain\\'s Martin Zubimendi from Real Sociedad and they know 60m euros (£50m) "
'will get him. He said no to Liverpool last summer even though everything was '
'agreed, but he now wants to move on and the Premier League is the target. '
'Even if they do not get Zubimendi, that is the calibre of footballer they '
'are after. A new Manchester City is on its way - with changes driven by '
'Guardiola, incoming sporting director Hugo Viana and the football '
"department.\\n\\n# Documents:\\n\\'We have to find a way\\' - Guardiola vows "
'to end relegation form\\n\\nThis video can not be played To play this video '
"you need to enable JavaScript in your browser. \\'Worrying\\' and "
"\\'staggering\\' - Why do Manchester City keep conceding?\\n\\nManchester "
'City are currently in relegation form and there is little sign of it ending. '
"Saturday\\'s 2-1 defeat at Aston Villa left them joint bottom of the form "
'table over the past eight games with just Southampton for company. Saints, '
'at the foot of the Premier League, have the same number of points, four, as '
'City over their past eight matches having won one, drawn one and lost six - '
'the same record as the floundering champions. And if Southampton - who '
'appointed Ivan Juric as their new manager on Saturday - get at least a point '
'at Fulham on Sunday, City will be on the worst run in the division. Even '
"Wolves, who sacked boss Gary O\\'Neil last Sunday and replaced him with "
'Vitor Pereira, have earned double the number of points during the same '
'period having played a game fewer. They are damning statistics for Pep '
'Guardiola, even if he does have some mitigating circumstances with injuries '
'to Ederson, Nathan Ake and Ruben Dias - who all missed the loss at Villa '
'Park - and the long-term loss of midfield powerhouse Rodri. Guardiola was '
"happy with Saturday\\'s performance, despite defeat in Birmingham, but there "
'is little solace to take at slipping further out of the title race. He may '
'have needed to field a half-fit Manuel Akanji and John Stones at Villa Park '
'but that does not account for City looking a shadow of their former selves. '
'That does not justify the error Josko Gvardiol made to gift Jhon Duran a '
'golden chance inside the first 20 seconds, or £100m man Jack Grealish again '
'failing to have an impact on a game. There may be legitimate reasons for '
"City\\'s drop off, whether that be injuries, mental fatigue or just simply a "
'team coming to the end of its lifecycle, but their form, which has plunged '
'off a cliff edge, would have been unthinkable as they strolled to a fourth '
'straight title last season. "The worrying thing is the number of goals '
'conceded," said ex-England captain Alan Shearer on BBC Match of the Day. '
'"The number of times they were opened up because of the lack of protection '
'and legs in midfield was staggering. There are so many things that are wrong '
'at this moment in time."\\n\\nThis video can not be played To play this '
"video you need to enable JavaScript in your browser. Man City \\'have to "
"find a way\\' to return to form - Guardiola\\n\\nAfterwards Guardiola was "
'calm, so much so it was difficult to hear him in the news conference, a '
'contrast to the frustrated figure he cut on the touchline. He said: "It '
'depends on us. The solution is bring the players back. We have just one '
'central defender fit, that is difficult. We are going to try next game - '
'another opportunity and we don\\\'t think much further than that. "Of course '
"there are more reasons. We concede the goals we don\\'t concede in the past, "
"we [don\\'t] score the goals we score in the past. Football is not just one "
'reason. There are a lot of little factors. "Last season we won the Premier '
'League, but we came here and lost. We have to think positive and I have '
'incredible trust in the guys. Some of them have incredible pride and desire '
'to do it. We have to find a way, step by step, sooner or later to find a way '
'back." Villa boss Unai Emery highlighted City\\\'s frailties, saying he felt '
'Villa could seize on the visitors\\\' lack of belief. "Manchester City are a '
'little bit under the confidence they have normally," he said. "The second '
'half was different, we dominated and we scored. Through those circumstances '
'they were feeling worse than even in the first half."\\n\\nErling Haaland '
'had one touch in the Villa box\\n\\nThere are chinks in the armour never '
'seen before at City under Guardiola and Erling Haaland conceded belief '
'within the squad is low. He told TNT after the game: "Of course, [confidence '
'levels are] not the best. We know how important confidence is and you can '
'see that it affects every human being. That is how it is, we have to '
'continue and stay positive even though it is difficult." Haaland, with 76 '
'goals in 83 Premier League appearances since joining City from Borussia '
'Dortmund in 2022, had one shot and one touch in the Villa box. His 18 '
'touches in the whole game were the lowest of all starting players and he has '
'been self critical, despite scoring 13 goals in the top flight this season. '
"Over City\\'s last eight games he has netted just twice though, but "
'Guardiola refused to criticise his star striker. He said: "Without him we '
"will be even worse but I like the players feeling that way. I don\\'t agree "
'with Erling. He needs to have the balls delivered in the right spots but he '
'will fight for the next one."\', '
"tool_call_id='call_oo4Jjn93VkRJ3q9PnAwkt3xm', "
'timestamp=datetime.datetime(2025, 4, 11, 8, 26, 54, 510742, '
"tzinfo=datetime.timezone.utc), part_kind='tool-return')], kind='request')")
==================================================
Step 4:
("ModelResponse(parts=[TextPart(content='Pep Guardiola has expressed a mix of "
"determination and concern regarding Manchester City\\'s current form. He "
"acknowledged the personal impact of the team\\'s downturn, admitting that "
'the situation has affected his sleep and diet due to the worst run of '
'results he has ever faced in his managerial career. Guardiola described his '
'state of mind as "ugly," noting the team\\\'s precarious position in '
'competitions and the need to defend better and avoid mistakes.\\n\\nDespite '
'these challenges, Guardiola remains committed to finding solutions, '
"emphasizing the need to improve defensive concepts and restore the team\\'s "
'intensity and form. He acknowledged the errors from some of the best players '
'in the world and expressed a need for the team to stay positive and for '
'players to have the necessary support to overcome their current '
'struggles.\\n\\nMoreover, Guardiola expressed a pragmatic view of the '
'situation, accepting that the team must "survive" the season and '
'acknowledging a potential need for a significant rebuild to address the '
"challenges they\\'re facing. As a testament to his commitment, he noted his "
'intention to continue shaping the club during his newly extended contract '
'period. Throughout, he reiterated his belief in the team and emphasized the '
"need to find a way forward.', part_kind='text')], "
"model_name='gpt-4o-2024-08-06', timestamp=datetime.datetime(2025, 4, 11, 8, "
"26, 54, tzinfo=datetime.timezone.utc), kind='response')")
==================================================
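The tool return in the step output above carries the full text of every retrieved article, which makes the individual steps hard to scan. As a minimal sketch, assuming the steps are printed by looping over the agent's message history (for example the list returned by PydanticAI's `all_messages()`), each message could be truncated before printing. The helper names `summarize_step` and `print_steps`, the `max_chars` parameter, and the stand-in data are hypothetical and not part of the original notebook.

```python
from pprint import pformat


def summarize_step(message, max_chars=300):
    """Return a pretty-printed repr of one message, truncated to max_chars."""
    text = pformat(message)
    if len(text) <= max_chars:
        return text
    omitted = len(text) - max_chars
    return f"{text[:max_chars]}... [{omitted} more characters]"


def print_steps(messages, max_chars=300):
    """Print each message as a numbered step followed by a separator line."""
    for step, message in enumerate(messages, start=1):
        print(f"Step {step}:")
        print(summarize_step(message, max_chars))
        print("=" * 50)


# Stand-in data (assumption): in the notebook this would be the agent's
# message history rather than plain strings.
demo_messages = ["short message", "x" * 1000]
print_steps(demo_messages, max_chars=40)
```

Truncating the repr keeps the step structure visible (request, tool call, tool return, final response) while leaving the full content available on the message objects themselves.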