Contents
- What Is an AI Agent? The Four-Component Architecture
- The Agent Loop: How It Works in Practice
- Step 1: Setting Up the Environment
- Prerequisites
- Project Structure
- Environment Setup
- Step 2: Defining the System Prompt
- Step 3: Defining and Implementing Tools
- Tool Schema: How the LLM Sees Your Tools
- Step 4: Implementing the Agent Loop
- Step 5: Adding Memory
- Short-Term Memory: Conversation History
- Long-Term Memory: Vector Database Retrieval
- Handling Failures: The Three Most Common Agent Bugs
- Bug 1: Infinite Tool Loops
- Bug 2: Hallucinated Tool Parameters
- Bug 3: Context Window Overflow
- Framework Comparison: Build AI Agent from Scratch vs. LangChain vs. CrewAI — A Developer Guide
- The Same Agent in LangChain: A Comparison
- A Real-World Agent Pattern: The Research Agent
- Evaluating Your Agent: Does It Actually Work?
- What Makes Mumbai's Top GenAI Engineers Stand Out
- Build Your First Production Agent: Your Next Step
Every developer who has used ChatGPT or GitHub Copilot has interacted with an LLM. Far fewer have built a system that uses an LLM as a reasoning engine: a system that can plan, remember, use external tools, and complete multi-step tasks autonomously. That is what this guide is about. It walks you through every component of building an AI agent from scratch: what it is, why it matters, and exactly how to implement it in Python, from the core ReAct loop to memory, tool integration, failure handling, and evaluation.
We will go from first principles to a working, multi-tool agent — with real Python code at each step.
By the end, you will understand what separates a chatbot from an agent, why agents fail in production (and how to prevent it), and how frameworks like LangChain and CrewAI compare to building from scratch.
What Is an AI Agent? The Four-Component Architecture
Before writing a single line of code, you need a precise mental model. An AI agent is not simply "an LLM that does things." It is a system with four interconnected components working together:
AI Agent = LLM + Planning + Memory + Tool Use
1. LLM (The Reasoning Engine) The LLM is the brain — the component that understands natural language, reasons about problems, and decides what to do next. In an agent, the LLM is not just generating text; it is making decisions about which tools to call, in what order, with what parameters.
2. Planning (The Goal-Decomposition Layer) Planning is the ability to break a complex goal into a sequence of steps. A simple chatbot has no planning — it responds to the current message. An agent can say: "To answer this question, I need to (1) search the web, (2) retrieve the relevant section, (3) synthesise an answer." Planning can be implicit (the LLM reasons through steps in its context) or explicit (a structured plan is generated and executed step by step).
3. Memory (The Persistence Layer) Memory is how the agent maintains context and state. There are four types:
- Short-term (in-context): The current conversation history, held in the LLM's context window
- Long-term (external): A vector database or key-value store the agent can read from and write to across sessions
- Episodic: Records of past actions and their outcomes, used to improve future decisions
- Semantic: A structured knowledge base of facts the agent knows about its domain
4. Tool Use (The Action Layer) Tools are functions the agent can invoke — web search, database queries, code execution, API calls, file read/write, calculator, email sender. Tool use is what separates an agent that talks about doing things from one that actually does them.
[Insert Diagram: The Four-Component Agent Architecture — LLM at centre, connected to Planning, Memory, and Tools]
The interaction loop between these four components is what makes an agent an agent:
Goal → Plan → Action (Tool Call) → Observation → Revised Plan → Next Action → ... → Final Answer
This loop — often called the ReAct loop (Reason + Act) — is the core pattern you will implement.
The Agent Loop: How It Works in Practice
Here is the ReAct loop in plain language, before any code:
- The user gives the agent a goal: "Find the current price of Infosys stock and tell me if it's above its 50-day moving average."
- The agent's LLM reasons: "I need to (1) fetch the current Infosys stock price, (2) fetch the 50-day moving average, (3) compare them."
- The LLM acts: calls the get_stock_price tool with ticker="INFY"
- The tool returns an observation: {"price": 1842.50, "currency": "INR"}
- The LLM reasons again with the new observation: "I have the current price. Now I need the moving average."
- The LLM acts: calls the get_moving_average tool with ticker="INFY", period=50
- The tool returns: {"ma_50": 1793.20}
- The LLM has enough information to produce a final answer: "Infosys is currently trading at ₹1,842.50, which is above its 50-day moving average of ₹1,793.20, a bullish signal."
Each iteration of this loop is a step. Production agents might run 5–20 steps for complex tasks. The art of agent design is making each step reliable, cost-efficient, and failure-resistant.
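The walk-through above can be replayed without any API call. The following is a minimal sketch, not the implementation built later in this guide: `scripted_llm` is a hypothetical stand-in for the model, and both tools are stubs that return the illustrative figures from the example rather than real market data.

```python
# Hypothetical stub tools; the values are the illustrative figures from the walk-through.
def get_stock_price(ticker: str) -> dict:
    return {"price": 1842.50, "currency": "INR"}

def get_moving_average(ticker: str, period: int) -> dict:
    return {"ma_50": 1793.20}

TOOLS = {"get_stock_price": get_stock_price, "get_moving_average": get_moving_average}

def scripted_llm(observations: list[dict]) -> dict:
    """Stand-in for the LLM: chooses the next action from what has been observed so far."""
    if len(observations) == 0:
        return {"action": "get_stock_price", "args": {"ticker": "INFY"}}
    if len(observations) == 1:
        return {"action": "get_moving_average", "args": {"ticker": "INFY", "period": 50}}
    price, ma = observations[0]["price"], observations[1]["ma_50"]
    relation = "above" if price > ma else "at or below"
    return {"final": f"INFY at {price:.2f} is {relation} its 50-day average of {ma:.2f}."}

def run_loop(max_steps: int = 5) -> str:
    observations = []
    for _ in range(max_steps):
        decision = scripted_llm(observations)          # Reason
        if "final" in decision:
            return decision["final"]
        tool = TOOLS[decision["action"]]               # Act
        observations.append(tool(**decision["args"]))  # Observe
    return "Stopped: step budget exhausted."

print(run_loop())
```

Even in this toy version the step budget matters: with `max_steps=1` the loop stops before it can compare the two numbers, which is the same trade-off a real agent faces between cost and task completion.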
Step 1: Setting Up the Environment
We will build our agent in Python. The implementation uses only the OpenAI API directly — no LangChain or other abstractions — so you understand every component before reaching for a framework.
Prerequisites
# Create a virtual environment
python -m venv agent-env
source agent-env/bin/activate # Windows: agent-env\Scripts\activate
# Install dependencies
pip install openai python-dotenv requests
Project Structure
ai-agent/
├── agent.py # Main agent loop
├── tools.py # Tool definitions and implementations
├── memory.py # Memory management
├── system_prompt.py # System prompt construction
├── .env # API keys (never commit this)
└── main.py # Entry point
Environment Setup
# .env
OPENAI_API_KEY=your_openai_api_key_here
# main.py
from dotenv import load_dotenv
load_dotenv()
from agent import Agent
agent = Agent()
response = agent.run("What is 15% of 847, and what is the square root of that result?")
print(response)
Step 2: Defining the System Prompt
The system prompt is the most important piece of engineering in your agent. It defines the agent's identity, capabilities, reasoning style, and constraints. A weak system prompt produces an agent that hallucinates tool calls, loops endlessly, or gives up too early. A strong system prompt produces an agent that reasons clearly and uses tools purposefully.
# system_prompt.py
AGENT_SYSTEM_PROMPT = """You are a precise, task-focused AI assistant with access to a set of tools.
## Your Behaviour
- Break complex tasks into clear steps before acting
- Use tools when you need real-world information or computation — do not guess
- After each tool call, reason about the result before deciding the next step
- When you have enough information to answer, provide a clear, direct final answer
- If a tool call fails, explain the failure and try an alternative approach
## Tool Use Rules
- Only call tools that exist in the provided tool list
- Always provide required parameters — never call a tool with missing inputs
- Do not call the same tool with the same parameters more than twice
- Do not make up tool results — always use the actual observation returned
## Reasoning Format
Think step by step. Before each action, state:
- What you know so far
- What you need to find out
- Which tool will get you that information
## Stopping Conditions
Stop when:
- You have a complete, accurate answer to the user's question
- You have exhausted available tools and can state what you could not find
- You have reached 10 tool calls (to prevent infinite loops)
Be concise, accurate, and action-oriented."""
Why this system prompt structure works:
- Behavioural anchors ("Use tools when you need real-world information — do not guess") prevent hallucination
- Tool use rules prevent infinite loops and parameter errors
- Reasoning format enforces the explicit chain-of-thought that makes agent decisions auditable
- Stopping conditions are the safety valve — every production agent needs them
Step 3: Defining and Implementing Tools
Tools are Python functions exposed to the LLM via a schema. The LLM does not call the function directly — it generates a structured JSON object requesting the function call, your agent code intercepts that request, executes the actual function, and returns the result to the LLM as an observation.
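To make that hand-off concrete, here is a small sketch of the interception step in isolation. The `add` tool, the `REGISTRY` dict, and the simplified request shape (an id, a name, and JSON-encoded arguments) are assumptions chosen for illustration; the real OpenAI response object carries the same three pieces of information on each tool call.

```python
import json

def add(a: float, b: float) -> str:
    """Hypothetical tool used only for this illustration."""
    return json.dumps({"sum": a + b})

REGISTRY = {"add": add}

def handle_tool_request(request_json: str) -> dict:
    """Turn the model's JSON request into a function call and wrap the result."""
    request = json.loads(request_json)
    func = REGISTRY.get(request["name"])
    if func is None:
        observation = json.dumps({"error": f"Unknown tool: {request['name']}"})
    else:
        # Arguments arrive as a JSON-encoded string, so decode them before calling
        observation = func(**json.loads(request["arguments"]))
    # The observation goes back to the model as a message tied to the request id
    return {"role": "tool", "tool_call_id": request["id"], "content": observation}

# Simplified shape of what the model emits: a name plus JSON-encoded arguments
request_json = json.dumps(
    {"id": "call_1", "name": "add", "arguments": json.dumps({"a": 2, "b": 3})}
)
reply = handle_tool_request(request_json)
print(reply)
```

The model never sees the Python function itself, only its name, its schema, and the string that comes back.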
Tool Schema: How the LLM Sees Your Tools
# tools.py
import json
import math
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+; Windows may also need `pip install tzdata`

import requests  # only needed once web_search calls a real search API

# --- Tool Implementations ---

def calculator(expression: str) -> str:
    """Evaluate a mathematical expression with a restricted eval."""
    try:
        # Expose math functions both bare (sqrt) and qualified (math.sqrt),
        # because the tool schema tells the model to write math.sqrt(...)
        allowed_names = {k: v for k, v in math.__dict__.items() if not k.startswith("_")}
        allowed_names["math"] = math
        # Empty builtins reduce, but do not eliminate, the risk of evaluating
        # model-generated code; use a proper sandbox or parser in production
        result = eval(expression, {"__builtins__": {}}, allowed_names)
        return json.dumps({"result": result, "expression": expression})
    except Exception as e:
        return json.dumps({"error": str(e), "expression": expression})

def web_search(query: str) -> str:
    """Search the web for current information. Returns top results."""
    # In production, replace with SerpAPI, Brave Search API, or Tavily
    # This is a mock implementation for demonstration
    mock_results = [
        {
            "title": f"Search result for: {query}",
            "snippet": "This is a mock result. In production, connect to a real search API.",
            "url": "https://example.com/result-1"
        }
    ]
    return json.dumps({"query": query, "results": mock_results})

def get_current_time() -> str:
    """Get the current date and time in the Asia/Kolkata timezone."""
    # Request the timezone explicitly so the reported label matches the value
    now = datetime.now(ZoneInfo("Asia/Kolkata"))
    return json.dumps({
        "datetime": now.isoformat(),
        "date": now.strftime("%Y-%m-%d"),
        "time": now.strftime("%H:%M:%S"),
        "timezone": "Asia/Kolkata"
    })

def read_file(filepath: str) -> str:
    """Read the contents of a text file."""
    try:
        with open(filepath, "r", encoding="utf-8") as f:
            content = f.read()
        return json.dumps({"filepath": filepath, "content": content, "length": len(content)})
    except FileNotFoundError:
        return json.dumps({"error": f"File not found: {filepath}"})
    except Exception as e:
        return json.dumps({"error": str(e)})

# --- Tool Registry ---
TOOLS = [
{
"type": "function",
"function": {
"name": "calculator",
"description": "Evaluate a mathematical expression. Use for any arithmetic, percentages, square roots, or numerical calculations.",
"parameters": {
"type": "object",
"properties": {
"expression": {
"type": "string",
"description": "A valid Python math expression. Use math module functions like math.sqrt(), math.pow(), etc. Example: '0.15 * 847' or 'math.sqrt(127.05)'"
}
},
"required": ["expression"]
}
}
},
{
"type": "function",
"function": {
"name": "web_search",
"description": "Search the web for current information, facts, news, or data you don't know. Use when the user asks about real-world information.",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query. Be specific and precise."
}
},
"required": ["query"]
}
}
},
{
"type": "function",
"function": {
"name": "get_current_time",
"description": "Get the current date and time. Use when the user asks about the current date, time, or when you need to timestamp information.",
"parameters": {
"type": "object",
"properties": {}
}
}
},
{
"type": "function",
"function": {
"name": "read_file",
"description": "Read the contents of a text file from the local filesystem.",
"parameters": {
"type": "object",
"properties": {
"filepath": {
"type": "string",
"description": "The path to the file to read."
}
},
"required": ["filepath"]
}
}
}
]
# Tool dispatcher — maps tool names to functions
TOOL_FUNCTIONS = {
"calculator": calculator,
"web_search": web_search,
"get_current_time": get_current_time,
"read_file": read_file
}
The tool description is as important as the implementation. The LLM decides which tool to call based entirely on the description. Vague descriptions produce wrong tool selections. Precise descriptions with examples produce reliable tool selection.
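Because the schema list and the dispatcher dict are maintained by hand, they can drift apart. The sketch below is one possible startup check, assuming the same schema shape used in tools.py; the function name `check_tool_registry` and the 20-character description threshold are arbitrary choices for illustration, not a standard.

```python
def check_tool_registry(tools: list[dict], functions: dict) -> list[str]:
    """Return human-readable problems; an empty list means the two agree."""
    problems = []
    schema_names = set()
    for tool in tools:
        spec = tool["function"]
        name = spec["name"]
        schema_names.add(name)
        if name not in functions:
            problems.append(f"{name}: schema has no implementation")
        if len(spec.get("description", "")) < 20:
            problems.append(f"{name}: description is missing or very short")
        params = spec.get("parameters", {})
        for required in params.get("required", []):
            if required not in params.get("properties", {}):
                problems.append(f"{name}: required parameter '{required}' is not defined")
    for name in functions:
        if name not in schema_names:
            problems.append(f"{name}: implementation is not exposed in any schema")
    return problems

# Small sample in the same shape as the TOOLS list
sample_tools = [{
    "type": "function",
    "function": {
        "name": "calculator",
        "description": "Evaluate a mathematical expression such as '0.15 * 847'.",
        "parameters": {
            "type": "object",
            "properties": {"expression": {"type": "string"}},
            "required": ["expression"],
        },
    },
}]
sample_functions = {"calculator": lambda expression: expression}

print(check_tool_registry(sample_tools, sample_functions))
```

Running a check like this once at import time is cheap, and it catches a renamed function or a forgotten schema before the model does.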
Step 4: Implementing the Agent Loop
This is the core of the agent — the loop that executes the ReAct cycle until the agent reaches a final answer or a stopping condition.
# agent.py
import json
import os

from openai import OpenAI

from system_prompt import AGENT_SYSTEM_PROMPT
from tools import TOOLS, TOOL_FUNCTIONS


class Agent:
    def __init__(self, model: str = "gpt-4o", max_steps: int = 10):
        self.client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
        self.model = model
        self.max_steps = max_steps

    def run(self, user_goal: str, verbose: bool = True) -> str:
        """
        Execute the agent loop for a given user goal.
        Returns the agent's final response.
        """
        # Initialise conversation with system prompt and user goal
        messages = [
            {"role": "system", "content": AGENT_SYSTEM_PROMPT},
            {"role": "user", "content": user_goal}
        ]
        step = 0
        while step < self.max_steps:
            step += 1
            if verbose:
                print(f"\n--- Step {step} ---")
            # Call the LLM
            response = self.client.chat.completions.create(
                model=self.model,
                messages=messages,
                tools=TOOLS,
                tool_choice="auto"  # Let the LLM decide whether to use tools
            )
            message = response.choices[0].message
            # Add the assistant's response to conversation history
            messages.append(message)
            # Case 1: The LLM wants to use a tool
            if message.tool_calls:
                for tool_call in message.tool_calls:
                    tool_name = tool_call.function.name
                    tool_args = json.loads(tool_call.function.arguments)
                    if verbose:
                        print(f"Tool call: {tool_name}({tool_args})")
                    # Execute the tool
                    if tool_name in TOOL_FUNCTIONS:
                        observation = TOOL_FUNCTIONS[tool_name](**tool_args)
                    else:
                        observation = json.dumps({"error": f"Unknown tool: {tool_name}"})
                    if verbose:
                        print(f"Observation: {observation}")
                    # Add the tool result to conversation history
                    messages.append({
                        "role": "tool",
                        "tool_call_id": tool_call.id,
                        "content": observation
                    })
            # Case 2: The LLM has produced a final answer
            else:
                final_answer = message.content
                if verbose:
                    print(f"\n--- Final Answer ---\n{final_answer}")
                return final_answer
        # Reached max steps without a final answer
        return "I reached the maximum number of steps without completing the task. Here is my last response: " + (message.content or "No response generated.")
Let's trace through what happens when you run this:
agent = Agent()
result = agent.run("What is 15% of 847, and what is the square root of that result?")
--- Step 1 ---
Tool call: calculator({'expression': '0.15 * 847'})
Observation: {"result": 127.05, "expression": "0.15 * 847"}
--- Step 2 ---
Tool call: calculator({'expression': 'math.sqrt(127.05)'})
Observation: {"result": 11.2716..., "expression": "math.sqrt(127.05)"}
--- Final Answer ---
15% of 847 is **127.05**, and the square root of 127.05 is approximately **11.27**.
The agent correctly decomposed the task into two sequential tool calls and produced an accurate final answer. This is the ReAct loop working exactly as designed.
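The happy path hides one fragility: the loop passes the model's arguments straight to `json.loads` and then to the tool, so malformed JSON or a missing parameter raises an exception and ends the run. A hedged sketch of a wrapper that converts those failures into observations the model can react to follows; `safe_execute` and the `shout` demo tool are hypothetical names, not part of the code above.

```python
import json

def safe_execute(tool_name: str, raw_arguments: str, registry: dict) -> str:
    """Run a tool and always return a JSON string, even when something goes wrong."""
    if tool_name not in registry:
        return json.dumps({"error": f"Unknown tool: {tool_name}"})
    try:
        args = json.loads(raw_arguments or "{}")
    except json.JSONDecodeError as e:
        return json.dumps({"error": f"Arguments were not valid JSON: {e.msg}"})
    if not isinstance(args, dict):
        return json.dumps({"error": "Arguments must be a JSON object"})
    try:
        return registry[tool_name](**args)
    except TypeError as e:
        # Typically raised for missing or unexpected keyword arguments
        return json.dumps({"error": f"Bad arguments for {tool_name}: {e}"})
    except Exception as e:
        return json.dumps({"error": f"{tool_name} failed: {e}"})

# Hypothetical demo tool
def shout(text: str) -> str:
    return json.dumps({"result": text.upper()})

demo_registry = {"shout": shout}
print(safe_execute("shout", '{"text": "hi"}', demo_registry))
print(safe_execute("shout", '{"text": ', demo_registry))
print(safe_execute("shout", "{}", demo_registry))
```

Returning the error as an observation, rather than raising, gives the model a chance to correct its own call on the next step, which matches the system prompt's instruction to try an alternative approach after a failure.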
Step 5: Adding Memory
The agent above has no memory between sessions. Each call to agent.run() starts fresh. For most real-world applications, you need at least short-term conversational memory (within a session) and often long-term memory (across sessions).
Short-Term Memory: Conversation History
The simplest form of memory is maintaining the full conversation history within a session:
# memory.py
class ConversationMemory:
    def __init__(self, max_history: int = 20):
        self.messages = []
        self.max_history = max_history

    def add(self, role: str, content: str):
        self.messages.append({"role": role, "content": content})
        # Trim to prevent context window overflow
        if len(self.messages) > self.max_history:
            # Keep system message (index 0) + recent messages
            self.messages = self.messages[:1] + self.messages[-(self.max_history - 1):]

    def get_history(self) -> list:
        return self.messages.copy()

    def clear(self):
        self.messages = []
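To see what the trimming rule does, the sketch below applies the same slicing to a plain list. `trim_history` is a hypothetical standalone helper that mirrors the logic in `add`; it assumes the first message is the system message.

```python
def trim_history(messages: list[dict], max_history: int) -> list[dict]:
    """Same slicing rule as ConversationMemory.add: keep index 0 plus the newest entries."""
    if len(messages) <= max_history:
        return messages
    return messages[:1] + messages[-(max_history - 1):]

history = [{"role": "system", "content": "You are a helpful agent."}]
for i in range(1, 7):
    history.append({"role": "user", "content": f"message {i}"})
    history = trim_history(history, max_history=4)

print([m["content"] for m in history])
# ['You are a helpful agent.', 'message 4', 'message 5', 'message 6']
```

One caveat about this slicing: it needs `max_history` to be at least 2. With a value of 1 the slice becomes `messages[-0:]`, which is the whole list, so nothing would be trimmed.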
Long-Term Memory: Vector Database Retrieval
For persistent memory across sessions, you need a vector database. Here is a minimal implementation using ChromaDB:
# Install: pip install chromadb sentence-transformers
import uuid

import chromadb
from chromadb.utils import embedding_functions


class LongTermMemory:
    def __init__(self, collection_name: str = "agent_memory"):
        self.client = chromadb.PersistentClient(path="./agent_memory_db")
        self.ef = embedding_functions.DefaultEmbeddingFunction()
        self.collection = self.client.get_or_create_collection(
            name=collection_name,
            embedding_function=self.ef
        )

    def store(self, content: str, metadata: dict = None):
        """Store a piece of information in long-term memory."""
        doc_id = str(uuid.uuid4())
        self.collection.add(
            documents=[content],
            # Some ChromaDB versions reject empty metadata dicts, so pass None instead
            metadatas=[metadata] if metadata else None,
            ids=[doc_id]
        )
        return doc_id

    def retrieve(self, query: str, n_results: int = 3) -> list[str]:
        """Retrieve relevant memories based on a query."""
        count = self.collection.count()
        if count == 0:
            return []  # Nothing stored yet; skip querying an empty collection
        results = self.collection.query(
            query_texts=[query],
            n_results=min(n_results, count)
        )
        return results["documents"][0] if results["documents"] else []

    def get_relevant_context(self, query: str) -> str:
        """Format retrieved memories as a context string for the LLM."""
        memories = self.retrieve(query)
        if not memories:
            return ""
        formatted = "\n".join([f"- {m}" for m in memories])
        return f"\n\n## Relevant context from previous sessions:\n{formatted}"
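If the retrieval step feels opaque, the idea can be illustrated without a database. The following is a toy stand-in, not how ChromaDB works internally: it uses word counts in place of a learned embedding model and a linear scan in place of an index. `ToyMemory`, `embed`, and `cosine` are hypothetical names, and the stored strings are made-up examples.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': word counts. Real systems use a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

class ToyMemory:
    def __init__(self):
        self.items: list[tuple[str, Counter]] = []

    def store(self, content: str) -> None:
        self.items.append((content, embed(content)))

    def retrieve(self, query: str, n_results: int = 3) -> list[str]:
        q = embed(query)
        # Rank every stored item by similarity to the query, highest first
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [content for content, _ in ranked[:n_results]]

memory = ToyMemory()
memory.store("User prefers answers in INR, not USD")
memory.store("User's portfolio includes Infosys and TCS shares")
memory.store("User asked about Python virtual environments last week")
print(memory.retrieve("which shares are in the portfolio", n_results=1))
```

The shape is the same as `LongTermMemory`: store text alongside a vector, then return the stored texts whose vectors are closest to the query's. A real embedding model would also match paraphrases that share no words with the query, which word counts cannot do.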
Integrating long-term memory into the agent:
# In agent.py: add memory retrieval before the main loop (method of the Agent class)
def run_with_memory(self, user_goal: str, memory: LongTermMemory) -> str:
    # Retrieve relevant memories
    context = memory.get_relevant_context(user_goal)
    # Augment the system prompt with retrieved context
    augmented_system = AGENT_SYSTEM_PROMPT + context
    messages = [
        {"role": "system", "content": augmented_system},
        {"role": "user", "content": user_goal}
    ]
    # ... rest of the agent loop, which produces final_answer ...
    # After completion, store the interaction in memory
    memory.store(
        f"User asked: {user_goal}. Agent answered: {final_answer}",
        metadata={"type": "interaction"}
    )
    return final_answer
Handling Failures: The Three Most Common Agent Bugs
Bug 1: Infinite Tool Loops
Symptom: The agent keeps calling the same tool repeatedly without making progress.
Fix: Track tool call history and add deduplication logic:
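tool_call_history = {}  # Initialise once, before the while loop in run()

# Inside the loop, where message.tool_calls is handled:
for tool_call in message.tool_calls:
    tool_name = tool_call.function.name
    tool_args = json.loads(tool_call.function.arguments)
    # Sorting keys makes the signature independent of argument order
    call_signature = f"{tool_name}:{json.dumps(tool_args, sort_keys=True)}"
    if tool_call_history.get(call_signature, 0) >= 2:
        observation = json.dumps({
            "error": "This tool call has been attempted multiple times with the same parameters. Try a different approach."
        })
    else:
        tool_call_history[call_signature] = tool_call_history.get(call_signature, 0) + 1
        observation = TOOL_FUNCTIONS[tool_name](**tool_args)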
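The same deduplication idea can be packaged as a small object so the counting logic is testable outside the agent loop. This is a sketch under the assumption that two repeats is an acceptable limit, matching the system prompt rule; `ToolCallGuard` is a hypothetical name.

```python
import json

class ToolCallGuard:
    """Counts identical tool calls and blocks them past a limit."""

    def __init__(self, max_repeats: int = 2):
        self.max_repeats = max_repeats
        self.counts: dict[str, int] = {}

    def allow(self, tool_name: str, tool_args: dict) -> bool:
        # Sorted keys make {"a": 1, "b": 2} and {"b": 2, "a": 1} count as the same call
        signature = f"{tool_name}:{json.dumps(tool_args, sort_keys=True)}"
        self.counts[signature] = self.counts.get(signature, 0) + 1
        return self.counts[signature] <= self.max_repeats

guard = ToolCallGuard(max_repeats=2)
attempts = [guard.allow("web_search", {"query": "INFY price"}) for _ in range(3)]
print(attempts)  # [True, True, False]
```

Creating one guard per call to `run` keeps the counts scoped to a single task, so a legitimate repeat in a later task is not blocked.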
Bug 2: Hallucinated Tool Parameters
Symptom: The agent calls a tool with parameters that don't match the schema, or invents parameter values.
Fix: Validate tool arguments before execution:
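import jsonschema  # pip install jsonschema

def validate_and_call_tool(tool_name: str, tool_args: dict) -> str:
    # Reject tools the dispatcher does not know about instead of raising KeyError
    if tool_name not in TOOL_FUNCTIONS:
        return json.dumps({"error": f"Unknown tool: {tool_name}"})
    # Look up the JSON Schema declared for this tool in the TOOLS list
    tool_schema = next(
        (t["function"]["parameters"] for t in TOOLS if t["function"]["name"] == tool_name),
        None
    )
    if tool_schema:
        try:
            jsonschema.validate(tool_args, tool_schema)
        except jsonschema.ValidationError as e:
            return json.dumps({"error": f"Invalid tool parameters: {str(e.message)}"})
    return TOOL_FUNCTIONS[tool_name](**tool_args)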
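Where adding a dependency is not an option, a reduced check with the standard library covers the most common hallucinations: missing required parameters, invented parameter names, and wrong basic types. This sketch is deliberately much less complete than a real JSON Schema validator; `check_args` and `JSON_TYPES` are hypothetical names, and nested objects, arrays, and enums are not handled.

```python
# Maps JSON Schema type names to Python types (basic scalar types only)
JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}

def check_args(args: dict, schema: dict) -> list[str]:
    """Return a list of problems with args; an empty list means they look valid."""
    errors = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required parameter: {name}")
    for name, value in args.items():
        if name not in properties:
            errors.append(f"unexpected parameter: {name}")
            continue
        expected = JSON_TYPES.get(properties[name].get("type"))
        # Note: Python treats bool as a subclass of int, so True passes as an integer here
        if expected and not isinstance(value, expected):
            errors.append(f"{name} should be of type {properties[name]['type']}")
    return errors

calculator_schema = {
    "type": "object",
    "properties": {"expression": {"type": "string"}},
    "required": ["expression"],
}
print(check_args({"expression": "0.15 * 847"}, calculator_schema))
print(check_args({"expression": 5, "mode": "fast"}, calculator_schema))
```

Returning all problems at once, rather than stopping at the first, lets the model fix every issue in a single retry.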
Bug 3: Context Window Overflow
Symptom: Errors when the conversation history grows too long for the model's context window.
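Fix: Implement a sliding window that keeps the system prompt plus the most recent messages that fit a token budget. Older messages are simply dropped here; summarising them before dropping is a possible extension:
def trim_messages(messages: list, max_tokens: int = 8000) -> list:
    """Keep system message + most recent messages that fit in token budget."""
    # Rough token estimate: 4 chars ≈ 1 token
    def estimate_tokens(msg):
        # History may hold plain dicts or SDK message objects, and
        # content is None on assistant messages that only carry tool calls
        if isinstance(msg, dict):
            content = msg.get("content")
        else:
            content = getattr(msg, "content", None)
        return len(str(content or "")) // 4

    system_msg = messages[0]  # Always keep system prompt
    recent_messages = messages[1:]
    total_tokens = estimate_tokens(system_msg)
    trimmed = []
    # Walk backwards from the newest message until the budget is used up
    for msg in reversed(recent_messages):
        msg_tokens = estimate_tokens(msg)
        if total_tokens + msg_tokens > max_tokens:
            break
        trimmed.insert(0, msg)
        total_tokens += msg_tokens
    return [system_msg] + trimmed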
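One side effect of cutting history at an arbitrary point is that the window can begin with a tool result whose originating assistant message was dropped, and the Chat Completions API rejects a tool message that does not follow an assistant message containing the matching tool call. A possible post-processing step is sketched below; `drop_orphaned_tool_messages` and `role_of` are hypothetical helpers, and the sketch only repairs the start of the window.

```python
def role_of(msg) -> str:
    """Read the role from either a plain dict or an SDK message object."""
    return msg["role"] if isinstance(msg, dict) else getattr(msg, "role", "")

def drop_orphaned_tool_messages(messages: list) -> list:
    """Remove 'tool' messages that directly follow the system prompt after trimming."""
    system_msg, rest = messages[0], messages[1:]
    while rest and role_of(rest[0]) == "tool":
        rest = rest[1:]
    return [system_msg] + rest

window = [
    {"role": "system", "content": "You are a helpful agent."},
    {"role": "tool", "tool_call_id": "call_1", "content": '{"result": 127.05}'},
    {"role": "assistant", "content": "15% of 847 is 127.05."},
    {"role": "user", "content": "And the square root of that?"},
]
print([role_of(m) for m in drop_orphaned_tool_messages(window)])
# ['system', 'assistant', 'user']
```

Applied to the output of `trim_messages`, this keeps the history valid at the cost of losing one observation the model had already reasoned about.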
Framework Comparison: Build AI Agent from Scratch vs. LangChain vs. CrewAI — A Developer Guide
Building from scratch — as we have done above — gives you complete control and deep understanding. But most production agents use frameworks. Here is when each approach makes sense:
| Criteria | From Scratch | LangChain/LlamaIndex | LangGraph | CrewAI |
|---|---|---|---|---|
| Learning value | Highest — understand every component | Medium — abstractions hide complexity | Medium-High | Medium |
| Dev speed | Slowest | Fast | Medium | Fast |
| Flexibility | Complete | High | Very High | Medium |
| Multi-agent support | Build yourself | Limited | Excellent | Excellent |
| Production debugging | Easiest (you own the code) | Moderate | Good (with LangSmith) | Moderate |
| Best for | Learning, custom requirements | Single agents, RAG pipelines | Complex stateful workflows | Role-based multi-agent systems |
When to build from scratch: Learning, highly custom tool integrations, latency-critical systems where framework overhead matters, or when you need full control over error handling.
When to use LangChain: Rapid prototyping, standard RAG pipelines, connecting to a wide range of document loaders and vector stores.
When to use LangGraph: Stateful, cyclic agent workflows with complex branching logic. LangGraph represents agent state as a graph — ideal for workflows where different paths require different tools.
When to use CrewAI: Multi-agent systems where different agents have different roles (researcher, writer, critic) and need to collaborate on a task.
The Same Agent in LangChain: A Comparison
For reference, here is what the equivalent agent looks like using LangChain's create_tool_calling_agent:
# pip install langchain langchain-openai
from langchain_openai import ChatOpenAI
from langchain.agents import create_tool_calling_agent, AgentExecutor
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.tools import tool
@tool
def calculator(expression: str) -> str:
    """Evaluate a mathematical expression using Python's math module."""
    import math, json
    try:
        result = eval(expression, {"__builtins__": {}, "math": math})
        return json.dumps({"result": result})
    except Exception as e:
        return json.dumps({"error": str(e)})
llm = ChatOpenAI(model="gpt-4o", temperature=0)
prompt = ChatPromptTemplate.from_messages([
("system", "You are a helpful assistant. Use tools when needed."),
("human", "{input}"),
("placeholder", "{agent_scratchpad}")
])
agent = create_tool_calling_agent(llm, [calculator], prompt)
executor = AgentExecutor(agent=agent, tools=[calculator], verbose=True)
result = executor.invoke({"input": "What is 15% of 847, and what is the square root?"})
print(result["output"])
LangChain handles the loop, tool dispatch, and message formatting — but the underlying mechanics are identical to what we built. The from-scratch implementation took ~100 lines; the LangChain version is ~20 lines. The trade-off is transparency vs. convenience.
A Real-World Agent Pattern: The Research Agent
Here is a practical, production-relevant agent pattern that Mumbai developers are using in 2026 — a research agent that takes a question, searches for information, synthesises it, and produces a structured report:
# research_agent.py
RESEARCH_SYSTEM_PROMPT = """You are a precise research assistant. Your job is to:
1. Understand the research question
2. Break it into 2-4 specific search queries
3. Execute each search
4. Synthesise the results into a structured report
Output format for the final report:
## Summary
[2-3 sentence executive summary]
## Key Findings
[Bullet points of the most important facts]
## Sources
[List of URLs from search results]
## Confidence Level
[High / Medium / Low — based on how well the search results answered the question]
Always cite sources. Never state something as fact if you did not find it in a search result."""
# Tools for the research agent
RESEARCH_TOOLS = [
# web_search tool (same schema as above)
# Plus a "save_to_file" tool for outputting the report
]
This pattern — specialised system prompt + curated tool set — is how production agents are structured. The general-purpose agent we built first is a learning vehicle; real agents are purpose-built for a specific task class.
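The "save_to_file" tool mentioned in the comments is left unspecified in the snippet above. One way it might look, following the same function-plus-schema convention as tools.py, is sketched here. Everything in it is an assumption for illustration: the parameter names, the `reports` output directory, and the choice to strip directory components from the filename so the model cannot write outside that directory.

```python
import json
from pathlib import Path

def save_to_file(filename: str, content: str, output_dir: str = "reports") -> str:
    """Write the report under output_dir and return a JSON observation."""
    # Keep only the final path component so "../x" cannot escape output_dir
    safe_name = Path(filename).name
    if safe_name in ("", ".", ".."):
        return json.dumps({"error": "filename is empty or invalid"})
    target = Path(output_dir) / safe_name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return json.dumps({"saved": str(target), "length": len(content)})

# Schema exposed to the model; output_dir is deliberately not a model-controlled parameter
SAVE_TO_FILE_SCHEMA = {
    "type": "function",
    "function": {
        "name": "save_to_file",
        "description": "Save the final research report to a text file. Call once, after the report is complete.",
        "parameters": {
            "type": "object",
            "properties": {
                "filename": {"type": "string", "description": "File name only, e.g. 'report.md'."},
                "content": {"type": "string", "description": "The full report text to save."},
            },
            "required": ["filename", "content"],
        },
    },
}
```

A tool that writes to disk is a good example of why curated tool sets matter: the narrower the tool's contract, the less damage a hallucinated argument can do.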
Evaluating Your Agent: Does It Actually Work?
Building an agent that works in demo conditions is straightforward. Building one that works reliably across real inputs is engineering. Use these evaluation criteria:
Task completion rate: On a test set of 50 representative queries, what percentage does the agent complete correctly? Aim for >85% before considering production deployment.
Tool call accuracy: Are tools being called with correct parameters? Log all tool calls and review for parameter hallucinations.
Loop efficiency: What is the average number of steps to complete a task? High step counts (>8 for simple tasks) indicate reasoning inefficiency or tool quality issues.
Failure mode distribution: When the agent fails, how does it fail — does it give up gracefully, loop infinitely, or hallucinate a confident wrong answer? Graceful failures are acceptable; confident wrong answers are not.
# Simple evaluation harness
def evaluate_agent(agent: Agent, test_cases: list[dict]) -> dict:
    results = {"passed": 0, "failed": 0, "errors": []}
    for case in test_cases:
        try:
            # The agent can return None if the model produced no text content
            response = agent.run(case["input"], verbose=False) or ""
            # Your validation logic here (case-insensitive substring match)
            if case["expected_contains"].lower() in response.lower():
                results["passed"] += 1
            else:
                results["failed"] += 1
                results["errors"].append({
                    "input": case["input"],
                    "expected": case["expected_contains"],
                    "got": response[:200]
                })
        except Exception as e:
            results["failed"] += 1
            results["errors"].append({"input": case["input"], "error": str(e)})
    # Guard against division by zero on an empty test set
    results["accuracy"] = results["passed"] / len(test_cases) if test_cases else 0.0
    return results
# Example test cases
test_cases = [
{"input": "What is 25% of 400?", "expected_contains": "100"},
{"input": "What is the square root of 144?", "expected_contains": "12"},
{"input": "What time is it?", "expected_contains": "time"},
]
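The harness above measures only task completion. Loop efficiency and failure-mode distribution need a record per run. As a sketch, assume each run is logged as a dict with a step count and an outcome label; the field names and the labels "success", "loop", and "wrong_answer" are hypothetical, and the sample records are made up for the example.

```python
from collections import Counter

def summarise_runs(runs: list[dict]) -> dict:
    """Each run record is assumed to have 'steps' (int) and 'outcome' (str)."""
    if not runs:
        return {"completion_rate": 0.0, "avg_steps": 0.0, "failure_modes": {}}
    outcomes = Counter(run["outcome"] for run in runs)
    # Everything that is not a success counts as a failure mode
    failures = {k: v for k, v in outcomes.items() if k != "success"}
    return {
        "completion_rate": outcomes["success"] / len(runs),
        "avg_steps": sum(run["steps"] for run in runs) / len(runs),
        "failure_modes": failures,
    }

# Made-up run records for illustration
runs = [
    {"steps": 2, "outcome": "success"},
    {"steps": 3, "outcome": "success"},
    {"steps": 10, "outcome": "loop"},
    {"steps": 1, "outcome": "wrong_answer"},
]
print(summarise_runs(runs))
```

Separating "loop" from "wrong_answer" matters because they call for different fixes: the first points at stopping conditions and deduplication, the second at tool quality or the system prompt.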
What Makes Mumbai's Top GenAI Engineers Stand Out
The developers building production agents at Mumbai's top Fintech companies and GCCs are not just people who can run a pip install langchain and follow a tutorial. They are engineers who:
- Understand the ReAct loop well enough to debug it when it breaks at 2 AM
- Can write a system prompt that produces reliable, auditable agent behaviour across thousands of inputs
- Know when to reach for a framework and when to build from scratch
- Can evaluate agent quality systematically — not just "it works on my test case"
- Understand the cost implications of agent loops (each LLM call costs money; 10-step loops at scale are expensive)
- Have shipped an agent to production and survived the experience
This is not knowledge you gain from reading documentation. It comes from building, breaking, debugging, and rebuilding — ideally with guidance from engineers who have already made the expensive mistakes.
Build Your First Production Agent: Your Next Step
The code in this guide gives you a working agent. The gap between this working agent and a production-ready agent — one that handles edge cases reliably, integrates with real data sources, gets evaluated systematically, and can be extended by a team — is where most developers get stuck.
TechPaathshala's Advanced AI Agent Bootcamp is an intensive, hands-on programme for developers who are ready to go beyond tutorials and build real, production-grade agentic AI systems.
In the bootcamp, you will:
- Build three complete agents from scratch — a research agent, a data analysis agent, and a customer support agent — using the architecture covered in this guide, plus advanced patterns including multi-agent orchestration and stateful LangGraph workflows
- Master production engineering practices — proper evaluation frameworks (RAGAS, DeepEval), logging and observability with LangSmith, cost optimisation for high-volume agent deployments, and failure mode analysis
- Work with real Mumbai use cases — BFSI document intelligence, e-commerce customer support automation, developer productivity agents — building the domain context that makes your portfolio stand out
- Get hands-on with multi-agent systems — CrewAI and LangGraph for complex workflows where multiple specialised agents collaborate on tasks no single agent can handle efficiently
- Deploy to production — Docker containerisation, FastAPI integration, AWS Bedrock deployment, and the CI/CD practices that make agent systems maintainable at scale
The bootcamp is for developers with Python proficiency and some exposure to LLMs. No prior agent-building experience required — the curriculum starts from the architecture covered in this guide and builds to production-grade systems over 8 weeks.
👉 Apply for TechPaathshala's Advanced AI Agent Bootcamp — and build the agent engineering skills that Mumbai's top Fintech companies and GCCs are hiring for right now.
TechPaathshala is a Mumbai-based technology education platform helping developers build production-grade AI skills — from Full Stack development to Agentic AI Engineering.

