How Agents Work
An agent follows this lifecycle: it keeps calling your LLM and executing tools until the LLM stops requesting tools, indicating the task is complete.

The Four Required Methods
To create an agent, you implement four methods that bridge your LLM with MCP’s tool system.

Understanding When Each Method is Called
The agent loop calls your methods in this sequence:
- get_system_messages() - Once at start
- format_blocks() - Converts the initial task prompt
- get_response() - Gets the LLM decision and appends the assistant message to the conversation
- format_tool_results() - After each tool execution
- Back to get_response() until done
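This call sequence can be illustrated with a simplified, self-contained sketch. The classes, message shapes, and LLM/tool stand-ins below are illustrative only - they mimic the described loop, not the actual SDK types:

```python
from dataclasses import dataclass, field


@dataclass
class ToyResponse:
    """Stand-in for an LLM response: text plus any requested tool calls."""
    content: str
    tool_calls: list = field(default_factory=list)  # [(tool_name, kwargs), ...]


class ToyAgent:
    """Illustrative agent implementing the four methods in the order above."""

    def __init__(self, llm, tools):
        self.llm = llm          # callable: messages -> ToyResponse
        self.tools = tools      # dict: tool name -> callable

    def get_system_messages(self):
        # Called once at the start
        return [{"role": "system", "content": "You are a helpful agent."}]

    def format_blocks(self, prompt):
        # Converts the initial task prompt into user messages
        return [{"role": "user", "content": prompt}]

    def get_response(self, messages):
        # Gets the LLM decision; the loop appends the assistant message
        return self.llm(messages)

    def format_tool_results(self, results):
        # Converts tool outputs back into messages after each execution
        return [{"role": "tool", "content": str(r)} for r in results]

    def run(self, prompt):
        messages = self.get_system_messages() + self.format_blocks(prompt)
        while True:
            response = self.get_response(messages)
            messages.append({"role": "assistant", "content": response.content})
            if not response.tool_calls:
                break  # no tool requests -> task complete
            results = [self.tools[name](**args) for name, args in response.tool_calls]
            messages.extend(self.format_tool_results(results))
        return messages
```

Running this with a stub LLM that requests one tool call and then stops produces the system/user/assistant/tool/assistant history described in the loop.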
What MCPAgent Does For You
The Agent Loop
The base MCPAgent class handles the entire execution loop. When you call agent.run(task):
- Initialization Phase
  - Connects to MCP servers (auto-creates a client from task.mcp_config if needed)
  - Discovers available tools from all connected servers
  - Applies tool filtering (allowed/disallowed lists)
  - Identifies lifecycle tools (setup, evaluate, response)
- Setup Phase (if task.setup_tool is provided)
  - Executes setup tools (e.g., navigate to a website, initialize the environment)
  - Optionally appends setup output to the initial context (controlled by append_setup_output)
  - Can include initial screenshots (controlled by initial_screenshot)
- Main Execution Loop (the method sequence described above)
- Evaluation Phase (if task.evaluate_tool is provided)
  - Runs evaluation tools to calculate the reward
  - Extracts the reward from the result (looks for “reward”, “grade”, or “score” keys)
- Returns a Trace object with the full execution history
Tool Management
Tool Discovery & Filtering
- Available Tools: Retrieved via self.get_available_tools() - already filtered
- Lifecycle Tools: Automatically detected and hidden from your LLM
- Response Tools: Auto-detected (tools with “response” in the name) for task completion
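The filtering rules above can be sketched as a standalone function. The function name, signature, and the fixed lifecycle-name set are illustrative assumptions; only the allowed/disallowed lists and the hiding of lifecycle and "response"-named tools come from the description:

```python
def filter_tools(tools, allowed=None, disallowed=None,
                 lifecycle=("setup", "evaluate", "response")):
    """Illustrative filtering: apply allowed/disallowed lists, then hide
    lifecycle tools (which the agent loop calls itself) from the LLM."""
    visible = []
    for name in tools:
        if allowed is not None and name not in allowed:
            continue  # not on the allowed list
        if disallowed is not None and name in disallowed:
            continue  # explicitly disallowed
        if name in lifecycle or "response" in name:
            continue  # lifecycle/response tools are auto-detected and hidden
        visible.append(name)
    return visible
```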
Client Management
MCPAgent handles the complex client lifecycle for you.

Error Handling
MCPAgent provides robust error handling:
- Connection Errors: Helpful messages about MCP server availability
- Tool Errors: Captured and returned as MCPToolResult with isError=True
- Timeout Handling: Graceful shutdown on tool execution timeouts
- Trace Always Returns: Even on errors, you get a Trace object with details
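The tool-error behavior - capturing exceptions as results with isError=True rather than crashing the loop - can be mimicked with a toy result type. ToolResult and safe_call here are simplified stand-ins, not the SDK's MCPToolResult:

```python
from dataclasses import dataclass


@dataclass
class ToolResult:
    """Simplified stand-in for an MCP tool result with an isError flag."""
    content: str
    isError: bool = False


def safe_call(tool, **args) -> ToolResult:
    """Run a tool, capturing any exception as an error result instead of
    letting it crash the agent loop."""
    try:
        return ToolResult(content=str(tool(**args)))
    except Exception as exc:
        return ToolResult(content=f"{type(exc).__name__}: {exc}", isError=True)
```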
Message Accumulation
Messages build up over the conversation: get_response() receives the full conversation history each time, allowing your LLM to maintain context.
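A toy illustration of that accumulation, using the same simplified message shape as the sketches above (not the SDK's message types):

```python
# The history grows each turn: the loop appends the assistant message
# and any tool results, then passes the whole list back to get_response().
messages = [
    {"role": "system", "content": "You are a helpful agent."},  # get_system_messages()
    {"role": "user", "content": "Open the settings page."},     # format_blocks()
]

# Turn 1: get_response() sees 2 messages; a tool runs afterwards
messages.append({"role": "assistant", "content": "I'll navigate there."})
messages.append({"role": "tool", "content": "navigated to /settings"})

# Turn 2: get_response() now sees all 4 messages, including the tool
# output from turn 1, so the LLM keeps full context
```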
Advanced Features
Response Agent Integration

Testing Your Agent
Test your agent on a simple task.

Built-in Agents
HUD provides built-in agents for common LLM providers.

Next Steps
See Also
- hud eval - Run agents on tasks/datasets from the CLI
- hud rl - Train agents with GRPO on your datasets
- Agents (SDK Reference) - API details and built-in agents