System Overview
Core Components
1. Agents (hud.agents)
Agents make decisions and call tools.
Agents can auto-create MCP clients from task.mcp_config, so no manual client setup is needed.
2. Tasks (hud.Task)
Tasks define what agents should accomplish.
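As a rough illustration of the shape of a task, the sketch below uses a plain dataclass instead of the real hud.Task class. The field names prompt, mcp_config, setup_tool, and evaluate_tool are taken from this page; the class name TaskSketch, the server entry, the Docker image, and the tool names are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class TaskSketch:
    """Stand-in for hud.Task; not the real class."""

    prompt: str
    mcp_config: dict[str, Any]
    setup_tool: dict[str, Any] = field(default_factory=dict)
    evaluate_tool: dict[str, Any] = field(default_factory=dict)


task = TaskSketch(
    prompt="Raise the counter to 3",
    # Hypothetical MCP server entry: launch an environment image over stdio.
    mcp_config={
        "local": {
            "command": "docker",
            "args": ["run", "--rm", "-i", "example/env:latest"],
        }
    },
    # Each tool reference is a name plus arguments (tool names are made up).
    setup_tool={"name": "setup", "arguments": {"start": 0}},
    evaluate_tool={"name": "evaluate", "arguments": {"target": 3}},
)
```

Keeping setup and evaluation as data (a tool name plus arguments) means the same task description can be replayed against any server that exposes tools with those names.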
The name and arguments in setup/evaluate tools correspond exactly to the tool names and parameters exposed by the MCP server.
3. MCP Clients (hud.clients)
Clients handle the MCP protocol.
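MCP messages are JSON-RPC 2.0, and a tool invocation uses the tools/call method with a tool name and its arguments. The sketch below builds such a request by hand to show roughly what a client sends on the agent's behalf. It is not how hud.clients is implemented, and the tool name move and its argument are hypothetical.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def tool_call_request(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


request = tool_call_request("move", {"direction": "up"})
```

A real client also performs the initialization handshake, lists the available tools, and matches responses to request ids; those parts are omitted here.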
4. Environments
Environments are MCP servers exposing tools.
5. Telemetry (hud.trace)
Telemetry provides real-time observability.
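To make the idea concrete, here is a minimal sketch of tracing as a context manager that records events around a block of work. The name trace_sketch and the event fields are hypothetical; this is not hud.trace, and it collects events in a local list where a real tracer would stream them to a backend.

```python
import time
from contextlib import contextmanager

events: list[dict] = []  # stand-in for a telemetry backend


@contextmanager
def trace_sketch(name: str):
    """Record a start and an end event around a block of work."""
    start = time.monotonic()
    events.append({"trace": name, "event": "start"})
    try:
        yield
    finally:
        # The end event is recorded even if the traced block raises.
        elapsed = time.monotonic() - start
        events.append({"trace": name, "event": "end", "seconds": elapsed})


with trace_sketch("demo-run"):
    # Anything recorded inside the block falls between start and end.
    events.append({"trace": "demo-run", "event": "tool_call", "tool": "move"})
```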
Execution Flow
1. Task Definition: Create a Task with prompt and MCP configuration.
2. Agent Initialization: Agent creates MCP client (if needed) and connects to environment.
3. Setup Phase: Execute setup_tool to initialize environment state.
4. Execution Loop: Agent receives observations, makes decisions, calls tools.
5. Evaluation: Execute evaluate_tool to score performance.
6. Telemetry: All interactions streamed to HUD backend for analysis.
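The six steps above can be sketched end to end with stand-ins. Everything in this example is hypothetical: CounterEnv plays the role of an environment, decide is a trivial scripted policy instead of a model, the task is a plain dict, and an in-memory log stands in for telemetry. The real SDK is not used here, so treat this as an outline of the control flow only.

```python
from typing import Any, Optional


class CounterEnv:
    """Hypothetical environment: a counter the agent must raise to a target."""

    def __init__(self) -> None:
        self.value = 0
        self.log: list[tuple[str, dict]] = []  # stand-in for telemetry

    def call_tool(self, name: str, arguments: dict[str, Any]) -> Any:
        self.log.append((name, arguments))  # step 6: record every interaction
        if name == "setup":
            self.value = arguments["start"]
            return self.value
        if name == "increment":
            self.value += 1
            return self.value
        if name == "evaluate":
            return 1.0 if self.value >= arguments["target"] else 0.0
        raise ValueError(f"unknown tool: {name}")


def decide(observation: int) -> Optional[str]:
    """Trivial stand-in policy: keep incrementing until the counter is 3."""
    return "increment" if observation < 3 else None


def run_task(task: dict[str, Any], env: Optional[CounterEnv] = None,
             max_steps: int = 10) -> float:
    # Step 2: create the connection only if the caller did not supply one.
    if env is None:
        env = CounterEnv()
    # Step 3: setup phase initializes environment state.
    setup = task["setup_tool"]
    observation = env.call_tool(setup["name"], setup["arguments"])
    # Step 4: execution loop of observe, decide, call tool.
    for _ in range(max_steps):
        tool = decide(observation)
        if tool is None:
            break
        observation = env.call_tool(tool, {})
    # Step 5: evaluation scores the final state.
    evaluate = task["evaluate_tool"]
    return env.call_tool(evaluate["name"], evaluate["arguments"])


# Step 1: task definition.
task = {
    "prompt": "Raise the counter to 3",
    "setup_tool": {"name": "setup", "arguments": {"start": 0}},
    "evaluate_tool": {"name": "evaluate", "arguments": {"target": 3}},
}
env = CounterEnv()
reward = run_task(task, env)
```

Because setup, actions, and evaluation all go through the same call_tool entry point, one log captures the whole run, which is the property the telemetry step relies on.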
Key Design Principles
- Protocol-First: Everything speaks MCP
- Composable: Mix and match agents, environments, evaluations
- Observable: Built-in telemetry for every interaction
- Testable: Reproducible evaluations with Docker
- Extensible: Easy to add new agents or environments
The MCPServer class wraps FastMCP with lifecycle management, making it easy to build Docker-based environments.
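The general pattern of decorator-registered tools plus start and stop hooks can be imitated with the standard library alone. The sketch below is an assumption-laden stand-in: EnvServerSketch, its tool decorator, and its context-manager lifecycle are hypothetical names chosen for illustration, not the MCPServer or FastMCP API.

```python
from typing import Any, Callable


class EnvServerSketch:
    """Hypothetical stand-in for an environment server; not MCPServer."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.tools: dict[str, Callable[..., Any]] = {}
        self.running = False

    def tool(self, func: Callable[..., Any]) -> Callable[..., Any]:
        """Register a function as a tool under its own name."""
        self.tools[func.__name__] = func
        return func

    def __enter__(self) -> "EnvServerSketch":
        self.running = True  # lifecycle: acquire resources here
        return self

    def __exit__(self, *exc_info: object) -> None:
        self.running = False  # lifecycle: release resources here

    def call_tool(self, name: str, arguments: dict[str, Any]) -> Any:
        if not self.running:
            raise RuntimeError("server is not running")
        return self.tools[name](**arguments)


server = EnvServerSketch("calculator")


@server.tool
def add(a: int, b: int) -> int:
    """Hypothetical tool exposed by the environment."""
    return a + b


with server:
    result = server.call_tool("add", {"a": 2, "b": 3})
```

Tying startup and shutdown to a single scope is one way to ensure a containerized environment releases its resources even when a run fails partway through.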