HUD Documentation — Evaluations and RL Environments.

Build MCP environments that wrap any software for agent interaction. Think of it in three phases:

Phase 1: Environment - Wrap software in MCP tools
Phase 2: Tasks - Define evaluation scenarios
Phase 3: Agents - Run evaluations and training

Phase 1 · Create a project (2 min)

# Pick a template: blank, deep-research, browser
hud init my-env
cd my-env

Start development servers:

# Terminal 1 - Environment backend
cd environment && uv run uvicorn server:app --reload

# Terminal 2 - MCP server  
cd server && uv run hud dev

Edit-save-test flow

Open server/tools.py and add or tweak a tool.
Save the file; the MCP server restarts automatically.
Visit http://localhost:8765/docs to test your tools.

Phase 2 · Write Tasks (2 min)

Build your environment image first (run this from the project's top-level folder):

hud build

Create tasks.json with an mcp_config that starts the image via docker run:

{
  "prompt": "Complete task",
  "mcp_config": {
    "local": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "my-env:0.1.0"]
    }
  },
  ...your setup and evaluation tools
}

See Task System or the hud init README for details.
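
If you prefer to generate the task entry rather than hand-write it, the structure above can be sketched in Python. This is a minimal sketch using only the standard library: the helper names make_task and check_task are hypothetical, it covers only the prompt and mcp_config fields shown above, and it leaves out the setup and evaluation tool entries because their shape is not given here.

```python
import json


def make_task(prompt: str, image: str) -> dict:
    """Build a task entry with the same fields as the example above."""
    return {
        "prompt": prompt,
        "mcp_config": {
            "local": {
                "command": "docker",
                # -i keeps stdin open so the container can talk MCP over stdio
                "args": ["run", "--rm", "-i", image],
            }
        },
        # Setup and evaluation tool entries would go here (not shown in this guide).
    }


def check_task(task: dict) -> list[str]:
    """Return a list of problems found in a task entry (empty if none)."""
    problems = []
    if not task.get("prompt"):
        problems.append("missing prompt")
    local = task.get("mcp_config", {}).get("local", {})
    if local.get("command") != "docker":
        problems.append("mcp_config.local.command should be 'docker'")
    args = local.get("args", [])
    if "run" not in args:
        problems.append("args should start a container with 'run'")
    if "-i" not in args:
        problems.append("args should include '-i' to keep stdin open")
    return problems


task = make_task("Complete task", "my-env:0.1.0")
print(json.dumps(task, indent=2))
```

Running check_task on an entry before saving it catches the most common slip, a docker run line without -i, before you spend time on an evaluation run.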

Phase 3 · Run Agents

# Test with agents
hud eval tasks.json

# Deploy to registry
hud push

# Train agents on your tasks
hud rl tasks.json

Cheatsheet

Action	Command
Create env	`hud init my-env -p blank`
Hot-reload dev	`hud dev --build`
Interactive test	`hud dev --interactive`
Troubleshoot	`hud debug my-env:dev`
Build image	`hud build`
Push to registry	`hud push`
RL training	`hud rl tasks.json`

Learn more →

Blank template walkthrough: environments/blank/README.md in the repo.
Technical spec: /build-environments/spec
CLI reference: hud init • hud dev • hud build • hud push • hud run

Have fun – and remember: stderr for logs, stdout for MCP!
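
The stderr/stdout rule above matters because anything printed to stdout can corrupt the MCP message stream. One way to follow it in Python is to route all logging to stderr explicitly. This is a minimal sketch using only the standard library; the logger name "my_env" and the helper configure_logging are hypothetical, not part of the hud SDK.

```python
import logging
import sys


def configure_logging(level: int = logging.INFO) -> logging.Logger:
    """Return a logger that writes only to stderr, leaving stdout untouched."""
    logger = logging.getLogger("my_env")  # hypothetical logger name
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate handlers on hot reload
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # keep records away from root handlers
    return logger


log = configure_logging()
log.info("tool server starting")  # goes to stderr, not stdout
```

Use log.info and friends instead of print inside tools, so diagnostic output never mixes with protocol traffic.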
