Environment Spec

If your Docker image starts a stdio MCP server on container startup, it is a hud environment.

Architecture

The MCP server can be written in any language or framework.
Your software can be anything (service, desktop app, game, scripts). The connection between them is your choice.
Logs must go to stderr; stdout is reserved for MCP protocol packets.

Minimal requirements

Docker image starts a process that speaks MCP on stdout/stdin (stdio) by default.
No non‑MCP output on stdout (all logging to stderr).
No required file layout, framework, or endpoints.

Recommended (for HUD RL/evals): provide tools named setup and evaluate.

Make it runnable remotely (mcp.hud.so)

Remote execution is built‑in. Push your image, then either:

CLI (default is remote):

export HUD_API_KEY=...
hud push
hud run myorg/my-env:latest   # remote container with MCP

Programmatic (Task config):

{
  "mcp_config": {
    "hud": {
      "url": "https://mcp.hud.so/v3/mcp",
      "headers": {
        "Authorization": "Bearer ${HUD_API_KEY}",
        "Mcp-Image": "myorg/my-env:latest"
      }
    }
  }
}

Run arbitrary configs locally or remotely

Local Docker (stdio):

hud run myorg/my-env:latest --local --transport stdio

Local HTTP proxy (for inspectors):

hud run myorg/my-env:latest --local --transport http --port 8765

Remote (default):

hud run myorg/my-env:latest

Notes

Use hud dev only for local hot‑reload workflows; for production, build once and hud run.
Use hud debug only when startup/compliance issues occur.
Any MCP‑speaking Docker image works; controller/backend split is optional.

Task config examples

from hud.datasets import Task

task = Task(
    prompt="...",
    mcp_config={
        "hud": {
            "url": "https://mcp.hud.so/v3/mcp",
            "headers": {
                "Authorization": "Bearer ${HUD_API_KEY}",
                "Mcp-Image": "myorg/my-env:latest"
            }
        }
    },
)

The same structure is used by hud init’s template and by programmatic tasks.

Programmatic (inline in code):

{
  "mcp_config": {
    "hud": {
      "url": "https://mcp.hud.so/v3/mcp",
      "headers": {
        "Authorization": "Bearer ${HUD_API_KEY}",
        "Mcp-Image": "myorg/my-env:latest"
      }
    }
  },
  "agent_tools": ["act"],
  "setup_tool": { "name": "setup", "arguments": {} },
  "evaluate_tool": { "name": "evaluate", "arguments": {"target": 10} }
}

File tasks.json (created by hud init templates), local dev variant:

[
  {
    "prompt": "Increment the counter to reach 10",
    "mcp_config": {
      "local": {
        "command": "docker",
        "args": ["run", "--rm", "-i", "my-env:latest"]
      }
    },
    "agent_tools": ["act"],
    "setup_tool": { "name": "setup", "arguments": {} },
    "evaluate_tool": { "name": "evaluate", "arguments": {"target": 10} }
  }
]

Switching this file to remote is as simple as replacing the mcp_config with the hud section shown above (or using hud rl, which will help convert it automatically). Run tasks with either the CLI or an agent:

# Evaluate tasks with an agent backend
hud eval tasks.json --agent claude

or programmatically by passing the Task to your agent (e.g., ClaudeAgent, OpenAIChatAgent).

Get Started

Ideas

Environments

RL

Agents

CLI Reference

SDK Reference

Environment Spec

Architecture

Minimal requirements

Make it runnable remotely (mcp.hud.so)

Run arbitrary configs locally or remotely

Notes

Task config examples

Get Started

Ideas

Environments

RL

Agents

CLI Reference

SDK Reference

​Architecture

​Minimal requirements

​Make it runnable remotely (mcp.hud.so)

​Run arbitrary configs locally or remotely

​Notes

​Task config examples

Architecture

Minimal requirements

Make it runnable remotely (mcp.hud.so)

Run arbitrary configs locally or remotely

Notes

Task config examples