Define Tasks for Training

Tasks format

HUD accepts tasksets in either of two formats:

A single JSON file containing a list of task objects (recommended)

[
  {
    "id": "browser_2048_128",
    "prompt": "Reach 128 in 2048.",
    "mcp_config": {
      "hud": {
        "url": "https://mcp.hud.so/v3/mcp",
        "headers": {
          "Authorization": "Bearer ${HUD_API_KEY}",
          "Mcp-Image": "hudevals/hud-browser:0.1.3"
        }
      }
    },
    "setup_tool": {"name": "launch_app", "arguments": {"app_name": "2048"}},
    "evaluate_tool": {"name": "evaluate", "arguments": {"name": "game_2048_max_number", "arguments": {"target": 128}}}
  }
]

Save as basic-2048.json and run:

hud eval basic-2048.json
hud rl basic-2048.json
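The example header contains a `${HUD_API_KEY}` placeholder. This page does not say how the placeholder is resolved. Assuming it is filled from an environment variable of the same name, a pre-flight check along these lines could catch an unset key before a run. The function name `unset_placeholders` is hypothetical and is not part of the HUD CLI or SDK.

```python
import os
import re

# Matches ${NAME} placeholders such as ${HUD_API_KEY}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def unset_placeholders(task_text, env=None):
    """Return placeholder names in task_text with no non-empty value in env.

    Assumption: placeholders map to environment variables of the same name.
    """
    env = os.environ if env is None else env
    names = sorted(set(PLACEHOLDER.findall(task_text)))
    return [name for name in names if not env.get(name)]
```

Passing the raw text of basic-2048.json to this function returns an empty list when every referenced variable is set.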

A JSONL file with one task object per line

In either format, each task object uses the following fields:

prompt: instruction for the agent
mcp_config: where to run the environment (local Docker or remote MCP)
setup_tool (optional): a tool call that prepares the environment
evaluate_tool: a tool call that computes the reward
system_prompt (optional): extra guidance for the agent
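As a rough illustration of the field list above, the following sketch checks a task object before it is handed to the CLI. It treats the fields not marked optional as required and expects tool calls to be objects with a `name`, as in the examples. The `id` field appears in the examples but is not described here, so the sketch does not check it. The helper `check_task` is hypothetical and is not part of HUD.

```python
# Fields the list above does not mark as optional.
REQUIRED_FIELDS = ("prompt", "mcp_config", "evaluate_tool")


def check_task(task):
    """Return a list of problems found in one task object (empty if none)."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in task
    ]
    # In the examples, tool calls are objects with "name" and "arguments".
    for key in ("setup_tool", "evaluate_tool"):
        tool = task.get(key)
        if tool is not None and not (isinstance(tool, dict) and "name" in tool):
            problems.append(f"{key} should be an object with a 'name'")
    return problems
```

A task that passes this check can still be rejected by HUD; the sketch only mirrors the fields listed on this page.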

Minimal JSONL example

{"id": "browser_2048_128", "prompt": "Reach 128 in 2048.", "mcp_config": {"hud": {"url": "https://mcp.hud.so/v3/mcp", "headers": {"Authorization": "Bearer ${HUD_API_KEY}", "Mcp-Image": "hudevals/hud-browser:0.1.3"}}}, "setup_tool": {"name": "launch_app", "arguments": {"app_name": "2048"}}, "evaluate_tool": {"name": "evaluate", "arguments": {"name": "game_2048_max_number", "arguments": {"target": 128}}}}

Save as basic-2048.jsonl and run:

hud eval basic-2048.jsonl
hud rl basic-2048.jsonl
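Because both formats carry the same task objects, a JSON list can be turned into JSONL with the standard library alone. This is a minimal sketch; the function name `json_list_to_jsonl` is illustrative and not a HUD utility.

```python
import json


def json_list_to_jsonl(text):
    """Convert the text of a JSON list of tasks into JSONL text."""
    tasks = json.loads(text)
    if not isinstance(tasks, list):
        raise ValueError("expected a JSON list of task objects")
    # One compact JSON object per line, each terminated by a newline.
    return "".join(json.dumps(task) + "\n" for task in tasks)
```

To convert a file, read basic-2048.json as text, pass it through the function, and write the result to basic-2048.jsonl.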

Hosting on HuggingFace

You can host tasksets on the Hugging Face Hub and fetch them with:

hud get hud-evals/basic-2048

The command downloads the JSONL task file and places it in your project directory.

Tips

Keep tasks self-contained; use setup_tool to open apps or load data
Ensure evaluate_tool returns a numeric reward per episode
Use small task counts to iterate quickly; scale up once stable
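One way to follow the last tip is to cut a small subset out of a larger JSONL taskset and iterate on that. The sketch below keeps the first few non-empty lines; the helper name `take_first_tasks` is hypothetical and not part of HUD.

```python
def take_first_tasks(jsonl_text, limit):
    """Return JSONL text containing at most `limit` tasks from jsonl_text."""
    # Skip blank lines so each kept line is one task object.
    lines = [line for line in jsonl_text.splitlines() if line.strip()]
    return "".join(line + "\n" for line in lines[:limit])
```

Writing the result to a separate file leaves the full taskset untouched for later, larger runs.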
