Version 0.4.53 - Latest stable release

What is HUD?

HUD connects AI agents to software environments using the Model Context Protocol (MCP). Whether you’re evaluating existing agents, building new environments, or training models with reinforcement learning (RL), HUD provides the infrastructure.

Why HUD?

  • 🔌 MCP-native: Any agent can connect to any environment
  • 📡 Live telemetry: Debug every tool call at hud.so
  • 🚀 Production-ready: From local Docker to cloud scale
  • 🎯 Built-in benchmarks: OSWorld-Verified, SheetBench-50, and more
  • 🔧 CLI tools: Create, develop, run, evaluate, and train with hud init, hud dev, hud run, hud eval, and hud rl

3-minute quickstart

Run your first agent evaluation with zero setup

Clone starter project

uvx hud-python quickstart

Quick Example

import asyncio, os, hud
from hud.datasets import Task
from hud.agents import ClaudeAgent

async def main():
    # Define evaluation task with remote MCP
    task = Task(
        prompt="Win a game of 2048 by reaching the 128 tile",
        mcp_config={
            "hud": {
                "url": "https://mcp.hud.so/v3/mcp",
                "headers": {
                    "Authorization": f"Bearer {os.getenv('HUD_API_KEY')}",
                    "Mcp-Image": "hudevals/hud-text-2048:0.1.3"
                }
            }
        },
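        # Setup: call the environment's "setup" tool to create a 4x4 board
        setup_tool={"name": "setup", "arguments": {"name": "board", "arguments": {"board_size": 4}}},
        # Evaluate: call the "evaluate" tool with the max_number evaluator.
        # Note that the target here is 64, while the prompt asks for the 128 tile.
        evaluate_tool={"name": "evaluate", "arguments": {"name": "max_number", "arguments": {"target": 64}}}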
    )
    
    # Run agent (auto-creates MCP client)
    agent = ClaudeAgent()
    result = await agent.run(task)
    print(f"Score: {result.reward}")

asyncio.run(main())
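
The example above builds the Authorization header with os.getenv, which yields "Bearer None" when HUD_API_KEY is unset. A minimal sketch of one way to fail early is shown below. The helper build_mcp_config is hypothetical and not part of the HUD SDK; it only reproduces the mcp_config dictionary shape and URL used in the example.

```python
import os
from typing import Optional

# URL copied from the Quick Example above.
HUD_MCP_URL = "https://mcp.hud.so/v3/mcp"


def build_mcp_config(image: str, api_key: Optional[str] = None) -> dict:
    """Return an mcp_config dict shaped like the one in the Quick Example.

    Hypothetical helper for illustration; not provided by HUD.
    """
    # Prefer an explicit key, otherwise fall back to the environment.
    key = api_key or os.getenv("HUD_API_KEY")
    if not key:
        # Stop here rather than sending "Bearer None" to the server.
        raise RuntimeError("HUD_API_KEY is not set")
    return {
        "hud": {
            "url": HUD_MCP_URL,
            "headers": {
                "Authorization": f"Bearer {key}",
                "Mcp-Image": image,
            },
        }
    }
```

Under these assumptions, the result could be passed as mcp_config=build_mcp_config("hudevals/hud-text-2048:0.1.3") when constructing the Task.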

Community

Are you a startup building agents?

📅 Hop on a call or 📧 founders@hud.so