Learn how to adapt existing software into MCP environments using patterns from the HUD SDK. All examples link to actual implementations in hud-evals/hud-python.

Browser Automation Pattern

The HUD SDK provides powerful computer tools that can control any browser-based application. See how remote_browser implements this pattern.

Using HUD Computer Tools

from hud.tools.computer import HudComputerTool
from hud.tools.executors.base import BaseExecutor

# Browser executor wraps Playwright
class BrowserExecutor(BaseExecutor):
    def __init__(self, playwright_tool):
        self.browser = playwright_tool
    
    async def screenshot(self) -> str:
        # Return base64 screenshot from browser
        return await self.browser.screenshot()
    
    async def click(self, x: int, y: int):
        # Translate to browser coordinates
        await self.browser.click(x, y)

# Use with computer tool
executor = BrowserExecutor(playwright_tool)
computer_tool = HudComputerTool(executor=executor)
See the full implementation:

Multiple Provider Support

# Support different cloud browsers
provider = os.getenv("BROWSER_PROVIDER", "anchorbrowser")
browser_provider = get_provider(provider)

# Each provider implements the same interface
cdp_url = await browser_provider.launch()
browser = await playwright.connect_over_cdp(cdp_url)
See providers directory for implementations.

Game Environment Pattern

Simple stateful applications like games use custom tools. See text_2048 for the pattern.

Custom Tool Implementation

from hud.tools.base import BaseTool
from mcp.types import ContentBlock, TextContent

class MoveTool(BaseTool):
    """Custom tool for game moves."""
    
    def __init__(self, env):
        super().__init__(
            name="move",
            description="Move tiles in a direction",
            parameters={
                "type": "object",
                "properties": {
                    "direction": {"type": "string", "enum": ["up", "down", "left", "right"]}
                },
                "required": ["direction"]
            }
        )
        self.env = env
    
    async def __call__(self, direction: str) -> list[ContentBlock]:
        """Execute the move - this is what BaseTool calls."""
        # Perform the move
        self.env.move(direction)
        
        # Return visual state
        board_display = self.env.get_board_ascii()
        return [TextContent(text=board_display, type="text")]
See MoveTool implementation.

Docker Base Images

HUD provides optimized base images for common use cases:

Browser Automation

FROM hudpython/browser-base:latest
# Includes Chromium, Playwright, and dependencies

# Add your automation code
COPY src/ /app/src/
CMD ["python", "-m", "hud_controller.server"]

Desktop GUI

FROM hudpython/novnc-base:latest
# Ubuntu with XFCE, VNC, and noVNC pre-configured

# Install your GUI application
RUN apt-get update && apt-get install -y \
    your-gui-app

# VNC is available on port 5901, noVNC on 8080

Custom Base Images

When building from scratch:
FROM python:3.11-slim

# Install system dependencies
RUN apt-get update && apt-get install -y \
    required-packages \
    && rm -rf /var/lib/apt/lists/*

# Python environment
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1

WORKDIR /app
See base image examples in environment Dockerfiles.

Environment Context Pattern

Share state between tools using environment context:
# Define context class
class BrowserEnvironmentContext:
    def __init__(self, browser):
        self.browser = browser
        self.page = None
    
    async def goto(self, url: str):
        self.page = await self.browser.new_page()
        await self.page.goto(url)

# Share with tools in server.py
@mcp.initialize
async def initialize_environment(ctx):
    browser = await launch_browser()
    env = BrowserEnvironmentContext(browser)
    
    # Make env available to all tools
    global environment
    environment = env

Common Return Types

All tools should return standardized types for consistency.

ContentBlock for Tool Output

from mcp.types import ContentBlock, TextContent, ImageContent

# Text output
return [TextContent(text="Operation complete", type="text")]

# Image output
return [ImageContent(data=base64_data, mimeType="image/png", type="image")]

# Multiple outputs
return [
    TextContent(text="Screenshot captured", type="text"),
    ImageContent(data=screenshot, mimeType="image/png", type="image")
]

EvaluationResult for Scoring

from hud.tools.types import EvaluationResult

return EvaluationResult(
    reward=0.8,  # Score between 0-1
    done=True,   # Task complete?
    info={       # Additional context
        "score": 80,
        "threshold": 75,
        "details": "Form submitted successfully"
    }
)
See types.py for full definitions.

Tool Discovery and Resources

Expose available functions as MCP resources:
@mcp.resource("setup://functions")
async def list_setup_functions():
    """List all available setup functions."""
    return {
        "functions": [
            {"name": name, "description": func.__doc__}
            for name, func in setup_functions.items()
        ]
    }
See resource examples.

Error Handling Patterns

Consistent error handling across tools:
from mcp import McpError
from mcp.types import INTERNAL_ERROR

try:
    result = await risky_operation()
except SpecificError as e:
    # Return error as content
    return [TextContent(text=f"Operation failed: {e}", type="text")]
except Exception as e:
    # Fatal errors
    raise McpError(INTERNAL_ERROR, f"Unexpected error: {e}")

Next Steps

For step-by-step implementation guide, see the environments README.