The hud rl command trains an agent with GRPO on a set of tasks, either locally or via the HUD remote service.

Usage

hud rl [TASKS_FILE|DATASET] [MODEL] [OPTIONS]

Arguments

tasks_file
string
Path to tasks JSON/JSONL file or HuggingFace dataset name. If omitted, looks for a tasks file in the current directory.
model
string
Model to train (default: interactive selection)
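
Both arguments are positional. As a quick illustration (tasks.jsonl and llama3.1 below are placeholder names):

# Train a specific model on a local tasks file
hud rl tasks.jsonl llama3.1

# With no arguments: a tasks file is located (or picked interactively) and the model is selected interactively
hud rl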

Options

--config
string
Path to an existing configuration file. Short: -c
--output-dir
string
default:"/checkpoints"
Output directory for checkpoints. Short: -o
--restart
boolean
default:"false"
Restart the vLLM server before training
--verbose
boolean
default:"false"
Enable verbose output. Short: -v
--no-ddp
boolean
default:"false"
Disable DistributedDataParallel (even with multiple GPUs)
--ddp-gpus
string
Specific GPUs for DDP (e.g., 0,1,2,3)
--vllm-gpu
integer
Specific GPU for vLLM server
--local
boolean
default:"false"
Run training locally instead of on the remote HUD server
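
As a rough illustration, several of these options can be combined in a single invocation; the model name, checkpoint path, and flag combination below are placeholders, not a prescribed setup:

# Local training with a custom checkpoint directory and verbose output
hud rl tasks.json llama3.1 --local --output-dir ./my-checkpoints --verbose

# The same invocation using the documented short flags
hud rl tasks.json llama3.1 --local -o ./my-checkpoints -v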

Behavior

  • If no tasks file is provided, an interactive picker helps locate one.
  • Remote mode (the default) automatically converts tasks to remote MCP, building and pushing as needed, then launches remote training.
  • Local mode runs training on your machine (delegated to local_runner).

Examples

# Remote (default): auto-convert tasks to remote, then train
hud rl tasks.json claude-rl

# Local training with GPU selection
hud rl tasks.json llama3.1 --local --ddp-gpus 0,1 --vllm-gpu 0

# Use a dataset directly (remote)
hud rl hud-evals/SheetBench-50 claude-rl
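
A further illustrative variant, assuming an existing configuration file (rl-config.json is a placeholder filename):

# Reuse an existing config and restart the vLLM server before training
hud rl tasks.json llama3.1 --config rl-config.json --restart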

See Also