SDK reference for task configuration and dataset utilities
The SDK provides the `Task` class for defining agent objectives and dataset utilities for managing task collections.
Field | Type | Description | Default |
---|---|---|---|
`id` | `str \| None` | Unique identifier (UUID recommended) | `None` |
`prompt` | `str` | Task instruction for the agent | Required |
`mcp_config` | `dict[str, Any]` | MCP server configuration | Required |
`setup_tool` | `MCPToolCall \| list[MCPToolCall] \| None` | Tool(s) to prepare environment | `None` |
`evaluate_tool` | `MCPToolCall \| list[MCPToolCall] \| None` | Tool(s) to score performance | `None` |
`system_prompt` | `str \| None` | Additional system prompt | `None` |
`metadata` | `dict[str, Any]` | Extra task metadata | `{}` |

The `mcp_config` field automatically resolves environment variables written in `${VAR_NAME}` syntax. Substitution uses `Template.substitute()` with a defaultdict that returns empty strings for missing variables.
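
A minimal sketch of this substitution behavior using only the standard library. The helper name `resolve_env_vars` and the sample config keys (`hud`, `url`, `headers`) are illustrative assumptions, not part of the SDK, and the SDK's own implementation may differ in detail.

```python
from __future__ import annotations

import os
from collections import defaultdict
from string import Template
from typing import Any


def resolve_env_vars(value: Any, env: dict[str, str] | None = None) -> Any:
    """Recursively replace ${VAR_NAME} placeholders in strings.

    Hypothetical helper. Missing variables resolve to "" because the
    mapping is a defaultdict(str). A stray "$" that is not a valid
    placeholder makes Template.substitute() raise ValueError.
    """
    mapping = defaultdict(str, os.environ if env is None else env)
    if isinstance(value, str):
        return Template(value).substitute(mapping)
    if isinstance(value, dict):
        return {k: resolve_env_vars(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve_env_vars(v, env) for v in value]
    return value


# Example config with one variable that is set and one that is not.
config = {
    "hud": {
        "url": "https://example.invalid/mcp",
        "headers": {"Authorization": "Bearer ${API_KEY}", "X-Extra": "${NOT_SET}"},
    }
}
resolved = resolve_env_vars(config, env={"API_KEY": "secret"})
```

Passing `env` explicitly keeps the example deterministic; leaving it as `None` falls back to `os.environ`.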

- `mcp_config` and `metadata` can be JSON strings
- `setup_tool` and `evaluate_tool` dicts are converted

Parameter | Type | Description | Default |
---|---|---|---|
`name` | `str` | Job name for tracking | Required |
`dataset` | `str \| Dataset \| list[dict]` | HF dataset ID, Dataset object, or task dicts | Required |
`agent_class` | `type[MCPAgent]` | Agent class to instantiate | Required |
`agent_config` | `dict[str, Any] \| None` | Constructor kwargs for agent | `None` |
`max_concurrent` | `int` | Maximum parallel tasks | `50` |
`metadata` | `dict[str, Any] \| None` | Job metadata | `None` |
`max_steps` | `int` | Max steps per task | `40` |
`split` | `str` | Dataset split when loading by ID | `"train"` |
`auto_respond` | `bool` | Use ResponseAgent for continuations | `False` |
`custom_system_prompt` | `str \| None` | Override dataset system prompt | `None` |

Returns: `list[Trace]` - Results in dataset order
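
The combination of a concurrency cap and results in dataset order can be illustrated with plain `asyncio`. This is a sketch of the general technique, not the SDK's implementation; `run_all` and `fake_task` are hypothetical names standing in for the runner and for a single task run.

```python
import asyncio


async def run_all(items, worker, max_concurrent=50):
    """Run worker(item) for every item with at most max_concurrent in flight.

    asyncio.gather returns results in the order the awaitables were passed,
    so the output lines up with the input regardless of completion order.
    """
    semaphore = asyncio.Semaphore(max_concurrent)

    async def bounded(item):
        async with semaphore:
            return await worker(item)

    return await asyncio.gather(*(bounded(item) for item in items))


async def fake_task(n):
    # Stand-in for running one task; later items finish sooner.
    await asyncio.sleep(0.01 * (3 - n))
    return n * n


results = asyncio.run(run_all([0, 1, 2, 3], fake_task, max_concurrent=2))
```

Even though the later items complete first, `results` follows the input order.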

Features:
- `hud.job()`
- `system_prompt.txt`

Fetches `system_prompt.txt` from a HuggingFace dataset repository.

Returns: `str | None` - System prompt text if found

Note: Requires `huggingface_hub` to be installed.
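
One way such a fetch could look, sketched with `hf_hub_download` from `huggingface_hub`. The function name `fetch_prompt_sketch` and the injectable `download` parameter are assumptions made for illustration and testing; the SDK function may handle errors differently.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable


def fetch_prompt_sketch(
    repo_id: str,
    download: Callable[..., str] | None = None,
) -> str | None:
    """Return the text of system_prompt.txt from a dataset repo, or None."""
    if download is None:
        try:
            from huggingface_hub import hf_hub_download
        except ImportError:
            return None  # huggingface_hub is not installed
        download = hf_hub_download
    try:
        # repo_type="dataset" selects a dataset repository rather than a model.
        local_path = download(
            repo_id=repo_id, filename="system_prompt.txt", repo_type="dataset"
        )
        return Path(local_path).read_text(encoding="utf-8")
    except Exception:
        # Missing file, missing repo, or network failure: treat as "not found".
        return None
```

Returning `None` on any failure mirrors the `str | None` return type described above; a stricter variant could re-raise unexpected errors instead.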
Parameter | Type | Description | Default |
---|---|---|---|
`tasks` | `list[dict[str, Any]]` | Task dictionaries (NOT Task objects) | Required |
`repo_id` | `str` | HuggingFace repository ID | Required |
`**kwargs` | `Any` | Additional args for `push_to_hub()` | - |

- `mcp_config` → JSON string
- `setup_tool` → JSON string (if present)
- `evaluate_tool` → JSON string (if present)
- `metadata` → JSON string (if present)

Field | Type | Description | Default |
---|---|---|---|
`name` | `str` | Tool name to call | Required |
`arguments` | `dict[str, Any]` | Tool arguments | `{}` |
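
The dict-to-tool-call conversion and the JSON-string conversions listed earlier can be sketched together. `ToolCallSketch` is a stand-in dataclass that mirrors the two fields in the table above; it is not the SDK's `MCPToolCall`. The helper names, the tool name `setup`, and the `board_size` argument are likewise hypothetical.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolCallSketch:
    """Stand-in mirroring the name/arguments fields (not the SDK class)."""

    name: str
    arguments: dict[str, Any] = field(default_factory=dict)


def to_tool_calls(value):
    """Accept a dict or a list of dicts and return ToolCallSketch object(s)."""
    if isinstance(value, list):
        return [ToolCallSketch(**item) for item in value]
    return ToolCallSketch(**value)


def serialize_task(task: dict[str, Any]) -> dict[str, Any]:
    """Store dict-valued fields as JSON strings, skipping absent optional ones."""
    row = dict(task)
    row["mcp_config"] = json.dumps(task["mcp_config"])
    for key in ("setup_tool", "evaluate_tool", "metadata"):
        if task.get(key) is not None:
            row[key] = json.dumps(task[key])
    return row


task = {
    "prompt": "Reach the 64 tile",
    "mcp_config": {"local": {"command": "echo", "args": ["hi"]}},
    "setup_tool": {"name": "setup", "arguments": {"board_size": 4}},
}
row = serialize_task(task)

# Reading the row back: parse the JSON string, then convert the dict.
restored = to_tool_calls(json.loads(row["setup_tool"]))
```

Storing nested fields as JSON strings keeps every column a flat string, so tasks with differently shaped configs can share one dataset schema.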

`examples/run_evaluation.py`:

`environments/text_2048/2048_taskconfigs.json`: