Question-Answering
A specialized environment for question-answering tasks
QA Environment
Introduction
The qa environment is a specialized, non-interactive environment designed for question-answering tasks. The agent receives context or a question via the Task.prompt and is expected to provide a final text response.
Setup
No environment-specific setup actions are typically required for qa tasks. The question or context is provided directly in the Task.prompt.
Refer to Task Setup Configuration for the general concept of how setup steps can be defined in a Task, although they are generally not needed for qa.
Step Interaction
Agents interact with the qa environment primarily by submitting their final answer.
- The agent receives the Task.prompt in the initial Observation.
- The agent processes the prompt and determines its answer.
- The agent sends a single ResponseAction containing the answer text to env.step().
The environment stores the text from the first ResponseAction it receives in an internal env.final_response attribute for evaluation.
Other CLAs: Although they are part of the CLA standard, other actions (such as ClickAction, TypeAction, ScrollAction, etc.) are neither processed nor relevant in the standard qa environment.
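The interaction described above can be sketched with a small stand-in model. This is not the real environment implementation: ToyQAEnv, its reset() method, the list-based step() signature, and the simplified ResponseAction and ClickAction classes are all hypothetical, defined here only to illustrate that the first response is stored and other actions are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ResponseAction:
    """Stand-in for the CLA action that carries the agent's final answer."""
    text: str


@dataclass
class ClickAction:
    """Stand-in for a non-response CLA action; this toy environment ignores it."""
    x: int
    y: int


class ToyQAEnv:
    """Hypothetical model of the qa step behavior (not the real class)."""

    def __init__(self, prompt: str) -> None:
        self.prompt = prompt
        self.final_response: Optional[str] = None

    def reset(self) -> str:
        # Assumption: the initial observation simply carries the task prompt.
        return self.prompt

    def step(self, actions: list) -> None:
        for action in actions:
            # Only the first ResponseAction is stored; everything else is skipped.
            if isinstance(action, ResponseAction) and self.final_response is None:
                self.final_response = action.text


env = ToyQAEnv(prompt="What is the capital of France?")
observation = env.reset()
env.step([ClickAction(x=10, y=20)])       # ignored: not a ResponseAction
env.step([ResponseAction(text="Paris")])  # stored as the final response
env.step([ResponseAction(text="Lyon")])   # does not overwrite the first response
print(env.final_response)
```

In this sketch the agent only needs one call to step() with a ResponseAction; the extra calls are there to show which actions have no effect.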
Evaluate
Evaluation logic is defined in the evaluate attribute of the Task and triggered by env.evaluate(). This logic compares the env.final_response (the text submitted by the agent via ResponseAction) against expected criteria.
Common evaluation methods for qa tasks:
- response_includes(substring: str | list[str]): Checks if the response text contains the specified substring, or all of the substrings in the provided list.
- response_is(expected_text: str): Checks for an exact, case-sensitive match with the expected_text.
- response_match(pattern: str): Checks if the response text matches the provided regular expression pattern.
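As a rough illustration of what these three checks compute, the following plain-Python functions mirror the descriptions above. They are a sketch, not the library's evaluators: the extra response parameter, the case-sensitive substring test in response_includes, and the use of re.search (a match anywhere in the text) in response_match are assumptions made for this example.

```python
from __future__ import annotations

import re


def response_includes(response: str, substring: str | list[str]) -> bool:
    """True if the response contains the substring, or every substring in the list."""
    needles = [substring] if isinstance(substring, str) else substring
    return all(needle in response for needle in needles)


def response_is(response: str, expected_text: str) -> bool:
    """True only on an exact, case-sensitive match."""
    return response == expected_text


def response_match(response: str, pattern: str) -> bool:
    """True if the regular expression matches anywhere in the response (assumed)."""
    return re.search(pattern, response) is not None


answer = "The capital of France is Paris."
print(response_includes(answer, "Paris"))               # True
print(response_includes(answer, ["France", "Berlin"]))  # False: "Berlin" is missing
print(response_is(answer, "Paris"))                     # False: not an exact match
print(response_match(answer, r"\bParis\b"))             # True
```

The exact-match check is the strictest of the three, so substring or pattern checks are usually a better fit when the agent is likely to answer in a full sentence.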
Note: The exact names and availability of evaluation functions might evolve. Refer to specific evaluator documentation or examples for the most current details.
Refer to Task Evaluation Configuration for more details on defining evaluation logic.