Reach-avoid Constraint¶
Overview¶
Reach-avoid constraint monitor.
A reach-avoid property requires eventually reaching a target set while never visiting an unsafe set. Using atomic propositions:
reach_labelindicates a target condition,avoid_labelindicates an unsafe condition.
A typical reach-avoid specification can be described as:
i.e., “avoid is never true until reach becomes true” (informally).
This implementation tracks:
reached: whetherreach_labelhas been observed at least once,violated: whetheravoid_labelhas been observed at least once,satisfied: whether the property has been satisfied so far.
API Reference¶
- class masa.common.constraints.reach_avoid.ReachAvoid(reach_label: str)[source]¶
Bases:
ConstraintReach target label set while avoiding unsafe label set.
At each step, given a label set
labels:reaching condition:
reach = (reach_label in labels)- avoiding condition:avoid_ok = (avoid_label not in labels)State updates:
reachedbecomes true once reach is observed,violatedbecomes true once avoid is violated,satisfiedbecomes true once reached is true and violated is false.
- Parameters:
avoid_label – Atomic proposition name indicating unsafe/avoid condition.
reach_label – Atomic proposition name indicating the target condition.
- Variables:
avoid_label – Name of unsafe label.
reach_label – Name of target label.
reached – Whether target has been reached at least once.
violated – Whether unsafe has been observed at least once.
satisfied – Whether reach-avoid has been satisfied so far.
- update(labels: Iterable[str])[source]¶
Update reach/avoid flags from the current label set.
- Parameters:
labels – Iterable of atomic propositions for the current step.
- episode_metric() Dict[str, float][source]¶
End-of-episode metrics.
- Returns:
"reached": whether the target was ever reached,"violated": whether unsafe was ever visited,"satisfied": 1.0 if satisfied else 0.0.
- Return type:
Dict containing
- step_metric() Dict[str, float][source]¶
Per-step metrics.
- Returns:
"cost": 1.0 if avoid is violated at this step else 0.0,"violation": 1.0 if avoid violated else 0.0,"reached": 1.0 if reach holds at this step else 0.0.
- Return type:
Dict containing
- property constraint_type: str¶
"reach_avoid".- Type:
Stable identifier string
- class masa.common.constraints.reach_avoid.ReachAvoidEnv(env: gym.Env, avoid_label: str = 'unsafe', reach_label: str = 'target', **kw)[source]¶
Bases:
BaseConstraintEnvGymnasium wrapper for the
ReachAvoidmonitor.- Parameters:
env – Base environment (must be a
LabelledEnv).avoid_label – Atomic proposition name for unsafe/avoid condition.
reach_label – Atomic proposition name for target condition.
**kw – Extra keyword arguments forwarded to
BaseConstraintEnv.
Initialize the wrapper.
- Parameters:
env – Base environment. Must already be wrapped as a
LabelledEnvso that step/reset provide label sets ininfo["labels"].constraint – A constraint monitor implementing
Constraint.**kw – Unused extra keyword arguments (kept for wrapper compatibility).
- Raises:
TypeError – If
envis not aLabelledEnv.