Reach-avoid Constraint

Overview

Reach-avoid constraint monitor.

A reach-avoid property requires eventually reaching a target set while never visiting an unsafe set. Using atomic propositions:

  • reach_label indicates a target condition,

  • avoid_label indicates an unsafe condition.

A typical reach-avoid specification can be described as:

\[(\neg\mathsf{avoid})\ \mathcal{U}\ \mathsf{reach}\]

i.e., “avoid is never true until reach becomes true” (informally).

This implementation tracks:

  • reached: whether reach_label has been observed at least once,

  • violated: whether avoid_label has been observed at least once,

  • satisfied: whether the property has been satisfied so far.

API Reference

class masa.common.constraints.reach_avoid.ReachAvoid(reach_label: str)[source]

Bases: Constraint

Reach target label set while avoiding unsafe label set.

At each step, given a label set labels:

  • reaching condition:

reach = (reach_label in labels) - avoiding condition: avoid_ok = (avoid_label not in labels)

State updates:

  • reached becomes true once reach is observed,

  • violated becomes true once avoid is violated,

  • satisfied becomes true once reached is true and violated is false.

Parameters:
  • avoid_label – Atomic proposition name indicating unsafe/avoid condition.

  • reach_label – Atomic proposition name indicating the target condition.

Variables:
  • avoid_label – Name of unsafe label.

  • reach_label – Name of target label.

  • reached – Whether target has been reached at least once.

  • violated – Whether unsafe has been observed at least once.

  • satisfied – Whether reach-avoid has been satisfied so far.

reset()[source]

Reset episode flags.

update(labels: Iterable[str])[source]

Update reach/avoid flags from the current label set.

Parameters:

labels – Iterable of atomic propositions for the current step.

episode_metric() Dict[str, float][source]

End-of-episode metrics.

Returns:

  • "reached": whether the target was ever reached,

  • "violated": whether unsafe was ever visited,

  • "satisfied": 1.0 if satisfied else 0.0.

Return type:

Dict containing

step_metric() Dict[str, float][source]

Per-step metrics.

Returns:

  • "cost": 1.0 if avoid is violated at this step else 0.0,

  • "violation": 1.0 if avoid violated else 0.0,

  • "reached": 1.0 if reach holds at this step else 0.0.

Return type:

Dict containing

property constraint_type: str

"reach_avoid".

Type:

Stable identifier string

class masa.common.constraints.reach_avoid.ReachAvoidEnv(env: gym.Env, avoid_label: str = 'unsafe', reach_label: str = 'target', **kw)[source]

Bases: BaseConstraintEnv

Gymnasium wrapper for the ReachAvoid monitor.

Parameters:
  • env – Base environment (must be a LabelledEnv).

  • avoid_label – Atomic proposition name for unsafe/avoid condition.

  • reach_label – Atomic proposition name for target condition.

  • **kw – Extra keyword arguments forwarded to BaseConstraintEnv.

Initialize the wrapper.

Parameters:
  • env – Base environment. Must already be wrapped as a LabelledEnv so that step/reset provide label sets in info["labels"].

  • constraint – A constraint monitor implementing Constraint.

  • **kw – Unused extra keyword arguments (kept for wrapper compatibility).

Raises:

TypeError – If env is not a LabelledEnv.