RECREG

Source: masa/algorithms/tabular/recreg.py

RECREG is the most intervention-oriented tabular algorithm currently in MASA. It combines task learning with an explicit backup policy and uses a safety estimate to decide when to override the task action.

Core Components

The implementation maintains:

  • a task Q-table Q

  • a backup policy table B

  • a safety estimate used to test the task action against a finite-horizon risk threshold

At action time, the algorithm first proposes an action from the task policy. If the estimated finite-horizon risk of that action exceeds the current threshold, the action is replaced by one from the backup policy.
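The gating step described above can be sketched as follows. This is a minimal illustration rather than the code in recreg.py: the function name select_action, the risk table layout risk[state][action], the epsilon-greedy proposal, and the assumption that B stores backup action values (higher is preferred) are all hypothetical.

```python
import random


def select_action(Q, B, risk, state, threshold, epsilon=0.0, rng=random):
    """Return (action, overridden) for one decision step.

    Q, B and risk are nested lists indexed as table[state][action].
    All names and layouts here are illustrative assumptions.
    """
    n_actions = len(Q[state])

    # 1. Propose an action from the (epsilon-greedy) task policy.
    if rng.random() < epsilon:
        proposed = rng.randrange(n_actions)
    else:
        proposed = max(range(n_actions), key=lambda a: Q[state][a])

    # 2. Gate the proposal on its estimated finite-horizon risk.
    if risk[state][proposed] > threshold:
        # 3. Override with the greedy backup action.
        backup = max(range(n_actions), key=lambda a: B[state][a])
        return backup, True

    return proposed, False
```

The returned flag makes it straightforward to count overrides per episode, which is one way the override rate could be tracked.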

Supported Modes

RECREG supports three modes for estimating safety:

  • exact: uses the environment’s exact transition model

  • model_based: estimates transitions online from counts and performs model checking on the learned model

  • model_free: learns a finite-horizon unsafe-probability table directly
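To make the model_free mode more concrete, here is a minimal sketch of how a finite-horizon unsafe-probability table could be learned from sampled transitions. The table layout U[h][s][a] (estimated probability of reaching an unsafe state within h steps), the function names, the learning rate lr, and the choice to bootstrap from the next action a_next are assumptions made for illustration; the actual update in recreg.py may differ.

```python
def make_unsafe_table(horizon, n_states, n_actions):
    """U[h][s][a], with U[0] fixed at 0 (no steps left, no risk)."""
    return [
        [[0.0] * n_actions for _ in range(n_states)]
        for _ in range(horizon + 1)
    ]


def update_unsafe_table(U, s, a, s_next, a_next, next_unsafe, lr=0.1):
    """One temporal-difference style update from a single transition.

    The target for horizon h is 1 if the next state is unsafe, and
    otherwise the (h - 1)-step estimate at the next state-action pair.
    """
    horizon = len(U) - 1
    # Iterate from the longest horizon down so that U[h - 1] is still
    # the pre-update value when it is used as a bootstrap target.
    for h in range(horizon, 0, -1):
        target = 1.0 if next_unsafe else U[h - 1][s_next][a_next]
        U[h][s][a] += lr * (target - U[h][s][a])
```

Under this layout, U[horizon][s][a] would be the quantity compared against the risk threshold at action time.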

In model_based mode, the implementation supports either exact or statistical model checking of the learned model.
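As a rough sketch of the exact-checking variant, the snippet below normalizes transition counts into an empirical model and then computes, by backward recursion, the probability of reaching an unsafe state within a fixed horizon. The function names, the counts[s][a][s_next] layout, the uniform fallback for unvisited pairs, and the assumption that the fixed policy followed after the first action is the backup policy are all illustrative, not taken from recreg.py. A statistical variant would instead estimate the same probability from sampled rollouts of the learned model.

```python
def estimate_transitions(counts):
    """Turn counts[s][a][s_next] into probabilities P[s][a][s_next]."""
    P = []
    for per_state in counts:
        rows = []
        for row in per_state:
            total = sum(row)
            if total == 0:
                # Assumption: unvisited pairs fall back to uniform.
                rows.append([1.0 / len(row)] * len(row))
            else:
                rows.append([c / total for c in row])
        P.append(rows)
    return P


def finite_horizon_risk(P, unsafe, policy, horizon):
    """risk[s][a]: probability of hitting `unsafe` within `horizon`
    steps when taking a in s and following `policy` afterwards."""
    n_states, n_actions = len(P), len(P[0])
    # v[s]: risk from s under `policy` with the current steps-to-go.
    v = [1.0 if s in unsafe else 0.0 for s in range(n_states)]
    risk = [[v[s]] * n_actions for s in range(n_states)]
    for _ in range(horizon):
        risk = [
            [
                1.0 if s in unsafe
                else sum(P[s][a][s2] * v[s2] for s2 in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        v = [risk[s][policy[s]] for s in range(n_states)]
    return risk
```

The same finite_horizon_risk routine would apply unchanged in exact mode if P were the environment's true transition model rather than an estimate.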

Safety-Relevant Behaviour

  • the task value estimates of overridden risky actions are pushed toward a pessimistic target

  • in model_based and model_free modes, the backup policy is updated online using cost-aware targets

  • in exact mode, the backup policy is initialized from value iteration on the exact model

  • override rates are logged during rollout and evaluation

  • DFA-based counterfactual transitions are supported for LTL-safety product environments
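Two of the behaviours listed above, the pessimistic update for overridden actions and the cost-aware backup update, can be illustrated with the following minimal sketch. Everything here is an assumption for illustration: the function names, the constant pessimistic_value, and the specific backup target (negative cost plus a discounted greedy bootstrap) are one plausible form, not the definitions used in recreg.py.

```python
def penalize_overridden(Q, s, a_proposed, pessimistic_value, lr=0.1):
    """Move the task value of an overridden action toward a
    pessimistic target so the task policy proposes it less often."""
    Q[s][a_proposed] += lr * (pessimistic_value - Q[s][a_proposed])


def update_backup(B, s, a, cost, s_next, done, gamma=0.99, lr=0.1):
    """Q-learning style update of the backup table with a
    cost-aware target (assumed form: -cost + gamma * max B[s_next])."""
    bootstrap = 0.0 if done else gamma * max(B[s_next])
    target = -cost + bootstrap
    B[s][a] += lr * (target - B[s][a])


def override_rate(n_overrides, n_steps):
    """Fraction of steps on which the task action was replaced."""
    return n_overrides / n_steps if n_steps else 0.0
```

Keeping the pessimistic update separate from the ordinary task update means the task policy learns to avoid gated actions without the backup action's outcome being credited to the action that was never executed.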

When To Use It

Use RECREG when:

  • you want explicit intervention rather than only cost shaping

  • you want a learned backup policy available at decision time

  • you want finite-horizon probabilistic safety checks to gate actions online