RECREG
Source: masa/algorithms/tabular/recreg.py
RECREG is the most intervention-oriented tabular algorithm currently in MASA. It combines task learning with an explicit backup policy and uses a safety estimate to decide when to override the task action.
Core Components
The implementation maintains:
a task Q-table, Q
a backup policy table, B
a safety estimate used to test the task action against a finite-horizon risk threshold
At action time, the algorithm first proposes an action from the task policy. If the estimated risk of that action exceeds the current finite-horizon risk threshold, the action is replaced by one from the backup policy.
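The gating step can be sketched as follows. This is a minimal illustration, not the code in recreg.py: the function name select_action, the risk_of callback, the epsilon-greedy task policy, and the convention that higher values in B are preferred are all assumptions made for the example.

```python
import random


def select_action(state, Q, B, risk_of, threshold, epsilon=0.0, rng=random):
    """Propose a task action and override it if its estimated risk is too high.

    Q and B are nested sequences indexed as table[state][action].
    risk_of(state, action) is a hypothetical callback returning the estimated
    finite-horizon probability of reaching an unsafe state.
    Returns (action, overridden).
    """
    n_actions = len(Q[state])

    # Task policy: epsilon-greedy on the task Q-table (assumed exploration scheme).
    if rng.random() < epsilon:
        proposed = rng.randrange(n_actions)
    else:
        proposed = max(range(n_actions), key=lambda a: Q[state][a])

    # Gate: keep the proposal only if its estimated risk is within the threshold.
    if risk_of(state, proposed) > threshold:
        # Assumption: the backup policy is greedy with respect to B.
        backup = max(range(n_actions), key=lambda a: B[state][a])
        return backup, True
    return proposed, False


# Usage: the task policy prefers action 1, which is too risky at threshold 0.1.
Q = [[0.0, 1.0]]
B = [[1.0, 0.0]]
risk = {(0, 0): 0.0, (0, 1): 0.9}
action, overridden = select_action(0, Q, B, lambda s, a: risk[(s, a)], threshold=0.1)
```

Returning the overridden flag alongside the action makes it straightforward to accumulate an override rate over an episode.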
Supported Modes
RECREG supports three modes for estimating safety:
exact: uses the environment’s exact transition model
model_based: estimates transitions online from counts and performs model checking on the learned model
model_free: learns a finite-horizon unsafe-probability table directly
In model-based mode, the implementation supports exact or statistical model checking.
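To make the model-based variant concrete, the sketch below estimates transition probabilities from visit counts and then computes a finite-horizon unsafe probability by exact backward induction. It is an illustration under stated assumptions rather than the MASA implementation: the function names, the dictionary layout of counts, the fixed follow-up policy, and the treatment of unvisited state-action pairs as contributing zero risk are all hypothetical choices.

```python
def estimate_transitions(counts):
    """Turn counts[(s, a)][s_next] -> n into P[(s, a)][s_next] -> probability."""
    P = {}
    for sa, successors in counts.items():
        total = sum(successors.values())
        if total > 0:
            P[sa] = {s2: n / total for s2, n in successors.items()}
    return P


def unsafe_probability(P, states, unsafe, policy, state, action, horizon):
    """Probability of reaching an unsafe state within `horizon` steps.

    The first step takes `action`; later steps follow `policy` (assumed fixed).
    Unvisited (state, action) pairs have no successors here and so add zero
    risk, which is an optimistic simplification made only for this sketch.
    """
    if state in unsafe:
        return 1.0

    # reach[s]: probability of hitting an unsafe state within k steps from s
    # under `policy`; k starts at 0 and grows by one per iteration.
    reach = {s: 1.0 if s in unsafe else 0.0 for s in states}
    for _ in range(horizon - 1):
        updated = {}
        for s in states:
            if s in unsafe:
                updated[s] = 1.0
            else:
                dist = P.get((s, policy[s]), {})
                updated[s] = sum(p * reach[s2] for s2, p in dist.items())
        reach = updated

    # First step uses the action under test.
    dist = P.get((state, action), {})
    return sum(p * reach[s2] for s2, p in dist.items())


# Usage on a three-state example where state 2 is unsafe.
counts = {
    (0, "stay"): {0: 4},
    (0, "go"): {1: 3, 2: 1},
    (1, "stay"): {1: 2, 2: 2},
}
P = estimate_transitions(counts)
policy = {0: "stay", 1: "stay", 2: "stay"}
risk_go = unsafe_probability(P, [0, 1, 2], {2}, policy, state=0, action="go", horizon=2)
```

A statistical alternative would replace the backward induction with sampled rollouts from the learned model and report the fraction that reach an unsafe state within the horizon.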
Safety-Relevant Behaviour
overridden risky actions are pushed toward a pessimistic target
in model_based and model_free, the backup policy is updated online using cost-aware targets
in exact, the backup policy is initialized from value iteration on the exact model
override rates are logged during rollout and evaluation
DFA-based counterfactual transitions are supported for LTL-safety product environments
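The first two behaviours above can be illustrated with a small tabular update. This is only a sketch of one plausible reading: the function name update_tables, the constant pessimistic_value, the choice to use negative cost as the backup target, and the decision to also apply a standard Q-learning update to the executed action are assumptions, not details confirmed by the source.

```python
def update_tables(Q, B, s, proposed, executed, reward, cost, s_next,
                  overridden, alpha=0.1, gamma=0.99, pessimistic_value=-1.0):
    """One online update of the task table Q and the backup table B.

    All hyperparameters and target definitions here are illustrative assumptions.
    """
    if overridden:
        # Push the rejected task action toward a pessimistic target so the
        # task policy becomes less likely to propose it again.
        Q[s][proposed] += alpha * (pessimistic_value - Q[s][proposed])

    # Standard Q-learning update for the action that was actually executed.
    task_target = reward + gamma * max(Q[s_next])
    Q[s][executed] += alpha * (task_target - Q[s][executed])

    # Cost-aware backup update: the backup table is trained on negative cost,
    # so greedy selection on B prefers low-cost actions.
    backup_target = -cost + gamma * max(B[s_next])
    B[s][executed] += alpha * (backup_target - B[s][executed])


# Usage: action 1 was proposed, judged too risky, and replaced by action 0.
Q = [[0.0, 0.0], [0.0, 0.0]]
B = [[0.0, 0.0], [0.0, 0.0]]
update_tables(Q, B, s=0, proposed=1, executed=0, reward=1.0, cost=1.0,
              s_next=1, overridden=True, alpha=0.5, gamma=0.9)
```

In exact mode the source states that the backup policy comes from value iteration on the exact model, so an online backup update like the one sketched here would apply only to the model_based and model_free modes.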
When To Use It
Use RECREG when:
you want explicit intervention rather than only cost shaping
you want a learned backup policy available at decision time
you want finite-horizon probabilistic safety checks to gate actions online