MASA-Safe-RL

On-Policy Algorithms

This section covers MASA’s on-policy actor-critic methods. A2C and PPO are the algorithms currently implemented and registered in this part of the codebase.

  • A2C
  • PPO
  • PPO Lagrangian
  • Constrained Policy Optimization
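
As a rough illustration of how the first two methods in this list differ, the sketch below computes a plain actor-critic (A2C-style) policy loss and PPO's clipped surrogate loss in pure Python. It is not MASA's implementation: the function names, the toy rewards, and the critic values are hypothetical and chosen only for illustration, and the advantage is a simple return-minus-value estimate.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Monte Carlo returns G_t = r_t + gamma * G_{t+1} for one finished episode."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    return returns[::-1]


def a2c_policy_loss(log_probs, advantages):
    """Actor-critic policy loss: -mean(log pi(a|s) * advantage)."""
    total = sum(lp * adv for lp, adv in zip(log_probs, advantages))
    return -total / len(advantages)


def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """PPO clipped surrogate loss: -mean(min(r * A, clip(r, 1-eps, 1+eps) * A))."""
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


# Toy episode (hypothetical numbers, not taken from any MASA environment).
rewards = [1.0, 0.0, 2.0]
values = [0.5, 0.4, 1.0]  # hypothetical critic estimates V(s_t)
returns = discounted_returns(rewards, gamma=0.9)
advantages = [g - v for g, v in zip(returns, values)]
```

When the new and old policies agree, the probability ratio is 1 and the PPO loss reduces to the negative mean advantage. The clipping only takes effect once the updated policy moves more than `clip_eps` away from the policy that collected the data, which is what lets PPO reuse a batch for several gradient steps where A2C takes one.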
Copyright © 2025, Alexander Goodall, Omar Adalat, Edwin Hamel De Le Court, Francesco Belardinelli