Constrained Markov Game (CMG)
Overview
Constraint monitors for labelled PettingZoo parallel environments.
This module provides a constrained Markov game monitor based on cumulative
cost budgets. Each agent receives its own label set through
infos[agent]["labels"] and incurs a step cost via cost_fn(labels).
Budgets are then evaluated over subsets of agents: for each budget \(k\), the step costs of the agents in its subset \(\mathcal{G}_k\) are accumulated and compared against the budget amount. Budgets may overlap, so the same agent cost can contribute to multiple budget totals.
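The budget condition described above can be written out as a formula. This is a sketch under assumptions: the symbols \(c_t^{i}\) (step cost of agent \(i\) at step \(t\), that is, cost_fn applied to that agent's labels \(L_t^{i}\)), \(d_k\) (the amount of budget \(k\)), and the undiscounted sum over an episode of length \(T\) are notation introduced here for illustration, not taken from the module.

```latex
\[
  c_t^{i} = \texttt{cost\_fn}\bigl(L_t^{i}\bigr),
  \qquad
  \sum_{t=0}^{T} \sum_{i \in \mathcal{G}_k} c_t^{i} \;\le\; d_k
  \quad \text{for every budget } k .
\]
```

Under this reading, an agent that belongs to two subsets \(\mathcal{G}_k\) and \(\mathcal{G}_{k'}\) adds the same \(c_t^{i}\) to both sums.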
- class masa.common.constraints.multi_agent.cmg.Budget(amount: float, agents: tuple[str, ...], name: str | None = None)
Bases: object
Shared cumulative-cost budget over a subset of agents.
- Parameters:
amount – Maximum allowed cumulative cost for this budget.
agents – Subset of agents from env.possible_agents covered by the budget.
name – Optional metric prefix. If omitted, a generated name is used.
Notes
Agent memberships are deduplicated while preserving order. Budgets may overlap, so a single agent may contribute to more than one budget.
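The deduplication and overlap behaviour in the note above can be illustrated with a small stand-in. This is a minimal sketch, not the library class: SketchBudget is a hypothetical name that only mirrors the documented constructor arguments, and the use of dict.fromkeys is one common way to deduplicate while preserving order, assumed here rather than confirmed from the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchBudget:
    """Hypothetical stand-in for Budget; not the library implementation."""

    amount: float
    agents: tuple[str, ...]
    name: Optional[str] = None

    def __post_init__(self) -> None:
        # dict.fromkeys keeps the first occurrence of each key in insertion
        # order, so this deduplicates while preserving order.
        deduped = tuple(dict.fromkeys(self.agents))
        # The dataclass is frozen, so bypass the normal attribute assignment.
        object.__setattr__(self, "agents", deduped)


team = SketchBudget(amount=5.0, agents=("a0", "a1", "a0"), name="team")
solo = SketchBudget(amount=2.0, agents=("a0",))

# Both budgets list "a0", so its cost would count toward both totals.
shared = set(team.agents) & set(solo.agents)
```

Here team.agents ends up as ("a0", "a1"), and shared contains the single agent whose cost contributes to both budgets.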
- class masa.common.constraints.multi_agent.cmg.ConstrainedMarkovGame(possible_agents: Sequence[str], budgets: Sequence[Budget], cost_fn: Callable[[Iterable[str]], float] = dummy_cost_fn)
Bases: object
Cumulative-cost monitor for a labelled parallel PettingZoo environment.
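To make the idea of a cumulative-cost monitor concrete, the sketch below accumulates per-agent costs from label sets and sums them per budget. It is an illustration under assumptions, not the ConstrainedMarkovGame implementation: SketchMonitor, its update, totals and satisfied methods, the unsafe_cost function, and the "unsafe" label are all hypothetical, and budgets are passed as a plain mapping to keep the block self-contained.

```python
from typing import Iterable


def unsafe_cost(labels: Iterable[str]) -> float:
    """Hypothetical cost_fn: cost 1.0 whenever the 'unsafe' label is present."""
    return 1.0 if "unsafe" in set(labels) else 0.0


class SketchMonitor:
    """Illustrative cumulative-cost monitor; not the library implementation."""

    def __init__(self, possible_agents, budgets, cost_fn):
        # budgets: mapping of budget name -> (amount, tuple of agent names)
        self.budgets = dict(budgets)
        self.cost_fn = cost_fn
        self.agent_cost = {agent: 0.0 for agent in possible_agents}

    def update(self, infos):
        # Each agent's step cost comes from its own label set.
        for agent, info in infos.items():
            self.agent_cost[agent] += self.cost_fn(info.get("labels", ()))

    def totals(self):
        # An agent's cost is added to every budget that lists it.
        return {
            name: sum(self.agent_cost[a] for a in agents)
            for name, (_, agents) in self.budgets.items()
        }

    def satisfied(self):
        # A budget holds while its total stays at or below its amount.
        totals = self.totals()
        return {
            name: totals[name] <= amount
            for name, (amount, _) in self.budgets.items()
        }


monitor = SketchMonitor(
    possible_agents=["a0", "a1"],
    budgets={"team": (2.0, ("a0", "a1")), "solo": (0.0, ("a0",))},
    cost_fn=unsafe_cost,
)
monitor.update({"a0": {"labels": {"unsafe"}}, "a1": {"labels": set()}})
monitor.update({"a0": {"labels": set()}, "a1": {"labels": {"unsafe"}}})
```

After the two updates each agent has incurred one unit of cost, so the overlapping "team" budget sits exactly at its limit while the "solo" budget on a0 alone is exceeded.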
- class masa.common.constraints.multi_agent.cmg.ConstrainedMarkovGameEnv(env: ParallelEnv, budgets: Sequence[Budget], cost_fn: Callable[[Iterable[str]], float] = dummy_cost_fn, **kw: Any)
Bases: ParallelEnv
PettingZoo parallel wrapper that updates a ConstrainedMarkovGame.
- reset(seed: int | None = None, options: dict[str, Any] | None = None)
Reset the wrapped env and seed the constraint from initial agent labels.
- state()
Returns the state.
The state is a global view of the environment, appropriate for centralized-training, decentralized-execution methods such as QMIX.
- render()
Displays a rendered frame from the environment, if supported.
Alternate render modes in the default environments are ‘rgb_array’, which returns a NumPy array and is supported by all environments outside of classic, and ‘ansi’, which returns the strings printed (specific to classic environments).
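The reset behaviour of the wrapper described above (seeding the constraint from the initial agent labels) can be sketched without importing PettingZoo. Everything here is an assumption for illustration: StubParallelEnv and SketchCostWrapper are hypothetical names, the stub only imitates the return shapes of the parallel API in recent PettingZoo releases (reset returns observations and infos, step returns a five-tuple ending in infos), and "seeding" is read as counting the cost of the labels returned by reset.

```python
class StubParallelEnv:
    """Tiny stand-in that imitates the parallel reset/step return shapes."""

    possible_agents = ["a0", "a1"]

    def reset(self, seed=None, options=None):
        self.agents = list(self.possible_agents)
        observations = {agent: 0 for agent in self.agents}
        infos = {"a0": {"labels": {"unsafe"}}, "a1": {"labels": set()}}
        return observations, infos

    def step(self, actions):
        observations = {agent: 0 for agent in self.agents}
        rewards = {agent: 0.0 for agent in self.agents}
        terminations = {agent: False for agent in self.agents}
        truncations = {agent: False for agent in self.agents}
        infos = {agent: {"labels": {"unsafe"}} for agent in self.agents}
        return observations, rewards, terminations, truncations, infos


class SketchCostWrapper:
    """Illustrative wrapper; not ConstrainedMarkovGameEnv itself."""

    def __init__(self, env, cost_fn):
        self.env = env
        self.cost_fn = cost_fn
        self.cumulative = {}

    def _accumulate(self, infos):
        for agent, info in infos.items():
            self.cumulative[agent] += self.cost_fn(info.get("labels", ()))

    def reset(self, seed=None, options=None):
        observations, infos = self.env.reset(seed=seed, options=options)
        # Start a fresh episode, then count the labels of the initial state.
        self.cumulative = {agent: 0.0 for agent in self.env.possible_agents}
        self._accumulate(infos)
        return observations, infos

    def step(self, actions):
        result = self.env.step(actions)
        self._accumulate(result[-1])  # infos is the last element of the tuple
        return result


wrapped = SketchCostWrapper(
    StubParallelEnv(), lambda labels: float("unsafe" in labels)
)
wrapped.reset(seed=0)
after_reset = dict(wrapped.cumulative)
wrapped.step({"a0": 0, "a1": 0})
after_step = dict(wrapped.cumulative)
```

In this sketch a0 already carries one unit of cost immediately after reset, because its initial label set contains the hypothetical "unsafe" label; a wrapper that ignored the initial labels would under-count that first state.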