Bridge Crossing
MASA provides two bridge-style tabular gridworlds:
bridge_crossing
bridge_crossing_v2
Both environments share the same interface and reward-cost structure, but use different lava layouts.
Shared Gridworld Conventions
Both environments are single-agent tabular gridworlds with explicit stochastic transition models. They expose a full transition
matrix via get_transition_matrix().
They use the standard gridworld action convention:
0: move left
1: move right
2: move down
3: move up
4: stay in place
When slip is enabled, the intended action is taken with probability 1 - slip and the remaining probability mass is spread uniformly over
the other actions.
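The slip model described above can be sketched as follows. This is a minimal illustration rather than MASA's own code: the helper name `slip_distribution` is hypothetical, and it assumes the intended action keeps probability `1 - slip` while `slip` is divided evenly among the four other actions.

```python
from typing import List

N_ACTIONS = 5  # 0: left, 1: right, 2: down, 3: up, 4: stay


def slip_distribution(intended: int, slip: float, n_actions: int = N_ACTIONS) -> List[float]:
    """Return the probability of each action being executed (hypothetical helper).

    Assumption: the intended action keeps 1 - slip, and the slip mass is
    shared equally by the remaining n_actions - 1 actions.
    """
    if not 0 <= intended < n_actions:
        raise ValueError(f"intended action must be in [0, {n_actions})")
    if not 0.0 <= slip <= 1.0:
        raise ValueError("slip must be a probability")
    probs = [slip / (n_actions - 1)] * n_actions
    probs[intended] = 1.0 - slip
    return probs


# Example: the agent intends to move right (action 1) with slip 0.04.
probs = slip_distribution(intended=1, slip=0.04)
```

Under these assumptions, a slip probability of 0.04 gives the intended action 0.96 and each of the other four actions 0.01.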
Shared Mechanics
The two variants are 20 x 20 gridworlds with:
a fixed start state in the lower-left corner,
a goal region occupying the top seven rows,
a lava region in the middle of the map,
slip probability 0.04.
They both use:
observation space Discrete(400),
action space Discrete(5),
labels {"start"}, {"goal"}, and {"lava"},
cost 1.0 on "lava" and 0.0 otherwise.
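Since a 20 x 20 grid has 400 cells, each Discrete(400) observation can identify one cell. The sketch below shows one way to convert between a flat index and a (row, column) pair. It assumes row-major ordering, which is not confirmed here, and the helper names `to_index` and `to_cell` are hypothetical; check the environment source for the actual convention, including which corner maps to index 0.

```python
from typing import Tuple

GRID_SIZE = 20  # both variants are 20 x 20


def to_index(row: int, col: int, size: int = GRID_SIZE) -> int:
    """Flatten (row, col) into a single state index (assumed row-major)."""
    if not (0 <= row < size and 0 <= col < size):
        raise ValueError("cell is outside the grid")
    return row * size + col


def to_cell(index: int, size: int = GRID_SIZE) -> Tuple[int, int]:
    """Recover (row, col) from a flat state index (assumed row-major)."""
    if not 0 <= index < size * size:
        raise ValueError("index is outside the observation space")
    return divmod(index, size)


# Round trip for an arbitrary cell.
cell = to_cell(to_index(7, 3))
```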
Reward is sparse:
1.0 on goal states,
0.0 elsewhere.
Episodes terminate immediately when the agent reaches either a goal state or a lava state.
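The reward, cost, and termination rules above can be summarised in a few lines. This is a sketch under the stated rules only: the function name `step_signals` is hypothetical, and it takes the set of labels attached to the state the agent has just entered.

```python
from typing import AbstractSet, Tuple


def step_signals(labels: AbstractSet[str]) -> Tuple[float, float, bool]:
    """Return (reward, cost, terminated) for a state with the given labels."""
    reward = 1.0 if "goal" in labels else 0.0   # sparse reward on goal states
    cost = 1.0 if "lava" in labels else 0.0     # unit cost on lava states
    terminated = "goal" in labels or "lava" in labels
    return reward, cost, terminated


# Example: entering a goal cell, a lava cell, and an unlabelled cell.
on_goal = step_signals({"goal"})
on_lava = step_signals({"lava"})
on_plain = step_signals(set())
```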
bridge_crossing
This is the canonical narrow-bridge layout. The lava occupies the left eight columns and right nine columns of rows 8:12,
leaving a three-cell-wide bridge through the middle.
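To make the geometry concrete, the layout can be reconstructed as a set of lava cells. This sketch is illustrative and not taken from the MASA source: it assumes (row, column) indexing with column 0 at the left edge, and reads 8:12 as a Python-style half-open slice (rows 8 through 11). The function name `bridge_crossing_lava` is hypothetical.

```python
from typing import List, Set, Tuple

GRID_SIZE = 20
LAVA_ROWS = range(8, 12)  # assumption: half-open slice 8:12


def bridge_crossing_lava(size: int = GRID_SIZE) -> Set[Tuple[int, int]]:
    """Return the (row, col) cells covered by lava in the narrow-bridge layout."""
    lava = set()
    for row in LAVA_ROWS:
        for col in range(size):
            # Left eight columns and right nine columns are lava.
            if col < 8 or col >= size - 9:
                lava.add((row, col))
    return lava


lava = bridge_crossing_lava()
# Columns left free in a lava row form the bridge.
bridge_cols: List[int] = [c for c in range(GRID_SIZE) if (8, c) not in lava]
```

Under these assumptions the bridge spans columns 8, 9, and 10, which matches the three-cell width stated above.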
bridge_crossing_v2
This variant keeps the same interface and dynamics but changes the lava geometry. Here the lava fills most of rows 8:12 across
columns 2:16, plus one extra cell on the lower-left edge of the hazard.
It is useful when you want a second bridge-style benchmark without changing the overall problem class.