Melting Pot¶
DeepMind Melting Pot¶
Melting Pot is a suite of test scenarios for multi-agent reinforcement learning, using 2D game environments.
It assesses generalization to novel social situations involving both familiar and unfamiliar individuals, and tests a broad range of social interactions such as cooperation, competition, deception, reciprocation, trust, and stubbornness.
Shimmy provides compatibility wrappers to convert all Melting Pot environments to PettingZoo.
Installation¶
To install shimmy and required dependencies:
pip install shimmy[meltingpot]
We also provide a Dockerfile for reproducibility and cross-platform compatibility:
curl https://raw.githubusercontent.com/Farama-Foundation/Shimmy/main/bin/meltingpot.Dockerfile | docker build -t meltingpot -f - . && docker run -it meltingpot
Warning
Melting Pot does not currently support Windows operating systems.
Usage¶
Load a new meltingpot environment:
from shimmy import MeltingPotCompatibilityV0
env = MeltingPotCompatibilityV0(substrate_name="prisoners_dilemma_in_the_matrix__arena", render_mode="human")
Wrap an existing meltingpot environment:
from shimmy import MeltingPotCompatibilityV0
from shimmy.utils.meltingpot import load_meltingpot
env = load_meltingpot("prisoners_dilemma_in_the_matrix__arena")
env = MeltingPotCompatibilityV0(env, render_mode="human")
Note: Using the env and substrate_name arguments together will result in a ValueError.
Use the env argument to wrap an existing Melting Pot environment.
Use the substrate_name argument to load a new Melting Pot environment.
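The rule in the note can be sketched as a small check. This is an illustration only, not Shimmy's source: `resolve_source` is a hypothetical helper, and rejecting the case where neither argument is given is an assumption the note does not cover.

```python
from typing import Any, Optional


def resolve_source(env: Optional[Any] = None, substrate_name: Optional[str] = None) -> str:
    """Hypothetical helper: report whether the wrapper would wrap or load."""
    if env is not None and substrate_name is not None:
        # Documented behaviour: passing both arguments raises ValueError.
        raise ValueError("Pass either `env` or `substrate_name`, not both.")
    if env is None and substrate_name is None:
        # Assumption: the note does not say what happens when neither is given.
        raise ValueError("Pass one of `env` or `substrate_name`.")
    # `env` wraps an existing environment; `substrate_name` loads a new one.
    return "wrap" if env is not None else "load"


mode = resolve_source(substrate_name="prisoners_dilemma_in_the_matrix__arena")
```

Here `mode` is `"load"`; passing an existing environment object as `env` instead gives `"wrap"`.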
Run the environment:
observations, infos = env.reset()
while env.agents:
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()
Environments are loaded as ParallelEnv, but can be converted to AECEnv using PettingZoo Wrappers.
Warning
Melting Pot does not currently support environment seeding.
Class Description¶
- class shimmy.meltingpot_compatibility.MeltingPotCompatibilityV0(env: substrate.Substrate | None = None, substrate_name: str | None = None, max_cycles: int = MAX_CYCLES, render_mode: str | None = None)[source]¶
This compatibility wrapper converts a Melting Pot substrate into a PettingZoo environment.
Due to how the underlying environment is set up, this environment is nondeterministic, so seeding doesn’t work.
Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence. It assesses generalization to novel social situations involving both familiar and unfamiliar individuals, and has been designed to test a broad range of social interactions such as cooperation, competition, deception, reciprocation, trust, and stubbornness. Melting Pot offers researchers a set of over 50 multi-agent reinforcement learning substrates (multi-agent games) on which to train agents, and over 256 unique test scenarios on which to evaluate these trained agents.
Wrapper that converts a Melting Pot environment into a PettingZoo environment.
- Parameters:
env (Optional[substrate.Substrate]) – existing Melting Pot environment to wrap
substrate_name (Optional[str]) – name of Melting Pot substrate to load (instead of existing environment)
max_cycles (Optional[int]) – maximum number of cycles before truncation
render_mode (Optional[str]) – rendering mode
- metadata: dict[str, Any] = {'name': 'MeltingPotCompatibilityV0', 'render_modes': ['human', 'rgb_array']}¶
- PLAYER_STR_FORMAT = 'player_{index}'¶
- MAX_CYCLES = 1000¶
- observation_space(agent: AgentID) Space [source]¶
observation_space.
Get the observation space from the underlying Melting Pot substrate.
- Parameters:
agent (AgentID) – agent
- Returns:
observation_space – spaces.Space
- action_space(agent: AgentID) Space [source]¶
action_space.
Get the action space from the underlying Melting Pot substrate.
- Parameters:
agent (AgentID) – agent
- Returns:
action_space – spaces.Space
- state() ndarray [source]¶
State.
Get an observation of the current environment’s state. Used in rendering.
- Returns:
observation
- reset(seed: int | None = None, options: dict | None = None) tuple[ObsDict, dict[AgentID, Any]] [source]¶
reset.
Resets the environment.
- Parameters:
seed – the seed to reset the environment with (not used, due to nondeterministic underlying environment)
options – the options to reset the environment with
- Returns:
(observations, infos)
- step(actions: Dict[AgentID, ActionType]) tuple[Dict[AgentID, ObsType], dict[str, float], dict[str, bool], dict[str, bool], dict[str, dict]] [source]¶
step.
Steps the environment once, applying one action per agent.
- Parameters:
actions – actions to step through the environment with
- Returns:
(observations, rewards, terminations, truncations, infos)
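To show how PLAYER_STR_FORMAT, max_cycles, and the five-tuple returned by step fit together, here is a minimal sketch built on a stand-in class. `StandInParallelEnv` is hypothetical and is not the Shimmy wrapper: it copies only the reset/step contract described above, and the exact truncation condition (cycle count reaching max_cycles, then an empty agent list) is an assumption. The rewards it returns are placeholders.

```python
import random

PLAYER_STR_FORMAT = "player_{index}"  # value documented above
MAX_CYCLES = 1000  # value documented above


class StandInParallelEnv:
    """Hypothetical stand-in that mimics only the reset/step loop contract."""

    def __init__(self, num_players: int, max_cycles: int = MAX_CYCLES):
        self.possible_agents = [
            PLAYER_STR_FORMAT.format(index=i) for i in range(num_players)
        ]
        self.agents = []
        self.max_cycles = max_cycles
        self.num_cycles = 0

    def reset(self, seed=None, options=None):
        self.agents = list(self.possible_agents)
        self.num_cycles = 0
        observations = {agent: 0 for agent in self.agents}
        infos = {agent: {} for agent in self.agents}
        return observations, infos

    def step(self, actions):
        self.num_cycles += 1
        # Assumption: truncate once the cycle count reaches max_cycles.
        truncated = self.num_cycles >= self.max_cycles
        observations = {agent: self.num_cycles for agent in self.agents}
        rewards = {agent: float(actions[agent]) for agent in self.agents}
        terminations = {agent: False for agent in self.agents}
        truncations = {agent: truncated for agent in self.agents}
        infos = {agent: {} for agent in self.agents}
        if truncated:
            self.agents = []  # an empty agent list ends a `while env.agents` loop
        return observations, rewards, terminations, truncations, infos


env = StandInParallelEnv(num_players=2, max_cycles=5)
observations, infos = env.reset()
returns = {agent: 0.0 for agent in env.agents}
steps = 0
while env.agents:
    actions = {agent: random.choice([0, 1]) for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
    for agent, reward in rewards.items():
        returns[agent] += reward  # accumulate a per-agent episode return
    steps += 1
```

With max_cycles=5 the loop runs five steps, after which every entry in `truncations` is true and `env.agents` is empty.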
Utils¶
Utility functions for Melting Pot.
- shimmy.utils.meltingpot.load_meltingpot(substrate_name: str)[source]¶
Helper function to load Melting Pot substrates.
- Parameters:
substrate_name – str
- Returns:
env – meltingpot.utils.substrates.substrate.Substrate
- shimmy.utils.meltingpot.timestep_to_observations(timestep: TimeStep) Dict[AgentID, ObsType] [source]¶
Extracts Gymnasium-compatible observations from a Melting Pot timestep.
- Parameters:
timestep – The dm_env timestep
- Returns:
observations – dict mapping each agent ID to that agent's observation.
- shimmy.utils.meltingpot.remove_world_observations_from_space(observation: Dict) Dict [source]¶
Removes the world observations key from a Gymnasium observation dict.
This is used to limit the information an individual agent has access to (it cannot see the entire world).
- Parameters:
observation – The Melting Pot observation
- Returns:
observation – The Melting Pot observation, without world observations.
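The two observation helpers above can be illustrated with plain dictionaries. This is a sketch under stated assumptions rather than the library code: `strip_world_keys` and `per_agent_observations` are hypothetical names, the `"WORLD."` key prefix is assumed to mark world-level observations, and the raw per-player observations are assumed to arrive as a list ordered by player index.

```python
from typing import Any, Dict, List

PLAYER_STR_FORMAT = "player_{index}"  # value documented in the class description
WORLD_PREFIX = "WORLD."  # assumption: marker for world-level observation keys


def strip_world_keys(observation: Dict[str, Any]) -> Dict[str, Any]:
    """Drop world-level entries so an agent keeps only its own view."""
    return {
        key: value
        for key, value in observation.items()
        if not key.startswith(WORLD_PREFIX)
    }


def per_agent_observations(
    player_observations: List[Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Key each player's filtered observation by its agent ID."""
    return {
        PLAYER_STR_FORMAT.format(index=index): strip_world_keys(observation)
        for index, observation in enumerate(player_observations)
    }


# Hypothetical raw observations: strings stand in for image arrays.
raw = [
    {"RGB": "player 0 view", "WORLD.RGB": "full map"},
    {"RGB": "player 1 view", "WORLD.RGB": "full map"},
]
observations = per_agent_observations(raw)
```

Each agent ends up with only its own `"RGB"` entry, keyed by `player_0` and `player_1`.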