Melting Pot

DeepMind Melting Pot

Melting Pot is a suite of test scenarios for multi-agent reinforcement learning, using 2D game environments.

It assesses generalization to novel social situations involving both familiar and unfamiliar individuals, and tests social behaviors such as cooperation, competition, deception, reciprocation, trust, and stubbornness.

Shimmy provides compatibility wrappers to convert all Melting Pot environments to PettingZoo.

Installation

To install Shimmy and the required dependencies:

pip install shimmy[meltingpot]

We also provide a Dockerfile for reproducibility and cross-platform compatibility (see Installation):

curl https://raw.githubusercontent.com/Farama-Foundation/Shimmy/main/bin/meltingpot.Dockerfile | docker build -t meltingpot -f - . && docker run -it meltingpot

Warning

Melting Pot does not currently support Windows operating systems.

Usage

Load a new Melting Pot environment:

from shimmy import MeltingPotCompatibilityV0

env = MeltingPotCompatibilityV0(substrate_name="prisoners_dilemma_in_the_matrix__arena", render_mode="human")

Wrap an existing Melting Pot environment:

from shimmy import MeltingPotCompatibilityV0
from shimmy.utils.meltingpot import load_meltingpot

env = load_meltingpot("prisoners_dilemma_in_the_matrix__arena")
env = MeltingPotCompatibilityV0(env, render_mode="human")

Note: Using the env and substrate_name arguments together will result in a ValueError.

  • Use the env argument to wrap an existing Melting Pot environment.

  • Use the substrate_name argument to load a new Melting Pot environment.
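The rule above can be pictured with a small stand-alone sketch. The helper `resolve_source` is hypothetical and is not part of Shimmy; it only mirrors the documented behavior that combining both arguments raises a ValueError. This page does not say what happens when neither argument is given, so treating that case as an error is an assumption.

```python
from typing import Any, Optional, Tuple


def resolve_source(env: Optional[Any] = None,
                   substrate_name: Optional[str] = None) -> Tuple[str, Any]:
    """Hypothetical helper: decide which of the two arguments to use."""
    if env is not None and substrate_name is not None:
        # Documented behavior: the two arguments cannot be combined.
        raise ValueError("Pass either env or substrate_name, not both.")
    if env is None and substrate_name is None:
        # Assumption: this page does not describe this case.
        raise ValueError("Pass one of env or substrate_name.")
    if env is not None:
        return ("wrap", env)  # wrap an existing Melting Pot environment
    return ("load", substrate_name)  # load a new one by name
```

Calling `resolve_source(substrate_name="prisoners_dilemma_in_the_matrix__arena")` returns the "load" branch, while passing both arguments raises.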

Run the environment:

observations, infos = env.reset()
while env.agents:
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
env.close()
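The same loop can be packaged as a function that also totals each agent's reward, which is often the first thing needed when evaluating agents. This is a minimal sketch: `run_episode` and `random_policy` are illustrative names, not Shimmy functions, and the code relies only on the `reset`, `step`, `agents`, `action_space`, and `close` members documented on this page.

```python
from collections import defaultdict
from typing import Any, Callable, Dict


def run_episode(env: Any, policy: Callable[[Any, str], Any]) -> Dict[str, float]:
    """Play one episode on a ParallelEnv-style object and sum rewards per agent."""
    observations, infos = env.reset()
    returns: Dict[str, float] = defaultdict(float)
    while env.agents:
        # One action per live agent, chosen by the supplied policy.
        actions = {agent: policy(env, agent) for agent in env.agents}
        observations, rewards, terminations, truncations, infos = env.step(actions)
        for agent, reward in rewards.items():
            returns[agent] += float(reward)
    env.close()
    return dict(returns)


def random_policy(env: Any, agent: str) -> Any:
    """Sample uniformly from the agent's action space."""
    return env.action_space(agent).sample()
```

With an environment created as shown earlier, `run_episode(env, random_policy)` would return a dict keyed by agent ID.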

Environments are loaded as ParallelEnv, but can be converted to AECEnv using PettingZoo Wrappers.
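A minimal sketch of that conversion follows, assuming PettingZoo's `parallel_to_aec` converter is the wrapper meant here. The helper names `to_aec` and `run_aec_episode` are illustrative and not part of Shimmy or PettingZoo.

```python
from typing import Any


def to_aec(parallel_env: Any) -> Any:
    """Convert a ParallelEnv into an AECEnv using PettingZoo's converter."""
    # Imported lazily so the sketch can be read without PettingZoo installed.
    from pettingzoo.utils.conversions import parallel_to_aec
    return parallel_to_aec(parallel_env)


def run_aec_episode(aec_env: Any) -> None:
    """Step an AECEnv one agent at a time with random actions."""
    aec_env.reset()
    for agent in aec_env.agent_iter():
        observation, reward, termination, truncation, info = aec_env.last()
        # In the AEC API, agents that are done must be stepped with None.
        if termination or truncation:
            action = None
        else:
            action = aec_env.action_space(agent).sample()
        aec_env.step(action)
    aec_env.close()
```

In the AEC form agents act in turn rather than simultaneously, so the loop iterates over `agent_iter()` instead of building a dict of actions.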

Warning

Melting Pot does not currently support environment seeding.

Class Description

class shimmy.meltingpot_compatibility.MeltingPotCompatibilityV0(env: substrate.Substrate | None = None, substrate_name: str | None = None, max_cycles: int = MAX_CYCLES, render_mode: str | None = None)[source]

This compatibility wrapper converts a Melting Pot substrate into a PettingZoo environment.

Due to how the underlying environment is set up, this environment is nondeterministic, so seeding doesn’t work.

Melting Pot is a research tool developed to facilitate work on multi-agent artificial intelligence. It assesses generalization to novel social situations involving both familiar and unfamiliar individuals, and has been designed to test a broad range of social interactions such as: cooperation, competition, deception, reciprocation, trust, stubbornness and so on. Melting Pot offers researchers a set of over 50 multi-agent reinforcement learning substrates (multi-agent games) on which to train agents, and over 256 unique test scenarios on which to evaluate these trained agents.

Parameters:
  • env (Optional[substrate.Substrate]) – existing Melting Pot environment to wrap

  • substrate_name (Optional[str]) – name of Melting Pot substrate to load (instead of existing environment)

  • max_cycles (int) – maximum number of cycles before truncation (defaults to MAX_CYCLES)

  • render_mode (Optional[str]) – rendering mode
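To make the role of max_cycles concrete, here is a minimal sketch of how a cycle counter could turn into per-agent truncation flags. `CycleLimiter` is hypothetical, and the `>=` comparison is an assumption rather than something taken from Shimmy's source.

```python
from typing import Dict, List

MAX_CYCLES = 1000  # default value listed on this page


class CycleLimiter:
    """Hypothetical counter that flags truncation once max_cycles is reached."""

    def __init__(self, max_cycles: int = MAX_CYCLES) -> None:
        self.max_cycles = max_cycles
        self.num_cycles = 0

    def tick(self, agents: List[str]) -> Dict[str, bool]:
        """Count one environment step and return a truncation flag per agent."""
        self.num_cycles += 1
        truncated = self.num_cycles >= self.max_cycles  # assumed comparison
        return {agent: truncated for agent in agents}
```

Under this assumption, an episode that never terminates on its own ends after max_cycles calls to step, with every agent truncated on the same step.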

metadata: dict[str, Any] = {'name': 'MeltingPotCompatibilityV0', 'render_modes': ['human', 'rgb_array']}
PLAYER_STR_FORMAT = 'player_{index}'
MAX_CYCLES = 1000
observation_space(agent: AgentID) Space[source]

Get the observation space from the underlying Melting Pot substrate.

Parameters:

agent (AgentID) – agent

Returns:

observation_space – spaces.Space

action_space(agent: AgentID) Space[source]

Get the action space from the underlying Melting Pot substrate.

Parameters:

agent (AgentID) – agent

Returns:

action_space – spaces.Space

state() ndarray[source]

Get an observation of the current environment’s state. Used in rendering.

Returns:

observation

reset(seed: int | None = None, options: dict | None = None) tuple[ObsDict, dict[AgentID, Any]][source]

Resets the environment.

Parameters:
  • seed – the seed to reset the environment with (not used, due to nondeterministic underlying environment)

  • options – the options to reset the environment with

Returns:

(observations, infos)

step(actions: Dict[AgentID, ActionType]) tuple[Dict[AgentID, ObsType], dict[str, float], dict[str, bool], dict[str, bool], dict[str, dict]][source]

Steps all agents at once, applying one action per agent.

Parameters:

actions – actions to step through the environment with

Returns:

(observations, rewards, terminations, truncations, infos)

close()[source]

Closes the environment.

render() None | np.ndarray[source]

Renders the environment.

Returns:

The rendering of the environment, depending on the render mode

Utils

Utility functions for Melting Pot.

shimmy.utils.meltingpot.load_meltingpot(substrate_name: str)[source]

Helper function to load Melting Pot substrates.

Parameters:

substrate_name (str) – name of the Melting Pot substrate to load

Returns:

env – meltingpot.utils.substrates.substrate.Substrate

shimmy.utils.meltingpot.timestep_to_observations(timestep: TimeStep) Dict[AgentID, ObsType][source]

Extracts Gymnasium-compatible observations from a Melting Pot timestep.

Parameters:

timestep – The dm_env timestep

Returns:

observations – dict mapping each agent ID to its observation

shimmy.utils.meltingpot.remove_world_observations_from_space(observation: Dict) Dict[source]

Removes the world observations key from a Gymnasium observation dict.

This is used to limit the information an individual agent has access to (it cannot see the entire world).

Parameters:

observation – The Melting Pot observation

Returns:

observation – The Melting Pot observation, without world observations.
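The filtering idea can be illustrated on a plain dict. This is a sketch, not Shimmy's implementation: the `"WORLD."` prefix, the helper name `drop_world_keys`, and the example keys are all assumptions chosen for illustration.

```python
from typing import Any, Dict

WORLD_PREFIX = "WORLD."  # assumed prefix for global (whole-world) observation keys


def drop_world_keys(observation: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the dict without keys that carry the world prefix."""
    return {key: value for key, value in observation.items()
            if not key.startswith(WORLD_PREFIX)}


# Illustrative observation: two per-agent entries and one global entry.
example = {"RGB": "agent view", "COLLECTIVE_REWARD": 0.0, "WORLD.RGB": "full map"}
filtered = drop_world_keys(example)
```

Only the per-agent entries remain in `filtered`, so an agent built on it cannot read the full map.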