DM Control#

DeepMind Control#

DM Control is a framework for physics-based simulation and reinforcement learning environments using the MuJoCo physics engine.

Shimmy provides compatibility wrappers to convert Control Suite environments and custom Locomotion environments to Gymnasium.

DeepMind Locomotion

Installation#

To install shimmy and required dependencies:

pip install shimmy[dm-control]

We also provide a Dockerfile for reproducibility and cross-platform compatibility:

curl https://raw.githubusercontent.com/Farama-Foundation/Shimmy/main/bin/dm_control.Dockerfile | docker build -t dm_control -f - . && docker run -it dm_control

Usage#

Load a dm_control environment:

import gymnasium as gym

env = gym.make("dm_control/acrobot-swingup_sparse-v0", render_mode="human")

Run the environment:

observation, info = env.reset(seed=42)
for _ in range(1000):
   action = env.action_space.sample()  # this is where you would insert your policy
   observation, reward, terminated, truncated, info = env.step(action)

   if terminated or truncated:
      observation, info = env.reset()
env.close()

To get a list of all available dm_control environments (85 total):

from gymnasium.envs.registration import registry
DM_CONTROL_ENV_IDS = [
    env_id
    for env_id in registry
    if env_id.startswith("dm_control") and env_id != "dm_control/compatibility-env-v0"
]
print(DM_CONTROL_ENV_IDS)

Class Description#

class shimmy.dm_control_compatibility.DmControlCompatibilityV0(env: composer.Environment | control.Environment | dm_env.Environment, render_mode: str | None = None, render_kwargs: dict[str, Any] | None = None)[source]#

This compatibility wrapper converts a dm-control environment into a gymnasium environment.

Dm-control is DeepMind’s software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo physics.

Dm-control actually has two Environments classes, dm_control.composer.Environment and dm_control.rl.control.Environment that while both inherit from dm_env.Environment, they differ in implementation.

For environment in dm_control.suite are dm-control.rl.control.Environment while dm-control locomotion and manipulation environments use dm-control.composer.Environment.

This wrapper supports both Environment class through determining the base environment type.

Note

dm-control uses np.random.RandomState, a legacy random number generator while gymnasium uses np.random.Generator, therefore the return type of np_random is different from expected.

Initialises the environment with a render mode along with render information.

Note: this wrapper supports multi-camera rendering via the render_mode argument (render_mode = “multi_camera”)

For more information on DM Control rendering, see https://github.com/deepmind/dm_control/blob/main/dm_control/mujoco/engine.py#L178

Parameters:
  • env (Optional[composer.Environment | control.Environment | dm_env.Environment]) – DM Control env to wrap

  • render_mode (Optional[str]) – rendering mode (options: “human”, “rgb_array”, “depth_array”, “multi_camera”)

  • render_kwargs (Optional[dict[str, Any]]) – Additional keyword arguments for rendering. For the width, height and camera id use “width”, “height” and “camera_id” respectively. See the dm_control implementation for the list of possible kwargs, https://github.com/deepmind/dm_control/blob/330c91f41a21eacadcf8316f0a071327e3f5c017/dm_control/mujoco/engine.py#L178 Note: kwargs are not used for human rendering, which uses simpler Gymnasium MuJoCo rendering.

metadata: dict[str, Any] = {'render_fps': 10, 'render_modes': ['human', 'rgb_array', 'depth_array', 'multi_camera']}#
observation_space: spaces.Space[ObsType]#
action_space: spaces.Space[ActType]#
property dt#

Returns the environment control timestep which is equivalent to the number of actions per second.

reset(*, seed: int | None = None, options: dict[str, Any] | None = None) tuple[ObsType, dict[str, Any]][source]#

Resets the dm-control environment.

step(action: ndarray) tuple[ObsType, float, bool, bool, dict[str, Any]][source]#

Steps through the dm-control environment.

render() np.ndarray | None[source]#

Renders the dm-control env.

close()[source]#

Closes the environment.

property np_random: RandomState#

This should be np.random.Generator but dm-control uses np.random.RandomState.