DM Control

DeepMind Control

DM Control is a framework for physics-based simulation and reinforcement learning environments using the MuJoCo physics engine.

Shimmy provides compatibility wrappers to convert Control Suite environments and custom Locomotion environments to Gymnasium.

DeepMind Locomotion

Installation

To install shimmy and required dependencies:

pip install shimmy[dm-control]

We also provide a Dockerfile for reproducibility and cross-platform compatibility:

curl https://raw.githubusercontent.com/Farama-Foundation/Shimmy/main/bin/dm_control.Dockerfile | docker build -t dm_control -f - . && docker run -it dm_control

Usage

Load a dm_control environment:

import gymnasium as gym

env = gym.make("dm_control/acrobot-swingup_sparse-v0", render_mode="human")

Run the environment:

observation, info = env.reset(seed=42)
for _ in range(1000):
   action = env.action_space.sample()  # this is where you would insert your policy
   observation, reward, terminated, truncated, info = env.step(action)

   if terminated or truncated:
      observation, info = env.reset()
env.close()

To get a list of all available dm_control environments (85 total):

from gymnasium.envs.registration import registry
DM_CONTROL_ENV_IDS = [
    env_id
    for env_id in registry
    if env_id.startswith("dm_control") and env_id != "dm_control/compatibility-env-v0"
]
print(DM_CONTROL_ENV_IDS)

Class Description

class shimmy.dm_control_compatibility.DmControlCompatibilityV0(env: composer.Environment | control.Environment | dm_env.Environment, render_mode: str | None = None, render_kwargs: dict[str, Any] | None = None)[source]

This compatibility wrapper converts a dm-control environment into a gymnasium environment.

Dm-control is DeepMind’s software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo physics.

Dm-control actually has two Environments classes, dm_control.composer.Environment and dm_control.rl.control.Environment that while both inherit from dm_env.Environment, they differ in implementation.

For environment in dm_control.suite are dm-control.rl.control.Environment while dm-control locomotion and manipulation environments use dm-control.composer.Environment.

This wrapper supports both Environment class through determining the base environment type.

Note

dm-control uses np.random.RandomState, a legacy random number generator while gymnasium uses np.random.Generator, therefore the return type of np_random is different from expected.

Initialises the environment with a render mode along with render information.

Note: this wrapper supports multi-camera rendering via the render_mode argument (render_mode = “multi_camera”)

For more information on DM Control rendering, see https://github.com/deepmind/dm_control/blob/main/dm_control/mujoco/engine.py#L178

Parameters:
  • env (Optional[composer.Environment | control.Environment | dm_env.Environment]) – DM Control env to wrap

  • render_mode (Optional[str]) – rendering mode (options: “human”, “rgb_array”, “depth_array”, “multi_camera”)

  • render_kwargs (Optional[dict[str, Any]]) – Additional keyword arguments for rendering. For the width, height and camera id use “width”, “height” and “camera_id” respectively. See the dm_control implementation for the list of possible kwargs, https://github.com/deepmind/dm_control/blob/330c91f41a21eacadcf8316f0a071327e3f5c017/dm_control/mujoco/engine.py#L178 Note: kwargs are not used for human rendering, which uses simpler Gymnasium MuJoCo rendering.

metadata: dict[str, Any] = {'render_fps': 10, 'render_modes': ['human', 'rgb_array', 'depth_array', 'multi_camera']}
property dt

Returns the environment control timestep which is equivalent to the number of actions per second.

reset(*, seed: int | None = None, options: dict[str, Any] | None = None) tuple[ObsType, dict[str, Any]][source]

Resets the dm-control environment.

step(action: ndarray) tuple[ObsType, float, bool, bool, dict[str, Any]][source]

Steps through the dm-control environment.

render() np.ndarray | None[source]

Renders the dm-control env.

close()[source]

Closes the environment.

property np_random: RandomState

This should be np.random.Generator but dm-control uses np.random.RandomState.