DM Control#

DeepMind Control#

DM Control is a framework for physics-based simulation and reinforcement learning environments using the MuJoCo physics engine.

Shimmy provides compatibility wrappers to convert Control Suite environments and custom Locomotion environments to Gymnasium.

DeepMind Locomotion


To install shimmy and required dependencies:

pip install shimmy[dm-control]

We also provide a Dockerfile for reproducibility and cross-platform compatibility:

curl | docker build -t dm_control -f - . && docker run -it dm_control


Load a dm_control environment:

import gymnasium as gym

env = gym.make("dm_control/acrobot-swingup_sparse-v0", render_mode="human")

Run the environment:

observation, info = env.reset(seed=42)
for _ in range(1000):
   action = env.action_space.sample()  # this is where you would insert your policy
   observation, reward, terminated, truncated, info = env.step(action)

   if terminated or truncated:
      observation, info = env.reset()

To get a list of all available dm_control environments (85 total):

from gymnasium.envs.registration import registry
    for env_id in registry
    if env_id.startswith("dm_control") and env_id != "dm_control/compatibility-env-v0"

Class Description#

class shimmy.dm_control_compatibility.DmControlCompatibilityV0(env: composer.Environment | control.Environment | dm_env.Environment, render_mode: str | None = None, render_kwargs: dict[str, Any] | None = None)[source]#

This compatibility wrapper converts a dm-control environment into a gymnasium environment.

Dm-control is DeepMind’s software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo physics.

Dm-control actually has two Environments classes, dm_control.composer.Environment and dm_control.rl.control.Environment that while both inherit from dm_env.Environment, they differ in implementation.

For environment in dm_control.suite are dm-control.rl.control.Environment while dm-control locomotion and manipulation environments use dm-control.composer.Environment.

This wrapper supports both Environment class through determining the base environment type.


dm-control uses np.random.RandomState, a legacy random number generator while gymnasium uses np.random.Generator, therefore the return type of np_random is different from expected.

Initialises the environment with a render mode along with render information.

Note: this wrapper supports multi-camera rendering via the render_mode argument (render_mode = “multi_camera”)

For more information on DM Control rendering, see

  • env (Optional[composer.Environment | control.Environment | dm_env.Environment]) – DM Control env to wrap

  • render_mode (Optional[str]) – rendering mode (options: “human”, “rgb_array”, “depth_array”, “multi_camera”)

  • render_kwargs (Optional[dict[str, Any]]) – Additional keyword arguments for rendering. For the width, height and camera id use “width”, “height” and “camera_id” respectively. See the dm_control implementation for the list of possible kwargs, Note: kwargs are not used for human rendering, which uses simpler Gymnasium MuJoCo rendering.

metadata: dict[str, Any] = {'render_fps': 10, 'render_modes': ['human', 'rgb_array', 'depth_array', 'multi_camera']}#
observation_space: spaces.Space[ObsType]#
action_space: spaces.Space[ActType]#
property dt#

Returns the environment control timestep which is equivalent to the number of actions per second.

reset(*, seed: int | None = None, options: dict[str, Any] | None = None) tuple[ObsType, dict[str, Any]][source]#

Resets the dm-control environment.

step(action: ndarray) tuple[ObsType, float, bool, bool, dict[str, Any]][source]#

Steps through the dm-control environment.

render() np.ndarray | None[source]#

Renders the dm-control env.


Closes the environment.

property np_random: RandomState#

This should be np.random.Generator but dm-control uses np.random.RandomState.