Advanced Usage

Custom spaces

Vectorized environments will batch actions and observations if they are elements from standard Gym spaces, such as Box, Discrete, or Dict. If you create your own environment with a custom action and/or observation space though (inheriting from gym.Space), the vectorized environment will not attempt to automatically batch the actions/observations, and instead it will return the raw tuple of elements from all sub-environments.

In the following example, we created a new environment SMILESEnv, whose observations are strings representing the SMILES notation of a molecular structure, with a custom observation space SMILES. The observations returned by the vectorized environment is a tuple of strings.

>>> class SMILES(gym.Space):
...     def __init__(self, symbols):
...         super().__init__()
...         self.symbols = symbols
...
...     def __eq__(self, other):
...         return self.symbols == other.symbols

>>> class SMILESEnv(gym.Env):
...     observation_space = SMILES("][()CO=")
...     action_space = gym.spaces.Discrete(7)
...
...     def reset(self):
...         self._state = "["
...         return self._state
...
...     def step(self, action):
...         self._state += self.observation_space.symbols[action]
...         reward = done = (action == 0)
...         return (self._state, float(reward), done, {})

>>> envs = gym.vector.AsyncVectorEnv(
...     [lambda: SMILESEnv()] * 3,
...     shared_memory=False
... )
>>> envs.reset()
>>> observations, rewards, dones, infos = envs.step(np.array([2, 5, 4]))
>>> observations
('[(', '[O', '[C')

Warning

Custom observation & action spaces may inherit from the gym.Space class. However, most use-cases should be covered by the existing space classes (e.g. Box, Discrete, etc…), and container classes (Tuple & Dict). Moreover, some implementations of Reinforcement Learning algorithms might not handle custom spaces properly. Use custom spaces with care.

Warning

If you use AsyncVectorEnv with a custom observation space, you must set shared_memory=False, since shared memory and automatic batching is not compatible with custom spaces. In general if you use custom spaces with AsyncVectorEnv, the elements of those spaces must be pickleable.