Vector
- gym.vector.make(id: str, num_envs: int = 1, asynchronous: bool = True, wrappers: Optional[Union[callable, List[callable]]] = None, disable_env_checker: Optional[bool] = None, **kwargs) -> VectorEnv
Create a vectorized environment made of multiple copies of the environment registered under the given ID.
Example:
>>> import gym
>>> env = gym.vector.make('CartPole-v1', num_envs=3)
>>> env.reset()
(array([[-0.04456399,  0.04653909,  0.01326909, -0.02099827],
       [ 0.03073904,  0.00145001, -0.03088818, -0.03131252],
       [ 0.03468829,  0.01500225,  0.01230312,  0.01825218]],
      dtype=float32), {})
- Parameters:
id – The environment ID. This must be a valid ID from the registry.
num_envs – Number of copies of the environment.
asynchronous – If True, wraps the environments in an AsyncVectorEnv (which uses multiprocessing to run the environments in parallel). If False, wraps the environments in a SyncVectorEnv.
wrappers – If not None, apply the wrappers to each internal environment during creation.
disable_env_checker – Whether to disable the environment checker, which otherwise runs on the first environment only. None defaults to the disable_env_checker value in the environment spec (False by default); otherwise this argument decides (True = do not run, False = run).
**kwargs – Keyword arguments passed to gym.make.
- Returns:
The vectorized environment.
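The parameters above can be pictured with a small plain-Python sketch of the creation step: build num_envs independent copies and apply the wrappers to each copy as it is created. This is a toy model under stated assumptions, not Gym's implementation; `ToyEnv`, `OffsetObservation`, and `make_toy_vector` are hypothetical names introduced only for illustration.

```python
from typing import Callable, List, Optional, Sequence, Union


class ToyEnv:
    """Hypothetical stand-in for a single environment (not a Gym class)."""

    def __init__(self, start: int = 0):
        self.start = start
        self.state = start

    def reset(self) -> int:
        self.state = self.start
        return self.state


class OffsetObservation:
    """Hypothetical wrapper that shifts every observation by a fixed offset."""

    def __init__(self, env, offset: int = 100):
        self.env = env
        self.offset = offset

    def reset(self) -> int:
        return self.env.reset() + self.offset


Wrapper = Callable[[object], object]


def make_toy_vector(
    num_envs: int = 1,
    wrappers: Optional[Union[Wrapper, Sequence[Wrapper]]] = None,
    **kwargs,
) -> List[object]:
    """Build num_envs independent copies, wrapping each one during creation."""
    if wrappers is None:
        wrapper_list = []
    elif callable(wrappers):
        wrapper_list = [wrappers]  # a single wrapper is also accepted
    else:
        wrapper_list = list(wrappers)

    envs = []
    for _ in range(num_envs):
        env = ToyEnv(**kwargs)  # kwargs go to the single-environment constructor
        for wrap in wrapper_list:
            env = wrap(env)
        envs.append(env)
    return envs


envs = make_toy_vector(num_envs=3, wrappers=OffsetObservation, start=5)
batch = [env.reset() for env in envs]  # one observation per copy
print(batch)
```

Each copy is constructed separately, so the copies share no state; a real vectorized environment additionally stacks the per-copy results into arrays.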
VectorEnv
- gym.vector.VectorEnv.action_space
The (batched) action space. The input actions of step must be valid elements of action_space:
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.action_space
MultiDiscrete([2 2 2])
- gym.vector.VectorEnv.observation_space
The (batched) observation space. The observations returned by reset and step are valid elements of observation_space:
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.observation_space
Box([[-4.8 ...]], [[4.8 ...]], (3, 4), float32)
- gym.vector.VectorEnv.single_action_space
The action space of a single environment copy:
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.single_action_space
Discrete(2)
- gym.vector.VectorEnv.single_observation_space
The observation space of a single environment copy:
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.single_observation_space
Box([-4.8 ...], [4.8 ...], (4,), float32)
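The relationship between the single and batched spaces in the CartPole outputs above can be sketched in plain Python: the batched observation shape prepends num_envs to the single shape, and the batched discrete action space repeats the single size once per copy. The helpers `batched_shape` and `batched_discrete` are hypothetical names used only for this illustration; they are not Gym functions.

```python
from typing import List, Tuple


def batched_shape(single_shape: Tuple[int, ...], num_envs: int) -> Tuple[int, ...]:
    """Shape of a batch holding one array of single_shape per environment copy."""
    return (num_envs,) + tuple(single_shape)


def batched_discrete(n: int, num_envs: int) -> List[int]:
    """Per-copy action counts, mirroring Discrete(n) -> MultiDiscrete([n, ..., n])."""
    return [n] * num_envs


obs_shape = batched_shape((4,), num_envs=3)    # single Box shape (4,)
action_sizes = batched_discrete(2, num_envs=3)  # single Discrete(2)
print(obs_shape, action_sizes)
```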
Reset
- VectorEnv.reset(*, seed: Optional[Union[int, List[int]]] = None, options: Optional[dict] = None)
Reset all parallel environments and return a batch of initial observations.
- Parameters:
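seed – The seed or seeds used to reset the environments: a single integer or a list with one integer per environment.
options – Optional dictionary of additional reset options.
- Returns:
A batch of observations from the vectorized environment, together with an info dictionary.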
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> envs.reset()
(array([[-0.02240574, -0.03439831, -0.03904812, 0.02810693],
[ 0.01586068, 0.01929009, 0.02394426, 0.04016077],
[-0.01314174, 0.03893502, -0.02400815, 0.0038326 ]],
dtype=float32), {})
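The seed parameter accepts either one integer or a list of integers. The sketch below shows one way a single seed could be spread across the copies. It assumes, which the text above does not state, that an integer seed is expanded to seed + i for copy i, as Gym's vector environments are understood to do; `expand_seed` is a hypothetical helper, not part of the Gym API.

```python
from typing import List, Optional, Union


def expand_seed(
    seed: Optional[Union[int, List[int]]], num_envs: int
) -> List[Optional[int]]:
    """Return one seed per environment copy (assumed scheme: seed + index)."""
    if seed is None:
        return [None] * num_envs  # every copy chooses its own seed
    if isinstance(seed, int):
        return [seed + i for i in range(num_envs)]
    if len(seed) != num_envs:
        raise ValueError(f"expected {num_envs} seeds, got {len(seed)}")
    return list(seed)


print(expand_seed(42, num_envs=3))
print(expand_seed([7, 8, 9], num_envs=3))
```

Giving each copy a distinct seed keeps the copies from producing identical trajectories while the run stays reproducible.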
Step
- VectorEnv.step(actions)
Take an action for each parallel environment.
- Parameters:
actions – Batch of actions; must be an element of action_space.
- Returns:
A batch of (observations, rewards, terminated, truncated, infos) or, under the older step API, (observations, rewards, dones, infos).
>>> import numpy as np
>>> envs = gym.vector.make("CartPole-v1", num_envs=3)
>>> _ = envs.reset()
>>> actions = np.array([1, 0, 1])
>>> observations, rewards, dones, infos = envs.step(actions)
>>> observations
array([[ 0.00122802, 0.16228443, 0.02521779, -0.23700266],
[ 0.00788269, -0.17490888, 0.03393489, 0.31735462],
[ 0.04918966, 0.19421194, 0.02938497, -0.29495203]],
dtype=float32)
>>> rewards
array([1., 1., 1.])
>>> dones
array([False, False, False])
>>> infos
{}
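Because step may return either a five-element or a four-element batch, calling code that has to work with both can normalize the result first. The following is a minimal sketch, assuming an environment copy counts as done when it is either terminated or truncated; `normalize_step` is a hypothetical helper, not part of the Gym API, and the sample values are made up for illustration.

```python
def normalize_step(result):
    """Return (observations, rewards, dones, infos) for either return form."""
    if len(result) == 5:
        observations, rewards, terminated, truncated, infos = result
        # Assumption: a copy is done if it terminated or was truncated.
        dones = [bool(t) or bool(tr) for t, tr in zip(terminated, truncated)]
        return observations, rewards, dones, infos
    if len(result) == 4:
        return tuple(result)
    raise ValueError(f"unexpected step result of length {len(result)}")


# Made-up batches for three environment copies.
new_style = ([0.1, 0.2, 0.3], [1.0, 1.0, 1.0],
             [False, True, False], [False, False, True], {})
old_style = ([0.1, 0.2, 0.3], [1.0, 1.0, 1.0], [False, True, True], {})

print(normalize_step(new_style)[2])
print(normalize_step(old_style)[2])
```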