mtenv package

Submodules

mtenv.core module

Core API of MultiTask Environments for Reinforcement Learning.

class mtenv.core.MTEnv(action_space: gym.spaces.space.Space, env_observation_space: gym.spaces.space.Space, task_observation_space: gym.spaces.space.Space)[source]

Bases: gym.core.Env, abc.ABC

Main class for multitask RL Environments.

This abstract class extends the OpenAI Gym environment and adds support for returning task-specific information from the environment. The observation returned by the single-task environment is encoded as env_obs (environment observation), while the task-specific observation is encoded as task_obs (task observation). The observation returned by mtenv is a dictionary of env_obs and task_obs. Since this class extends OpenAI Gym, the mtenv API looks similar to the Gym API.

import mtenv
env = mtenv.make('xxx')
env.reset()

Any multitask RL environment class should extend/implement this class.

Parameters
  • action_space (Space) –

  • env_observation_space (Space) –

  • task_observation_space (Space) –
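The contract above can be sketched with a toy subclass. The example below does not import mtenv or gym (to stay self-contained); the class, the goal-based task, and all concrete numbers are illustrative assumptions, but the method names and return shapes mirror the documented API.

```python
import random
from typing import Any, Dict, List, Optional, Tuple

class ToyMultiTaskEnv:
    """Toy 1-D goal-reaching env mirroring the MTEnv contract.

    Each task is a goal position on a line; task_state is the goal value.
    """

    def __init__(self) -> None:
        self._env_rng: Optional[random.Random] = None   # set by seed()
        self._task_rng: Optional[random.Random] = None  # set by seed_task()
        self._goal = 0.0  # the task_state
        self._pos = 0.0

    def seed(self, seed: Optional[int] = None) -> List[int]:
        self._env_rng = random.Random(seed)
        return [seed]

    def seed_task(self, seed: Optional[int] = None) -> List[int]:
        self._task_rng = random.Random(seed)
        return [seed]

    def get_task_obs(self) -> float:
        return self._goal

    def get_task_state(self) -> Any:
        return self._goal

    def set_task_state(self, task_state: Any) -> None:
        self._goal = task_state

    def sample_task_state(self) -> Any:
        assert self._task_rng is not None, "call seed_task() first"
        return self._task_rng.uniform(-1.0, 1.0)

    def reset_task_state(self) -> None:
        self.set_task_state(self.sample_task_state())

    def reset(self) -> Dict[str, Any]:
        assert self._env_rng is not None, "call seed() first"
        self._pos = self._env_rng.uniform(-0.1, 0.1)
        return {"env_obs": self._pos, "task_obs": self._goal}

    def step(self, action: float) -> Tuple[Dict[str, Any], float, bool, Dict[str, Any]]:
        self._pos += action
        reward = -abs(self._pos - self._goal)  # closer to the goal -> higher reward
        done = abs(self._pos - self._goal) < 1e-2
        return {"env_obs": self._pos, "task_obs": self._goal}, reward, done, {}

env = ToyMultiTaskEnv()
env.seed(0)
env.seed_task(0)
env.reset_task_state()
obs = env.reset()
```

Note that every observation is a dictionary with env_obs and task_obs keys, and that the environment and the task random number generators are seeded independently.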

assert_env_seed_is_set() → None[source]

Check that seed (for the environment) is set.

The reset function should invoke this method before resetting the environment (for reproducibility).

assert_task_seed_is_set() → None[source]

Check that seed (for the task) is set.

The sample_task_state function should invoke this method before sampling a new task state (for reproducibility).

get_task_obs() → Union[str, int, float, numpy.ndarray][source]

Get the current value of the task observation.

The environment returns the task observation every time step or reset is called. This function is useful when the user wants to access the task observation without acting in (or resetting) the environment.

Returns

The current task observation.

Return type

TaskObsType

abstract get_task_state() → Any[source]

Return all the information needed to execute the current task again.

This function is useful when we want to set the environment to a previous task.

Returns

For more information on task_state, refer to Task State.

Return type

TaskStateType

abstract reset() → Dict[str, Union[numpy.ndarray, str, int, float]][source]

Reset the environment to some initial state and return the observation in the new state.

Subclasses extending this class should ensure that the environment seed is set (by calling seed(int)) before this method is invoked (for reproducibility). This can be checked by invoking self.assert_env_seed_is_set().

Returns

For more information on multitask observation returned by the environment, refer MultiTask Observation.

Return type

ObsType
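The multitask observation is a plain dictionary with the two documented keys; consuming code typically unpacks it as follows (the concrete values here are placeholders):

```python
# Shape of a multitask observation returned by reset() or step()
obs = {"env_obs": [0.0, 1.0, 2.0], "task_obs": 3}

env_obs = obs["env_obs"]    # what the single-task env would have returned
task_obs = obs["task_obs"]  # identifies/describes the current task
```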

reset_task_state() → None[source]

Sample a new task_state and set the environment to that task_state.

For more information on task_state, refer to Task State.

abstract sample_task_state() → Any[source]

Sample a task_state.

task_state contains all the information that the environment needs to switch to any other task.

Subclasses extending this class should ensure that the task seed is set (by calling seed(int)) before this method is invoked (for reproducibility). This can be checked by invoking self.assert_task_seed_is_set().

Returns

For more information on task_state, refer to Task State.

Return type

TaskStateType

seed(seed: Optional[int] = None) → List[int][source]

Set the seed for the environment’s random number generator.

Invoke seed_task to set the seed for the task’s random number generator.

Parameters

seed (Optional[int], optional) – Defaults to None.

Returns

The list of seeds used by the environment’s random number generator. The first value in the list is the seed to pass to this method to reproduce the same sequence.

Return type

List[int]

seed_task(seed: Optional[int] = None) → List[int][source]

Set the seed for the task’s random number generator.

Invoke seed to set the seed for the environment’s random number generator.

Parameters

seed (Optional[int], optional) – Defaults to None.

Returns

The list of seeds used by the task’s random number generator. The first value in the list is the seed to pass to this method to reproduce the same sequence.

Return type

List[int]
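Because the environment and the task have separate random number generators, reproducing a run requires seeding both. The reproducibility contract can be sketched with Python's random module (the generators and value ranges below are assumptions for illustration, not mtenv internals):

```python
import random
from typing import Tuple

def make_rngs(env_seed: int, task_seed: int) -> Tuple[random.Random, random.Random]:
    # seed() and seed_task() each back an independent generator
    return random.Random(env_seed), random.Random(task_seed)

env_rng_a, task_rng_a = make_rngs(7, 11)
env_rng_b, task_rng_b = make_rngs(7, 11)

# Same pair of seeds -> same sampled tasks and same environment randomness
tasks_a = [task_rng_a.uniform(-1, 1) for _ in range(3)]
tasks_b = [task_rng_b.uniform(-1, 1) for _ in range(3)]
```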

abstract set_task_state(task_state: Any) → None[source]

Reset the environment to a particular task.

task_state contains all the information that the environment needs to switch to any other task.

Parameters

task_state (TaskStateType) – For more information on task_state, refer to Task State.

abstract step(action: Union[str, int, float, numpy.ndarray]) → Tuple[Dict[str, Union[numpy.ndarray, str, int, float]], float, bool, Dict[str, Any]][source]

Execute the action in the environment.

Parameters

action (ActionType) – The action to execute in the environment.

Returns

Tuple of multitask observation, reward, done, and info. For more information on multitask observation returned by the environment, refer MultiTask Observation.

Return type

StepReturnType
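The step return is the familiar four-tuple from Gym, with the multitask observation dictionary in the first slot. A toy step function with this shape follows; the 1-D dynamics and reward are illustrative assumptions:

```python
from typing import Any, Dict, Tuple

def toy_step(pos: float, goal: float, action: float) -> Tuple[Dict[str, Any], float, bool, Dict[str, Any]]:
    """Illustrative step() with the documented StepReturnType shape."""
    pos += action
    obs = {"env_obs": pos, "task_obs": goal}  # multitask observation
    reward = -abs(pos - goal)                 # closer to the goal -> higher reward
    done = abs(pos - goal) < 1e-6
    return obs, reward, done, {}

obs, reward, done, info = toy_step(pos=0.0, goal=1.0, action=1.0)
```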

Module contents

class mtenv.MTEnv(action_space: gym.spaces.space.Space, env_observation_space: gym.spaces.space.Space, task_observation_space: gym.spaces.space.Space)[source]

Bases: gym.core.Env, abc.ABC

Alias of mtenv.core.MTEnv, re-exported at the package level. See mtenv.core.MTEnv above for the full documentation.

mtenv.make(id: str, **kwargs: Any) → gym.core.Env[source]

Create an environment, given its id and optional keyword arguments, and return it (analogous to gym.make).