mtenv.envs.hipbmdp package¶
Subpackages¶
Submodules¶
mtenv.envs.hipbmdp.dmc_env module¶
-
mtenv.envs.hipbmdp.dmc_env.
build_dmc_env
(domain_name: str, task_name: str, seed: int, xml_file_id: str, visualize_reward: bool, from_pixels: bool, height: int, width: int, frame_skip: int, frame_stack: int, sticky_observation_cfg: Dict[str, Any]) → gym.core.Env[source]¶ Build a single DMC environment as described in [TTM+20].
- Parameters
domain_name (str) – name of the domain.
task_name (str) – name of the task.
seed (int) – environment seed (for reproducibility).
xml_file_id (str) – id of the xml file to use.
visualize_reward (bool) – should visualize reward ?
from_pixels (bool) – return pixel observations?
height (int) – height of pixel frames.
width (int) – width of pixel frames.
frame_skip (int) – should skip frames?
frame_stack (int) – should stack frames together?
sticky_observation_cfg (Dict[str, Any]) – Configuration for using sticky observations. It should be a dictionary with three keys, should_use which specifies if the config should be used, sticky_probability which specifies the probability of choosing a previous task and last_k which specifies the number of previous frames to choose from.
- Returns
- Return type
Env
mtenv.envs.hipbmdp.env module¶
-
mtenv.envs.hipbmdp.env.
build
(domain_name: str, task_name: str, seed: int, xml_file_ids: List[str], visualize_reward: bool, from_pixels: bool, height: int, width: int, frame_skip: int, frame_stack: int, sticky_observation_cfg: Dict[str, Any], initial_task_state: int = 1) → mtenv.core.MTEnv[source]¶ Build multitask environment as described in HiPBMDP paper. See [ZSKP20] for more details.
- Parameters
domain_name (str) – name of the domain.
task_name (str) – name of the task.
seed (int) – environment seed (for reproducibility).
xml_file_ids (List[str]) – ids of xml files.
visualize_reward (bool) – should visualize reward ?
from_pixels (bool) – return pixel observations?
height (int) – height of pixel frames.
width (int) – width of pixel frames.
frame_skip (int) – should skip frames?
frame_stack (int) – should stack frames together?
sticky_observation_cfg (Dict[str, Any]) – Configuration for using sticky observations. It should be a dictionary with three keys, should_use which specifies if the config should be used, sticky_probability which specifies the probability of choosing a previous task and last_k which specifies the number of previous frames to choose from.
initial_task_state (int, optional) – intial task/environment to select. Defaults to 1.
- Returns
- Return type