minerl.data

The minerl.data package provides a unified interface for sampling data from the MineRL-v0 Dataset. Data is accessed by making a dataset from one of the minerl environments and iterating over it using one of the iterators provided by the minerl.data.DataPipeline

The following is a description of the various methods included within the package as well as some basic usage examples. To see more detailed descriptions and tutorials on how to use the data API, please take a look at our numerous getting started manuals.

MineRLv0

minerl.data.make(environment=None, data_dir=None, num_workers=4, worker_batch_size=32, minimum_size_to_dequeue=32, force_download=False)

Initalizes the data loader with the chosen environment

Parameters
  • environment (string) – desired MineRL environment

  • data_dir (string, optional) – specify alternative dataset location. Defaults to None.

  • num_workers (int, optional) – number of files to load at once. Defaults to 4.

  • force_download (bool, optional) – specifies whether or not the data should be downloaded if missing. Defaults to False.

Returns

initalized data pipeline

Return type

DataPipeline

minerl.data.download(directory=None, resolution='low', texture_pack=0, update_environment_variables=True, disable_cache=False, experiment=None, minimal=False)

Downloads MineRLv0 to specified directory. If directory is None, attempts to download to $MINERL_DATA_ROOT. Raises ValueError if both are undefined.

Parameters
  • directory (os.path) – destination root for downloading MineRLv0 datasets

  • resolution (str, optional) – one of [ ‘low’, ‘high’ ] corresponding to video resolutions of [ 64x64, 256,128 ] respectively (note: high resolution is not currently supported). Defaults to ‘low’.

  • texture_pack (int, optional) – 0: default Minecraft texture pack, 1: flat semi-realistic texture pack. Defaults to 0.

  • update_environment_variables (bool, optional) – enables / disables exporting of MINERL_DATA_ROOT environment variable (note: for some os this is only for the current shell) Defaults to True.

  • disable_cache (bool, optional) – downloads temporary files to local directory. Defaults to False

  • experiment (str, optional) – specify the desired experiment to download. Will only download data for this experiment. Note there is no hash verification for individual experiments

  • minimal (bool, optional) – download a minimal version of the dataset

class minerl.data.DataPipeline(data_directory: <module 'posixpath' from '/usr/lib/python3.6/posixpath.py'>, environment: str, num_workers: int, worker_batch_size: int, min_size_to_dequeue: int, random_seed=42)

Bases: object

Creates a data pipeline object used to itterate through the MineRL-v0 dataset

property action_space

Returns: action space of current MineRL environment

get_trajectory_names()

Gets all the trajectory names

Returns

[description]

Return type

A list of experiment names

load_data(stream_name: str, skip_interval=0, include_metadata=False)

Iterates over an individual trajectory named stream_name.

Parameters
  • stream_name (str) – The stream name desired to be iterated through.

  • skip_interval (int, optional) – How many sices should be skipped.. Defaults to 0.

  • include_metadata (bool, optional) – Whether or not meta data about the loaded trajectory should be included.. Defaults to False.

Yields

A tuple of (state, player_action, reward_from_action, next_state, is_next_state_terminal). These are tuples are yielded in order of the episode.

static map_to_dict(handler_list: list, target_space: <module 'gym.spaces.space' from '/home/wguss/.local/lib/python3.6/site-packages/gym/spaces/space.py'>)
property observation_space

Returns: action space of current MineRL environment

sarsd_iter(num_epochs=-1, max_sequence_len=32, queue_size=None, seed=None, include_metadata=False)

Returns a generator for iterating through (state, action, reward, next_state, is_terminal) tuples in the dataset. Loads num_workers files at once as defined in minerl.data.make() and return up to max_sequence_len consecutive samples wrapped in a dict observation space

Parameters
  • num_epochs (int, optional) – number of epochs to iterate over or -1 to loop forever. Defaults to -1

  • max_sequence_len (int, optional) – maximum number of consecutive samples - may be less. Defaults to 32

  • seed (int, optional) – seed for random directory walk - note, specifying seed as well as a finite num_epochs will cause the ordering of examples to be the same after every call to seq_iter

  • queue_size (int, optional) – maximum number of elements to buffer at a time, each worker may hold an additional item while waiting to enqueue. Defaults to 16*self.number_of_workers or 2* self.number_of_workers if max_sequence_len == -1

  • include_metadata (bool, optional) – adds an additional member to the tuple containing metadata about the stream the data was loaded from. Defaults to False

Yields

A tuple of (state, player_action, reward_from_action, next_state, is_next_state_terminal, (metadata)). Each element is in the format of the environment action/state/reward space and contains as many samples are requested.

seq_iter(num_epochs=-1, max_sequence_len=32, queue_size=None, seed=None, include_metadata=False)

DEPRECATED METHOD FOR SAMPLING DATA FROM THE MINERL DATASET.

This function is now DataPipeline.sarsd_iter()