Sign up to participate on [AIcrowd]!

This year, we are adding a new competition to the MineRL family: BASALT, a competition on solving human-judged tasks, with $11,000 in prizes. The tasks in this competition do not have a pre-defined reward function: the goal is to produce trajectories that are judged by real humans to be effective at solving a given task.

We realize this is somewhat uncharted territory for the ML community, and that it will require a different set of norms and training procedures - perhaps integrating demonstrations with sources of live human ranking, rating, or comparison to guide agents in the right direction. Our hope is that this competition can provide an impetus for the research community to build these new procedures, which we expect will become increasingly relevant as we want artificially intelligent systems to integrate into more areas of our lives.
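As a rough illustration of what learning from comparisons could look like (this is not part of the official starter kit, and all names below are placeholders), one common approach is to fit a reward model to pairwise human preferences over trajectory segments using a Bradley-Terry style loss:

```python
# Hypothetical sketch: fitting a reward model to pairwise human preferences.
# Names, shapes, and the data are illustrative, not part of the BASALT kit.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Maps an observation encoding to a scalar per-step reward."""
    def __init__(self, obs_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs):                 # obs: (batch, T, obs_dim)
        return self.net(obs).squeeze(-1)    # per-step rewards: (batch, T)

def preference_loss(model, seg_a, seg_b, prefs):
    """prefs[i] = 1 if a human preferred segment A over segment B, else 0."""
    # Sum rewards over each segment, then apply the Bradley-Terry model:
    # P(A preferred over B) = sigmoid(R(A) - R(B)).
    r_a = model(seg_a).sum(dim=1)
    r_b = model(seg_b).sum(dim=1)
    return nn.functional.binary_cross_entropy_with_logits(r_a - r_b, prefs)

# Stand-in data; real segments and labels would come from human comparisons.
model = RewardModel(obs_dim=128)
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
seg_a, seg_b = torch.randn(32, 50, 128), torch.randn(32, 50, 128)
prefs = torch.randint(0, 2, (32,)).float()
loss = preference_loss(model, seg_a, seg_b, prefs)
opt.zero_grad(); loss.backward(); opt.step()
```

The learned reward model could then be used to fine-tune a policy initialized from demonstrations; the competition deliberately leaves the choice of method open.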

Like the Diamond competition, BASALT provides a set of Gym environments paired with human demonstrations, since methods based on imitation are an important building block for solving hard-to-specify tasks.
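For concreteness, interacting with one of these environments follows the standard Gym loop. The sketch below assumes the `minerl` package is installed and that the FindCave task is registered under an id like `MineRLBasaltFindCave-v0`; check the starter kit for the exact environment ids.

```python
# Minimal interaction loop, assuming the minerl package registers the BASALT
# tasks as Gym environments (the env id shown here is assumed; see the kit).
import gym
import minerl  # noqa: F401  (importing registers the MineRL environments)

env = gym.make("MineRLBasaltFindCave-v0")
obs = env.reset()

done = False
while not done:
    action = env.action_space.sample()           # replace with your trained policy
    obs, reward, done, info = env.step(action)   # reward is always 0: no reward function

env.close()
```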

The Tasks

FindCave

The agent should search for a cave, and terminate the episode when it is inside one.

MakeWaterfall

After spawning in a mountainous area, the agent should build a beautiful waterfall and then reposition itself to take a scenic picture of the same waterfall.

CreateVillageAnimalPen

After spawning in a village, the agent should build an animal pen containing two of the same kind of animal next to one of the houses in the village.

BuildVillageHouse

Using items in its starting inventory, the agent should build a new house in the style of the village, in an appropriate location (e.g. next to the path through the village), without harming the village in the process.


Competition Overview

All submissions are through AIcrowd. There you can find detailed rules as well as the leaderboard.


The competition proceeds through four phases: submission of trained agents, a first evaluation round that produces a public leaderboard, a second evaluation round that determines final scores, and a validation phase.


Baseline submission

Our baseline is a simple behavioral cloning algorithm trained for a couple of hours. We hope to see participants improve upon it significantly!
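For a rough sense of what that involves, here is a heavily simplified behavioral cloning sketch in PyTorch; the actual baseline in the starter kit is more elaborate, and the network, action discretization, and data shown here are placeholders.

```python
# Simplified behavioral cloning sketch (PyTorch). The real baseline differs;
# the network and the stand-in batch below are illustrative only.
import torch
import torch.nn as nn

class Policy(nn.Module):
    """Maps a 64x64 RGB observation to logits over a discretized action set."""
    def __init__(self, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(512), nn.ReLU(),
            nn.Linear(512, n_actions),
        )

    def forward(self, pov):
        return self.net(pov)

def bc_update(policy, optimizer, pov, actions):
    """One behavioral cloning step: maximize log-likelihood of demo actions."""
    loss = nn.functional.cross_entropy(policy(pov), actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Stand-in batch; a real run would iterate over the human demonstrations.
policy = Policy(n_actions=32)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)
pov = torch.rand(16, 3, 64, 64)           # demo frames, normalized to [0, 1]
actions = torch.randint(0, 32, (16,))     # discretized demo actions
print(bc_update(policy, optimizer, pov, actions))
```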


Prizes

Thanks to the generosity of our sponsors, there will be $11,000 worth of cash prizes:

In addition, the top three teams will be invited to coauthor the competition report.

Note that, since we do not expect to be able to evaluate every submission in full, prizes may be restricted to entries that reach the second evaluation phase or the validation phase, at the organizers’ discretion. Prize winners are expected to present their solutions at NeurIPS.

We also have an additional $1,000 worth of prizes for participants who provide support for the competition:

Team

The organizing team consists of:

Advisors:

Sponsors:

Contact

If you have any questions, please feel free to contact us at rohinmshah AT berkeley DOT edu.

Citation

The MineRL BASALT Competition on Learning from Human Feedback

Rohin Shah, Cody Wild, Steven H. Wang, Neel Alex, Brandon Houghton, William Guss, Sharada Mohanty, Anssi Kanervisto, Stephanie Milani, Nicholay Topin, Pieter Abbeel, Stuart Russell, Anca Dragan

NeurIPS 2021 Competition Track

2021

[BibTeX] [Competition Details]