Note: Project ongoing

Left: During training, Alice will propose progressively more challenging tasks for Bob to complete. Right: On the target task, Bob must navigate to the specified point.


We outline a method for training agents on navigation tasks in realistic 3D environments. Specifically, we train agents on the pointgoal navigation and object navigation tasks described in the Habitat Challenge. The agent uses asymmetric self-play [1] to develop a curriculum of progressively more challenging navigation goals, and is then fine-tuned on the navigation datasets provided by Habitat. As described in [1], the agent has two "minds": Alice, which proposes progressively more challenging tasks, and Bob, which attempts to complete them. The target task, which in our case is a navigation task, is what Bob is fine-tuned on once self-play training is complete.
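To make the curriculum mechanism concrete, here is a minimal sketch of the self-play reward structure as we understand it from [1]: Bob is penalized for the time he takes, and Alice is rewarded when Bob takes longer than she did, which pushes her toward tasks that are just beyond Bob's current ability. The function name, the dataclass, and the default value of `gamma` are illustrative assumptions, not part of this project's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SelfPlayRewards:
    alice: float
    bob: float


def self_play_rewards(
    t_alice: int,
    t_bob: Optional[int],
    t_max: int,
    gamma: float = 0.01,  # assumed reward scale; tune per environment
) -> SelfPlayRewards:
    """Hypothetical helper computing asymmetric self-play rewards.

    t_alice: steps Alice used to set the task.
    t_bob:   steps Bob used to complete it, or None if he failed.
    t_max:   step budget shared by Alice and Bob in one episode.
    """
    if not 0 <= t_alice <= t_max:
        raise ValueError("t_alice must lie in [0, t_max]")
    # If Bob fails, he is charged the whole remaining step budget.
    if t_bob is None:
        t_bob = t_max - t_alice
    # Bob is rewarded for finishing quickly.
    bob_reward = -gamma * t_bob
    # Alice is rewarded only when Bob needs more time than she did.
    alice_reward = gamma * max(0, t_bob - t_alice)
    return SelfPlayRewards(alice=alice_reward, bob=bob_reward)
```

Under this scheme a task that Bob solves faster than Alice set it earns Alice nothing, so she gains only by finding goals that remain hard for Bob.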

Code




References

[1] Sainbayar Sukhbaatar et al. 2018. Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play. arXiv:1703.05407. Retrieved from https://arxiv.org/pdf/1703.05407.pdf