Learning Stylized Humanoid Locomotion with Adversarial Motion Priors

1Stanford University, 2University of California, Berkeley, 3Simon Fraser University, 4DeepMind
*Full author list coming soon...
*Full paper coming soon...

Abstract

Obtaining controllers for high-dimensional continuous control of humanoid robots remains a challenging problem. Reinforcement learning has shown promise as a general method for learning locomotion controllers across a diverse range of motions and environments; however, specifying a reward function and transferring to real hardware remain open challenges. In this work, we build upon Adversarial Motion Priors (AMP) and introduce a framework that learns a reward function from a small amount of motion capture data. Using this reward function, we learn a single multi-task controller for humanoid locomotion that produces natural-looking behavior for several different motions. Moreover, our controller produces energy-efficient motions without explicitly optimizing for energy terms, transitions smoothly between different motions, and can be deployed on real hardware using a custom domain randomization procedure.
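To make the reward-learning idea concrete, below is a minimal Python sketch of the discriminator-based style reward described in the original AMP paper, which this framework builds on. The network size, transition features, and PyTorch implementation are illustrative assumptions, and the discriminator's training loop (least-squares objective with a gradient penalty) is omitted.

# Minimal sketch of an AMP-style discriminator reward.
# Assumptions: PyTorch, a simple MLP discriminator, and state-transition
# features (s, s') as input; the real feature set and network are not
# specified on this page.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Scores whether a state transition looks like the mocap reference data."""
    def __init__(self, obs_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s: torch.Tensor, s_next: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([s, s_next], dim=-1)).squeeze(-1)

def style_reward(d_out: torch.Tensor) -> torch.Tensor:
    # Least-squares GAN formulation used by AMP: the discriminator is trained
    # to output +1 on reference transitions and -1 on policy transitions, and
    # the policy receives r_style = max(0, 1 - 0.25 * (D(s, s') - 1)^2).
    return torch.clamp(1.0 - 0.25 * (d_out - 1.0) ** 2, min=0.0)

# Example: score a batch of policy-generated transitions.
disc = Discriminator(obs_dim=30)
s, s_next = torch.randn(8, 30), torch.randn(8, 30)
r_style = style_reward(disc(s, s_next))  # shape (8,), values in [0, 1]

In this way the learned style reward replaces a hand-designed imitation term: the policy is rewarded whenever its transitions are hard to distinguish from the motion capture data.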

Natural-Looking Skills

Using Adversarial Motion Priors (AMP) allows the robot to perform skills in the style of a human. A single multi-task policy learns all skills; a sketch of one possible policy interface follows the clips below.

Cat-Walking

Walking

Jogging
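As a rough illustration of what "a single multi-task policy" can look like, the sketch below conditions one network on a skill selector and a speed command. This interface is hypothetical: the actual observation layout, skill encoding, and action space used in the paper are not specified on this page.

# Hypothetical sketch of a command-conditioned multi-task policy: one network
# produces actions for all gaits, with the desired skill and speed supplied as
# part of the observation. All sizes and the one-hot skill encoding are
# illustrative assumptions.
import torch
import torch.nn as nn

NUM_SKILLS = 3  # e.g. cat-walk, walk, jog (illustrative)

class MultiTaskPolicy(nn.Module):
    def __init__(self, proprio_dim: int, act_dim: int, hidden: int = 512):
        super().__init__()
        # proprioception + skill one-hot + scalar speed command
        obs_dim = proprio_dim + NUM_SKILLS + 1
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ELU(),
            nn.Linear(hidden, hidden), nn.ELU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, proprio, skill_onehot, speed_cmd):
        obs = torch.cat([proprio, skill_onehot, speed_cmd], dim=-1)
        return self.net(obs)  # e.g. joint position targets

# Example: query the same policy for a "jog" action at 2.0 m/s.
policy = MultiTaskPolicy(proprio_dim=45, act_dim=19)
proprio = torch.randn(1, 45)
skill = torch.nn.functional.one_hot(torch.tensor([2]), NUM_SKILLS).float()
speed = torch.tensor([[2.0]])
action = policy(proprio, skill, speed)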

Skill Interpolation

Unlike prior motion-tracking objectives, the GAN objective in AMP rewards any transition that resembles the reference data rather than tracking a single clip, which allows smooth, natural-looking transitions between motions (see the reward sketch after the clips below).

Walk → Jog

Jog → Walk

Jog (increasing speed)

Acknowledgements

We thank Bike Zhang in the Hybrid Robotics Lab at UC Berkeley for extensive help with setting up our hardware experiments. This project was partially completed during Michael's internship at Google Brain Robotics (now DeepMind), where he was supported by Jie Tan. Michael is supported by the National Defense Science and Engineering Graduate (NDSEG) Fellowship through the Office of Naval Research (ONR).