Obtaining whole-body bipedal locomotion controllers that produce motions in the style of a human remains a challenging problem. To produce human-like locomotion, the robot must not only stabilize itself during movement but also execute complex whole-body motions that, while not necessarily aiding stability, enhance the expressiveness and energy efficiency of the motion. Reinforcement learning has shown promise in learning controllers for dynamic motions; however, developing controllers for these stylized motions and transferring them to real hardware remains an open challenge. In this work, we introduce a framework for learning such a controller by leveraging Adversarial Motion Priors (AMP). Unlike other methods for stylized locomotion, AMP readily accommodates multiple motions, but it often suffers from mode collapse to a single motion. To address this challenge, we adopt a multi-task RL setup and demonstrate the added benefit that learning multiple skills brings to policy robustness. We validate the effectiveness of our method through extensive benchmarking on multiple bipedal humanoid models. Moreover, our controller generates energy-efficient motions without explicit optimization for energy consumption, transitions smoothly between different motions, and is deployable on real humanoid robots, demonstrating the first multi-skill locomotion control policy capable of handling diverse stylized humanoid motions.
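The abstract does not spell out the multi-task setup. As a minimal sketch of one standard way such a setup is realized, the policy below is conditioned on a one-hot skill command, so each motion style corresponds to a distinct commanded task and the adversarial objective cannot collapse onto a single style. All names here (SkillConditionedPolicy, obs_dim, num_skills) are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class SkillConditionedPolicy(nn.Module):
    """Hypothetical multi-task policy: proprioceptive observations are
    concatenated with a one-hot skill command (e.g. walk / jog / cat-walk)
    so a single network represents several motion styles. A sketch of a
    common multi-task RL conditioning scheme, not the paper's architecture."""

    def __init__(self, obs_dim: int, act_dim: int, num_skills: int,
                 hidden: int = 256):
        super().__init__()
        self.num_skills = num_skills
        self.net = nn.Sequential(
            nn.Linear(obs_dim + num_skills, hidden), nn.ELU(),
            nn.Linear(hidden, hidden), nn.ELU(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs: torch.Tensor, skill_id: torch.Tensor) -> torch.Tensor:
        # skill_id: (batch,) integer skill labels -> one-hot skill command.
        skill = nn.functional.one_hot(skill_id, self.num_skills).float()
        return self.net(torch.cat([obs, skill], dim=-1))
```

Making style selection an explicit part of the task specification is one common way to counter a discriminator's tendency to reward only the easiest-to-imitate motion.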
[Figure: Walk, Jog, and Cat-Walk motions demonstrated across multiple bipedal humanoid models.]
[Figure: Walk → Jog, Jog → Walk, and Jog (increasing speed) transitions. Unlike common motion-tracking rewards that match per-timestep motion data, AMP rewards match the distribution of policy data to that of reference data, allowing for smooth, natural-looking transitions between motions.]
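To make the distinction in the caption concrete, the sketch below contrasts a per-timestep tracking reward with an AMP-style discriminator reward. The least-squares reward shaping follows the original AMP formulation (Peng et al., 2021); the discriminator architecture and state features are placeholder assumptions, not this paper's exact design.

```python
import torch
import torch.nn as nn

def tracking_reward(pose: torch.Tensor, ref_pose_t: torch.Tensor,
                    sigma: float = 0.25) -> torch.Tensor:
    """Per-timestep tracking reward: high only when the policy's pose
    matches the reference pose at the same timestep t."""
    err = torch.sum((pose - ref_pose_t) ** 2, dim=-1)
    return torch.exp(-err / sigma ** 2)

def amp_style_reward(disc: nn.Module, s: torch.Tensor,
                     s_next: torch.Tensor) -> torch.Tensor:
    """AMP-style reward: a discriminator D(s, s') scores whether a state
    transition looks drawn from the reference-motion distribution.
    Least-squares shaping r = max(0, 1 - 0.25 * (D - 1)^2), as in
    Peng et al. (2021). No time index appears, so any transition that
    resembles the data is rewarded."""
    with torch.no_grad():
        d = disc(torch.cat([s, s_next], dim=-1)).squeeze(-1)
    return torch.clamp(1.0 - 0.25 * (d - 1.0) ** 2, min=0.0)
```

Because the discriminator scores individual transitions rather than agreement with a time-indexed clip, any transition that resembles the reference distribution earns reward, which is what permits smooth blending between motions.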