Key-Frame Based Motion Representations for Pose Sequences
Abstract
Modelling human motion is critical for computer vision tasks that aim to perceive human behaviour. Extending current learning-based approaches to successfully model long-term motions remains a challenge. Recent works rely on autoregressive methods, in which motions are modelled sequentially. These methods tend to accumulate errors and, when applied to typical motion modelling tasks, are limited to only four seconds of motion. We present a non-autoregressive framework that represents motion sequences as a set of learned key-frames without explicit supervision. We explore continuous and discrete generative frameworks for this task and design a key-framing transformer architecture that distills a motion sequence into key-frames and their relative placements in time. We validate our learned key-frame placement by comparing it against a naive uniform placement strategy, and further compare key-frame distillation using our transformer architecture against a common alternative sequence modelling approach. We demonstrate the effectiveness of our method by reconstructing motions of up to 12 seconds.
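
To make the idea of key-frame distillation concrete, the sketch below shows one plausible way such a key-framing transformer could be structured: learned query tokens attend over an encoded pose sequence and are decoded into key-frame poses plus their relative placements in time. This is a minimal illustrative sketch, not the authors' released code; all module names, dimensions, and design choices (e.g. sigmoid-normalised placements) are assumptions.

```python
# Minimal sketch of a key-framing transformer (illustrative assumptions only).
import torch
import torch.nn as nn


class KeyFrameDistiller(nn.Module):
    def __init__(self, pose_dim=63, d_model=256, num_keyframes=8,
                 num_layers=4, num_heads=8, max_len=512):
        super().__init__()
        self.embed = nn.Linear(pose_dim, d_model)
        # Learned positional embeddings so the encoder can localise frames in time.
        self.pos = nn.Parameter(torch.randn(1, max_len, d_model))
        # One learned query per key-frame; each attends over the full sequence.
        self.keyframe_queries = nn.Parameter(torch.randn(num_keyframes, d_model))
        enc_layer = nn.TransformerEncoderLayer(d_model, num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
        dec_layer = nn.TransformerDecoderLayer(d_model, num_heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
        self.to_pose = nn.Linear(d_model, pose_dim)  # key-frame poses
        self.to_time = nn.Linear(d_model, 1)         # relative placement in (0, 1)

    def forward(self, poses):
        # poses: (batch, seq_len, pose_dim), e.g. 12 s of motion at 30 fps.
        memory = self.encoder(self.embed(poses) + self.pos[:, :poses.size(1)])
        queries = self.keyframe_queries.unsqueeze(0).expand(poses.size(0), -1, -1)
        keyframes = self.decoder(queries, memory)
        key_poses = self.to_pose(keyframes)
        # Sigmoid keeps each placement in (0, 1); other parameterisations are possible.
        key_times = torch.sigmoid(self.to_time(keyframes)).squeeze(-1)
        return key_poses, key_times


if __name__ == "__main__":
    model = KeyFrameDistiller()
    motion = torch.randn(2, 360, 63)         # 2 clips, 360 frames, 63-D poses
    key_poses, key_times = model(motion)
    print(key_poses.shape, key_times.shape)  # (2, 8, 63) (2, 8)
```

A full system would pair such a distiller with a decoder that reconstructs the dense motion from the key-frames and placements, and with the continuous or discrete generative framework described in the abstract; those components are omitted here.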