Supervisor: Derpanis, Konstantinos
Author: Thasarathan, Harrish Patrick
Date deposited: 2024-03-18
Date available: 2024-03-18
Date issued: 2024-03-16
URI: https://hdl.handle.net/10315/41895

Abstract: Modelling human motion is critical for computer vision tasks that aim to perceive human behaviour. Extending current learning-based approaches to successfully model long-term motions remains a challenge. Recent works rely on autoregressive methods, in which motions are modelled sequentially. These methods tend to accumulate errors and, when applied to typical motion modelling tasks, are limited to only four seconds. We present a non-autoregressive framework that represents motion sequences as a set of learned key-frames, without explicit supervision. We explore continuous and discrete generative frameworks for this task and design a key-framing transformer architecture that distills a motion sequence into key-frames and their relative placements in time. We validate our learned key-frame placement approach against a naive uniform placement strategy, and further compare key-frame distillation using our transformer architecture with an alternative, commonly used sequence modelling approach. We demonstrate the effectiveness of our method by reconstructing motions of up to 12 seconds.

Rights: Author owns copyright, except where explicitly noted. Please contact the author directly with licensing requests.

Subject: Computer science
Title: Key-Frame Based Motion Representations for Pose Sequences
Type: Electronic Thesis or Dissertation
Date updated: 2024-03-16
Keywords: Computer vision; Deep learning; Generative modelling; Motion modelling
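
The abstract describes a key-framing transformer that distills a pose sequence into a small set of key-frames together with their relative placements in time. The following is a minimal illustrative sketch of that idea, not the thesis's actual architecture: the module name (KeyFrameDistiller), the use of learned query tokens, and all dimensions (pose_dim, num_keyframes, d_model) are assumptions chosen for the example.

```python
import torch
import torch.nn as nn

class KeyFrameDistiller(nn.Module):
    """Sketch of a key-framing transformer: K learned queries cross-attend over
    an encoded pose sequence and emit K key-frame poses plus their normalised
    placements in time. Hypothetical design, not the thesis implementation."""

    def __init__(self, pose_dim=63, d_model=256, num_keyframes=8, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Linear(pose_dim, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
        # Learned key-frame queries (assumed here), one per distilled key-frame.
        self.queries = nn.Parameter(torch.randn(num_keyframes, d_model))
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)
        self.to_pose = nn.Linear(d_model, pose_dim)  # key-frame pose readout
        self.to_time = nn.Linear(d_model, 1)         # placement logit readout

    def forward(self, poses):
        # poses: (batch, T, pose_dim) sequence of pose vectors.
        memory = self.encoder(self.embed(poses))
        queries = self.queries.unsqueeze(0).expand(poses.size(0), -1, -1)
        kf = self.decoder(queries, memory)           # (batch, K, d_model)
        key_poses = self.to_pose(kf)                 # (batch, K, pose_dim)
        # Softmax deltas followed by a cumulative sum give monotone
        # key-frame placements in (0, 1] relative to the sequence length.
        deltas = torch.softmax(self.to_time(kf).squeeze(-1), dim=-1)
        key_times = torch.cumsum(deltas, dim=-1)
        return key_poses, key_times


if __name__ == "__main__":
    model = KeyFrameDistiller()
    poses = torch.randn(2, 360, 63)          # e.g. 12 s of motion at 30 fps
    key_poses, key_times = model(poses)
    print(key_poses.shape, key_times.shape)  # (2, 8, 63), (2, 8)
```

A decoder (not shown) could then reconstruct the full-length motion non-autoregressively by interpolating or attending between the predicted key-frames at their predicted times, which is the reconstruction setting the abstract evaluates at up to 12 seconds.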