Web: http://arxiv.org/abs/2201.05675

Jan. 24, 2022, 2:11 a.m. | John Ridley, Huseyin Coskun, David Joseph Tan, Nassir Navab, Federico Tombari

cs.LG updates on arXiv.org

The video action segmentation task is regularly explored under weaker forms
of supervision, such as transcript supervision, where a list of actions is
easier to obtain than dense frame-wise labels. In this formulation, the task
presents various challenges for sequence modeling approaches due to the
emphasis on action transition points, long sequence lengths, and frame
contextualization, making the task well-posed for transformers. Given
developments enabling transformers to scale linearly, we demonstrate through
our architecture how they can be applied to …
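
As an illustration of the two forms of supervision the abstract contrasts, here is a minimal Python sketch. It is not taken from the paper: the function names `frames_to_transcript` and `transition_points` and the example action labels are hypothetical. It collapses dense frame-wise labels into the ordered action list that a transcript provides, and shows the transition points that a transcript alone does not specify.

```python
from itertools import groupby


def frames_to_transcript(frame_labels):
    """Collapse dense frame-wise labels into an ordered list of actions.

    Consecutive repeats are merged, so the result records only which
    actions occur and in what order, not when they start or end.
    """
    return [label for label, _ in groupby(frame_labels)]


def transition_points(frame_labels):
    """Return the frame indices at which the action label changes."""
    return [
        i
        for i in range(1, len(frame_labels))
        if frame_labels[i] != frame_labels[i - 1]
    ]


# Hypothetical example: one label per video frame (dense supervision).
frame_labels = ["pour", "pour", "pour", "stir", "stir", "pour"]

# Transcript supervision keeps only the ordered actions.
print(frames_to_transcript(frame_labels))  # ['pour', 'stir', 'pour']

# The boundaries are what a transcript-supervised model must infer.
print(transition_points(frame_labels))  # [3, 5]
```

Under transcript supervision only the first output is available at training time; the frame indices in the second output are not given and have to be inferred.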

arxiv cv segmentation transformers

