Motion Guided Token Compression for Efficient Masked Video Modeling

March 1, 2024, 5:46 a.m. | Yukun Feng, Yangming Shi, Fengze Liu, Tan Yan

cs.CV updates on arXiv.org

arXiv:2402.18577v1 Announce Type: new
Abstract: Recent developments in Transformers have achieved notable strides in enhancing video comprehension. Nonetheless, the $O(N^2)$ computational complexity of attention mechanisms presents substantial hurdles when dealing with the high dimensionality of videos. This challenge becomes particularly pronounced when increasing the frames per second (FPS) to improve motion capture. Doing so is likely to introduce redundancy and exacerbate the existing computational limitations. In this paper, we begin by showcasing the …
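
To make the quadratic cost in the abstract concrete, here is a minimal sketch of how the token count N, and with it the number of query-key pairs in full self-attention, grows as more frames are sampled. The tokenizer settings (224x224 frames, 16x16 patches, 2-frame tubelets) and the function names are illustrative assumptions, not values taken from the paper.

```python
def num_tokens(frames, height=224, width=224, patch=16, tubelet=2):
    """Token count N for a clip cut into tubelet x patch x patch cubes.

    All default sizes are assumed for illustration only.
    """
    return (frames // tubelet) * (height // patch) * (width // patch)


def attention_pairs(n_tokens):
    """Query-key pairs scored by one full self-attention layer: N^2."""
    return n_tokens * n_tokens


if __name__ == "__main__":
    # Doubling the number of sampled frames doubles N and quadruples N^2.
    for frames in (16, 32, 64):
        n = num_tokens(frames)
        print(f"{frames} frames -> {n} tokens, {attention_pairs(n):,} query-key pairs")
```

Under these assumptions, sampling twice as many frames doubles the token count but quadruples the attention cost, which is why denser frame sampling is costly.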

