Web: http://arxiv.org/abs/2201.10168

Jan. 26, 2022, 2:10 a.m. | Sangmin Woo, Jinyoung Park, Inyong Koo, Sumin Lee, Minki Jeong, Changick Kim

cs.CV updates on arXiv.org

We present a new paradigm named explore-and-match for video grounding, which
aims to seamlessly unify two streams of video grounding methods: proposal-based
and proposal-free. To achieve this goal, we formulate video grounding as a set
prediction problem and design an end-to-end trainable Video Grounding
Transformer (VidGTR) that can utilize the architectural strengths of rich
contextualization and parallel decoding for set prediction. The overall
training is balanced by two key losses that play different roles, namely span
localization loss and set …
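
As a rough illustration of what treating video grounding as set prediction can involve, the sketch below pairs a set of predicted time spans with ground-truth spans one-to-one by maximizing total temporal IoU. The excerpt above does not say how VidGTR performs matching or how its losses are defined, so everything here is an assumption for illustration: the function names `temporal_iou` and `match_spans`, the IoU-based score, and the brute-force search are all hypothetical, and the code assumes there are at least as many predictions as targets.

```python
from itertools import permutations


def temporal_iou(a, b):
    """Intersection-over-union of two (start, end) spans, in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def match_spans(predicted, targets):
    """One-to-one assignment of predictions to targets (illustrative only).

    Tries every ordered choice of len(targets) predictions and keeps the
    one with the highest summed IoU. Brute force is fine for a handful of
    spans; it is an assumption here, not the method from the paper.
    Returns a list of (prediction_index, target_index) pairs.
    """
    best_score, best_pairs = -1.0, []
    for perm in permutations(range(len(predicted)), len(targets)):
        score = sum(
            temporal_iou(predicted[p], targets[t]) for t, p in enumerate(perm)
        )
        if score > best_score:
            best_score = score
            best_pairs = [(p, t) for t, p in enumerate(perm)]
    return best_pairs


# Three candidate spans predicted in parallel, two annotated moments.
predicted = [(0.0, 5.0), (12.0, 20.0), (30.0, 42.0)]
targets = [(11.0, 19.0), (29.0, 40.0)]
pairs = match_spans(predicted, targets)
```

With these made-up spans, the second and third predictions are paired with the two targets and the first prediction is left unmatched; in a set-prediction setup, unmatched predictions are usually treated as background.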

