Jan. 17, 2022, 2:10 a.m. | Daizong Liu, Xiaoye Qu, Yinzhen Wang, Xing Di, Kai Zou, Yu Cheng, Zichuan Xu, Pan Zhou

cs.LG updates on arXiv.org arxiv.org

Temporal video grounding (TVG) aims to localize a target segment in a video
according to a given sentence query. Although existing methods have achieved
decent results on this task, they rely heavily on abundant paired video-query
annotations, which are expensive and time-consuming to collect in real-world scenarios.
In this paper, we explore whether a video grounding model can be learned
without any paired annotations. To the best of our knowledge, this is the
first work to address TVG …
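For context on the task the abstract describes: a TVG model maps a (video, sentence query) pair to a predicted temporal segment, and predictions are conventionally scored by temporal IoU against the ground-truth segment. A minimal sketch of that metric (the function name and sample values are illustrative, not from the paper):

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments, in seconds.

    pred, gt: (start, end) tuples with start <= end.
    Returns intersection-over-union in [0, 1].
    """
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = max(pred[1], gt[1]) - min(pred[0], gt[0])
    return inter / union if union > 0 else 0.0

# Example: predicted segment [5s, 12s] vs. ground truth [6s, 14s]:
# overlap is 6s over a 9s span, so tIoU = 2/3.
print(temporal_iou((5.0, 12.0), (6.0, 14.0)))
```

Grounding accuracy is then commonly reported as Recall@1 at tIoU thresholds such as 0.5 or 0.7.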
