Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding
March 19, 2024, 4:49 a.m. | Chaolei Tan, Jianhuang Lai, Wei-Shi Zheng, Jian-Fang Hu
cs.CV updates on arXiv.org (arxiv.org)
Abstract: Video Paragraph Grounding (VPG) is an emerging task in video-language understanding, which aims at localizing multiple sentences with semantic relations and temporal order from an untrimmed video. However, existing VPG approaches rely heavily on a considerable number of temporal labels that are laborious and time-consuming to acquire. In this work, we introduce and explore Weakly-Supervised Video Paragraph Grounding (WSVPG) to eliminate the need for temporal annotations. Different from previous weakly-supervised grounding frameworks based on …