all AI news
Correlation-guided Query-Dependency Calibration in Video Representation Learning for Temporal Grounding
April 2, 2024, 7:49 p.m. | WonJun Moon, Sangeek Hyun, SuBeen Lee, Jae-Pil Heo
cs.CV updates on arXiv.org arxiv.org
Abstract: Video Temporal Grounding is to identify specific moments or highlights from a video corresponding to textual descriptions. Typical approaches in temporal grounding treat all video clips equally during the encoding process regardless of their semantic relevance with the text query. Therefore, we propose Correlation-Guided DEtection TRansformer(CG-DETR), exploring to provide clues for query-associated video clips within the cross-modal attention. First, we design an adaptive cross-attention with dummy tokens. Dummy tokens conditioned by text query take portions …
abstract arxiv correlation cs.cv detection encoding highlights identify moments process query representation representation learning semantic temporal text textual type video
More from arxiv.org / cs.CV updates on arXiv.org
Jobs in AI, ML, Big Data
Software Engineer for AI Training Data (School Specific)
@ G2i Inc | Remote
Software Engineer for AI Training Data (Python)
@ G2i Inc | Remote
Software Engineer for AI Training Data (Tier 2)
@ G2i Inc | Remote
Data Engineer
@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania
Artificial Intelligence – Bioinformatic Expert
@ University of Texas Medical Branch | Galveston, TX
Lead Developer (AI)
@ Cere Network | San Francisco, US