Boosting Video-Text Retrieval with Explicit High-Level Semantics. (arXiv:2208.04215v2 [cs.CV] UPDATED)

Aug. 10, 2022, 1:12 a.m. | Haoran Wang, Di Xu, Dongliang He, Fu Li, Zhong Ji, Jungong Han, Errui Ding

cs.CV updates on arXiv.org

Video-text retrieval (VTR) is an attractive yet challenging task for
multi-modal understanding, which aims to retrieve the relevant video (text)
given a text (video) query. Existing methods typically employ completely
heterogeneous visual-textual information to align video and text, while
lacking awareness of the homogeneous high-level semantic information residing
in both modalities. To fill this gap, we propose a novel
visual-linguistic alignment model named HiSE for VTR, which improves the
cross-modal representation by incorporating explicit high-level semantics.
First, we …
