Bear the Query in Mind: Visual Grounding with Query-conditioned Convolution. (arXiv:2206.09114v2 [cs.CV] UPDATED)
Web: http://arxiv.org/abs/2206.09114
June 23, 2022, 1:11 a.m. | Chonghan Chen, Qi Jiang, Chih-Hao Wang, Noel Chen, Haohan Wang, Xiang Li, Bhiksha Raj
cs.LG updates on arXiv.org arxiv.org
Visual grounding is a task that aims to locate a target object according to a
natural language expression. As a multi-modal task, feature interaction between
textual and visual inputs is vital. However, previous solutions mainly handle
each modality independently before fusing them together, which does not take
full advantage of relevant textual information while extracting visual
features. To better leverage the textual-visual relationship in visual
grounding, we propose a Query-conditioned Convolution Module (QCM) that
extracts query-aware visual features by incorporating …
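
As a rough illustration of what "query-conditioned convolution" in the title can mean, the sketch below derives the weights of a 1x1 convolution from a query embedding and applies them to a visual feature map, so the same image features are filtered differently for different queries. This is a generic sketch, not the paper's QCM: the excerpt above is cut off before it describes the module, and the function names, the linear projection `proj`, and the restriction to 1x1 kernels are all assumptions made here for illustration.

```python
# Illustrative sketch only; not the QCM implementation from the paper.
# Assumption: conv weights come from a hypothetical linear projection of the query.

def kernel_from_query(query, proj):
    """Map a query embedding (length D) to 1x1 conv weights of shape [C_out][C_in].

    proj[o][c] is a length-D weight vector (hypothetical learned parameters).
    """
    return [
        [sum(w * q for w, q in zip(weights, query)) for weights in row]
        for row in proj
    ]


def conv1x1(feature_map, kernel):
    """Apply a 1x1 convolution to feature_map[c][y][x]; returns out[o][y][x]."""
    c_in = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])
    return [
        [
            [
                sum(kernel[o][c] * feature_map[c][y][x] for c in range(c_in))
                for x in range(width)
            ]
            for y in range(height)
        ]
        for o in range(len(kernel))
    ]


def query_conditioned_conv(feature_map, query, proj):
    """Filter visual features with a kernel generated from the query embedding."""
    return conv1x1(feature_map, kernel_from_query(query, proj))


# Toy data: 2 input channels, a 1x2 spatial grid, 1 output channel, query dim 2.
features = [[[1.0, 2.0]], [[3.0, 4.0]]]
proj = [[[1.0, 0.0], [0.0, 1.0]]]

out_a = query_conditioned_conv(features, [1.0, 0.0], proj)  # kernel weights channel 0
out_b = query_conditioned_conv(features, [0.0, 1.0], proj)  # kernel weights channel 1
```

With these toy values, the first query yields a kernel that passes only channel 0 and the second a kernel that passes only channel 1, which is the sense in which the extracted features depend on the query. A practical module would use learned projections, larger kernels, and a tensor library rather than nested lists.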