Object-aware Video-language Pre-training for Retrieval. (arXiv:2112.00656v6 [cs.CV] UPDATED)
May 19, 2022, 1:11 a.m. | Alex Jinpeng Wang, Yixiao Ge, Guanyu Cai, Rui Yan, Xudong Lin, Ying Shan, Xiaohu Qie, Mike Zheng Shou
cs.CL updates on arXiv.org arxiv.org
Recently, with the introduction of large-scale datasets and strong transformer networks,
video-language pre-training has shown great success, especially for retrieval.
Yet existing video-language transformer models do not explicitly perform fine-grained
semantic alignment. In this work, we present Object-aware Transformers, an
object-centric approach that extends video-language transformers to incorporate
object representations. The key idea is to leverage the bounding boxes and
object tags to guide the training process. We evaluate our model on three
standard sub-tasks of video-text matching on four widely used benchmarks. …