all AI news
MERLOT Reserve: Neural Script Knowledge through Vision and Language and Sound. (arXiv:2201.02639v4 [cs.CV] UPDATED)
May 16, 2022, 1:11 a.m. | Rowan Zellers, Jiasen Lu, Ximing Lu, Youngjae Yu, Yanpeng Zhao, Mohammadreza Salehi, Aditya Kusupati, Jack Hessel, Ali Farhadi, Yejin Choi
cs.LG updates on arXiv.org arxiv.org
As humans, we navigate a multimodal world, building a holistic understanding
from all our senses. We introduce MERLOT Reserve, a model that represents
videos jointly over time -- through a new training objective that learns from
audio, subtitles, and video frames. Given a video, we replace snippets of text
and audio with a MASK token; the model learns by choosing the correct
masked-out snippet. Our objective learns faster than alternatives, and performs
well at scale: we pretrain on 20 million …
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Senior ML Researcher - 3D Geometry Processing | 3D Shape Generation | 3D Mesh Data
@ Promaton | Europe
Data Architect
@ Western Digital | San Jose, CA, United States
Senior Data Scientist GenAI (m/w/d)
@ Deutsche Telekom | Bonn, Deutschland
Senior Data Engineer, Telco (Remote)
@ Lightci | Toronto, Ontario
Consultant Data Architect/Engineer H/F - Innovative Tech
@ Devoteam | Lyon, France
(Senior) ML Engineer / Software Engineer Machine Learning & AI (m/f/x) onsite or remote (in Germany or Austria)
@ Scalable GmbH | Wien, Germany