all AI news
BEVT: BERT Pretraining of Video Transformers. (arXiv:2112.01529v2 [cs.CV] UPDATED)
Jan. 24, 2022, 2:11 a.m. | Rui Wang, Dongdong Chen, Zuxuan Wu, Yinpeng Chen, Xiyang Dai, Mengchen Liu, Yu-Gang Jiang, Luowei Zhou, Lu Yuan
cs.LG updates on arXiv.org arxiv.org
This paper studies the BERT pretraining of video transformers. It is a
straightforward but worth-studying extension given the recent success from BERT
pretraining of image transformers. We introduce BEVT which decouples video
representation learning into spatial representation learning and temporal
dynamics learning. In particular, BEVT first performs masked image modeling on
image data, and then conducts masked image modeling jointly with masked video
modeling on video data. This design is motivated by two observations: 1)
transformers learned on image datasets …
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Senior ML Researcher - 3D Geometry Processing | 3D Shape Generation | 3D Mesh Data
@ Promaton | Europe
Data Architect
@ Western Digital | San Jose, CA, United States
Senior Data Scientist GenAI (m/w/d)
@ Deutsche Telekom | Bonn, Deutschland
Senior Data Engineer, Telco (Remote)
@ Lightci | Toronto, Ontario
Consultant Data Architect/Engineer H/F - Innovative Tech
@ Devoteam | Lyon, France
(Senior) ML Engineer / Software Engineer Machine Learning & AI (m/f/x) onsite or remote (in Germany or Austria)
@ Scalable GmbH | Wien, Germany