[N] Vision-Language Pre-training: Basics, Recent Advances, and Future Trends - Microsoft 2022 - 102 Pages!
Nov. 12, 2022, 11:55 p.m. | /u/Singularian2501
Machine Learning www.reddit.com
Abstract:
>This paper surveys vision-language pre-training (VLP) methods for **multimodal intelligence** that have been developed in the last few years. We group these approaches into three categories: (*i*) VLP for image-text tasks, such as image captioning, image-text retrieval, visual question answering, and visual grounding; (*ii*) VLP for core computer vision tasks, such as (open-set) image classification, object detection, and segmentation; and (*iii*) VLP for video-text tasks, such as video captioning, video-text retrieval, and video question answering. For each …