Unifying image-caption and image-classification datasets with prefix conditioning

June 27, 2023, 9:19 p.m. | Google AI

Google AI Blog ai.googleblog.com

Posted by Kuniaki Saito, Student Researcher, Cloud AI Team, and Kihyuk Sohn, Research Scientist, Perception Team

Pre-training visual language (VL) models on web-scale image-caption datasets has recently emerged as a powerful alternative to traditional pre-training on image classification data. Image-caption datasets are considered more “open-domain” because they cover a broader range of scene types and vocabulary, which yields models with strong performance on few- and zero-shot recognition tasks. However, images with fine-grained class descriptions can be rare, and …


