Web: https://www.reddit.com/r/MachineLearning/comments/uj06uc/r_scaled_up_cliplike_model_2b_shows_86_zeroshot/

May 5, 2022, 3:38 p.m. | /u/Competitive-Rub-1958

Machine Learning reddit.com

Paper: [https://arxiv.org/pdf/2205.01917.pdf](https://arxiv.org/pdf/2205.01917.pdf)


[Impressive performance on diverse datasets may indicate higher generalizability :) \[Without "task-specific" customizations\]](https://preview.redd.it/e7z9vn93gox81.png?width=649&format=png&auto=webp&s=7c2ef9afbb6608b6ebd3ba164d107adb8e7dcbd8)

The paper confirms that multi-modal models can scale beyond single-digit billions of parameters (who would've thought). It scales up a simple CLIP-like model and shows substantial improvements, especially in the zero-shot setting. Simple contrastive learning looks more and more promising for multi-modal objectives...
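
For anyone unfamiliar with what "CLIP-like" and "zero-shot" mean in practice, here is a minimal NumPy sketch of a symmetric contrastive loss and of zero-shot classification by nearest text embedding. It is not the linked paper's implementation: the function names, the temperature value of 0.07, and the toy embeddings are all assumptions made for illustration.

```python
import numpy as np

def _log_softmax(x, axis):
    # Numerically stable log-softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=axis, keepdims=True))

def _l2_normalize(x):
    # Scale each row to unit length so dot products are cosine similarities.
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def clip_style_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Row i of image_emb is assumed to match row i of text_emb.
    The temperature is an illustrative value, not taken from the paper.
    """
    logits = _l2_normalize(image_emb) @ _l2_normalize(text_emb).T / temperature
    idx = np.arange(logits.shape[0])
    # Image-to-text: each image should pick its own caption among all captions.
    loss_i2t = -_log_softmax(logits, axis=1)[idx, idx].mean()
    # Text-to-image: each caption should pick its own image among all images.
    loss_t2i = -_log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2t + loss_t2i)

def zero_shot_predict(image_emb, class_text_emb):
    """Assign each image the class whose text embedding is most similar."""
    sims = _l2_normalize(image_emb) @ _l2_normalize(class_text_emb).T
    return sims.argmax(axis=1)

# Toy check with hand-made embeddings (hypothetical data).
aligned = clip_style_loss(np.eye(4), np.eye(4))
shuffled = clip_style_loss(np.eye(4), np.roll(np.eye(4), 1, axis=0))
print(aligned < shuffled)
print(zero_shot_predict(np.eye(3), np.eye(3)))
```

The zero-shot part is just the second function: class names are embedded as text once, and no classifier is trained on the target dataset.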
