Aug. 26, 2023, 3:31 p.m. | Sascha Kirch

Towards Data Science (towardsdatascience.com)

Paper Summary: Learning Transferable Visual Models From Natural Language Supervision

In this article we walk through the paper behind CLIP (Contrastive Language-Image Pre-Training). We extract the key concepts and break them down to make them easy to understand. In addition, images and data plots are annotated to clear up common points of confusion.
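To set the stage before diving into the paper: CLIP jointly trains an image encoder and a text encoder so that embeddings of matching image-text pairs score high and mismatched pairs score low, using a symmetric cross-entropy over the pairwise similarity matrix. The following is a minimal PyTorch sketch of that contrastive objective, loosely following the pseudocode in the paper; the function name and the fixed temperature value are illustrative assumptions (CLIP learns the temperature as a parameter).

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_features, text_features, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    image_features, text_features: (batch, dim) tensors from the two
    encoders. Pairs sharing a batch index are positives; every other
    combination in the batch serves as a negative.
    """
    # L2-normalize so the dot product becomes a cosine similarity
    image_features = F.normalize(image_features, dim=-1)
    text_features = F.normalize(text_features, dim=-1)

    # (batch, batch) similarity matrix, scaled by the temperature
    logits = image_features @ text_features.t() / temperature

    # The correct pairings lie on the diagonal
    targets = torch.arange(logits.size(0), device=logits.device)

    # Cross-entropy in both directions: image-to-text and text-to-image
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return (loss_i2t + loss_t2i) / 2
```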

Paper: Learning Transferable Visual Models From Natural Language Supervision
Code: https://github.com/OpenAI/CLIP
First Published: 26 Feb. 2021
Authors: Alec Radford, Jong Wook Kim …

Tags: deep-learning, foundation-models, multimodal, paper-review, unsupervised-learning
