Oct. 5, 2023, 4:02 p.m. | Nieves Crasto

Towards AI - Medium (pub.towardsai.net)

Notes on CLIP: Connecting Text and Images

Radford, Alec, et al. "Learning Transferable Visual Models from Natural Language Supervision." International Conference on Machine Learning. PMLR, 2021.

The authors of the above paper aim to learn image representations (features) that can be reused across a variety of downstream tasks with minimal or no task-specific supervision.
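One way such representations are used without task-specific training is zero-shot classification, where class names are turned into text prompts and each image is matched against them. Below is a minimal sketch of that workflow using the Hugging Face transformers implementation of CLIP; the checkpoint name, prompts, and image path are illustrative assumptions, not details taken from the article.

```python
# Minimal sketch: zero-shot image classification with CLIP image/text embeddings.
# Assumes the Hugging Face `transformers` CLIP implementation; the checkpoint,
# prompts, and image path are illustrative, not from the article.
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("example.jpg")  # hypothetical input image
labels = ["a photo of a dog", "a photo of a cat", "a photo of a car"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Image-text similarity scores, converted into probabilities over the prompts.
probs = outputs.logits_per_image.softmax(dim=-1)
print({label: float(p) for label, p in zip(labels, probs[0])})
```

No labeled training data for the three classes is needed; the text prompts alone define the classifier.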

Limitations of supervised learning

Off-the-shelf features generated by image classification models have been used for other tasks such as image retrieval. However, these features do not …
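For context, "off-the-shelf features" here means reusing a pretrained classifier as a generic feature extractor. A minimal sketch of that retrieval setup follows, assuming a torchvision ResNet-50 backbone and hypothetical file paths (neither is specified in the article).

```python
# Minimal sketch: image retrieval with off-the-shelf classifier features.
# Assumes a torchvision ResNet-50 pretrained on ImageNet; paths are illustrative.
import torch
import torch.nn.functional as F
from PIL import Image
from torchvision import models, transforms

weights = models.ResNet50_Weights.IMAGENET1K_V2
backbone = models.resnet50(weights=weights)
backbone.fc = torch.nn.Identity()  # drop the classification head, keep 2048-d features
backbone.eval()

preprocess = weights.transforms()

def embed(path):
    # Encode one image into a unit-norm feature vector.
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return F.normalize(backbone(x), dim=-1)

query = embed("query.jpg")  # hypothetical query image
gallery = {p: embed(p) for p in ["a.jpg", "b.jpg", "c.jpg"]}  # hypothetical gallery

# Rank gallery images by cosine similarity to the query.
ranked = sorted(gallery, key=lambda p: float(query @ gallery[p].T), reverse=True)
print(ranked)
```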

