March 20, 2024, 4:43 a.m. | Pratyush Maini, Sachin Goyal, Zachary C. Lipton, J. Zico Kolter, Aditi Raghunathan

cs.LG updates on arXiv.org

arXiv:2307.03132v2 Announce Type: replace-cross
Abstract: Large web-sourced multimodal datasets have powered a slew of new methods for learning general-purpose visual representations, advancing the state of the art in computer vision and revolutionizing zero- and few-shot recognition. One crucial decision facing practitioners is how, if at all, to curate these ever-larger datasets. For example, the creators of the LAION-5B dataset chose to retain only image-caption pairs whose CLIP similarity score exceeded a designated threshold. In this paper, we propose a new …
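As a rough illustration of the CLIP-score curation the abstract describes for LAION-5B, the sketch below computes image-caption cosine similarity with a pretrained CLIP model and keeps only pairs above a threshold. The checkpoint name, the threshold value, and the helper functions are illustrative assumptions, not the exact LAION-5B pipeline.

```python
# Minimal sketch of CLIP-score filtering: retain only image-caption pairs
# whose CLIP cosine similarity exceeds a threshold (assumed here to be 0.28;
# the actual LAION-5B cutoff and model may differ).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

MODEL_NAME = "openai/clip-vit-base-patch32"  # assumed checkpoint for illustration
model = CLIPModel.from_pretrained(MODEL_NAME)
processor = CLIPProcessor.from_pretrained(MODEL_NAME)

def clip_similarity(image: Image.Image, caption: str) -> float:
    """Cosine similarity between the CLIP image and text embeddings."""
    inputs = processor(text=[caption], images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    img_emb = outputs.image_embeds / outputs.image_embeds.norm(dim=-1, keepdim=True)
    txt_emb = outputs.text_embeds / outputs.text_embeds.norm(dim=-1, keepdim=True)
    return float((img_emb * txt_emb).sum())

def filter_pairs(pairs, threshold: float = 0.28):
    """Keep only (image, caption) pairs scoring at or above the threshold."""
    return [(img, cap) for img, cap in pairs if clip_similarity(img, cap) >= threshold]
```

This is the kind of similarity-threshold filtering the paper takes as its starting point when asking how such web-scale datasets should be curated.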

