[R] No “Zero-Shot” Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
April 9, 2024, 4:27 a.m. | /u/quequero
Machine Learning www.reddit.com
>Web-crawled pretraining datasets underlie the impressive "zero-shot" evaluation performance of multimodal models, such as CLIP for classification/retrieval and Stable-Diffusion for image generation. However, it is unclear how meaningful the notion of "zero-shot" generalization is for such multimodal models, as it is not known to what extent their pretraining datasets encompass the downstream concepts targeted for during "zero-shot" evaluation. In this work, we ask: How is the performance of multimodal models on downstream concepts influenced by the frequency of these …
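The paper's headline finding is that "zero-shot" accuracy on a downstream concept tracks how often that concept appears in the pretraining captions, following a roughly log-linear trend (so a linear gain in accuracy needs exponentially more data). A minimal sketch of the frequency-counting side of that analysis, using entirely made-up captions and concepts (nothing here comes from the paper's actual data pipeline):

```python
import math
from collections import Counter

# Hypothetical toy pretraining captions and downstream concepts.
captions = [
    "a photo of a dog playing fetch",
    "a dog sleeping on a couch",
    "a cat on a windowsill",
    "an axolotl in an aquarium",
    "a dog and a cat together",
]
concepts = ["dog", "cat", "axolotl"]

# Estimate each concept's pretraining frequency by naive token matching
# (the paper uses far more careful concept extraction; this is a stand-in).
freq = Counter()
for cap in captions:
    tokens = set(cap.split())
    for concept in concepts:
        if concept in tokens:
            freq[concept] += 1

# The log-linear claim: accuracy grows roughly linearly in log(frequency),
# so plotting accuracy against log-frequency should look like a line.
for concept in concepts:
    print(concept, freq[concept], round(math.log(freq[concept] + 1), 2))
```

In a real replication, `freq` would be computed over billions of web-crawled caption–image pairs, and the accuracies would come from evaluating CLIP or Stable-Diffusion on held-out benchmarks per concept.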