May 19, 2022, 2:49 p.m. | /u/Emergency_Apricot_77

Machine Learning www.reddit.com

A lot of recent progress in AI was made on proprietary datasets, e.g. ViT used JFT-300M, and both the DALL-E and DALL-E 2 papers used proprietary text-image datasets. While the results are truly exceptional, the "proprietary" nature of the datasets keeps bothering me and sometimes makes me question the actual robustness of these models. Every now and then I have the following questions about these models:

1. For pure image models (say ViT), are we sure that the proprietary …
