Aug. 24, 2023, 4 a.m. | /u/AIsupercharged

Machine Learning www.reddit.com

The Allen Institute for AI (AI2) has released Dolma, a large new text dataset that is free to use and open to inspection. It is intended as the opposite of the closely guarded datasets that companies such as OpenAI and Meta use to train their language models. AI2 aims to reverse that trend toward secrecy by making the data used to build language models available to the AI research community.
