Jan. 21, 2022, 5:57 a.m. | Matan Weksler

Towards AI - Medium (pub.towardsai.net)

Artificial Intelligence

Last month, DeepMind published its new NLP model, RETRO (Retrieval-Enhanced Transformer), which, according to the paper, is a leap forward for NLP in several respects. Most notably, the model achieves results comparable to SOTA architectures (e.g., GPT-3) while being roughly 25× smaller: only 7.5B parameters, compared to the 178B parameters of AI21's Jurassic-1.

This challenges the common presumption that bigger models necessarily mean better accuracy.

The main advantage of smaller models …
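To make the headline idea concrete, here is a minimal, hypothetical sketch of retrieval enhancement: rather than storing all world knowledge in its weights, the model looks up nearest-neighbour text chunks in an external database and conditions its predictions on them. The `embed`, `database`, and `retrieve` names below are illustrative assumptions, not DeepMind's code; RETRO's actual pipeline uses frozen BERT embeddings over a roughly 2-trillion-token database and feeds the retrieved chunks into the transformer via chunked cross-attention.

```python
# Illustrative sketch only -- NOT DeepMind's implementation.
# The point: a small model plus an external datastore can stand in for
# knowledge that a large model would otherwise memorise in its weights.
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Stand-in for a frozen embedding model (RETRO uses frozen BERT)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)  # unit-normalise so dot product = cosine sim

# Hypothetical external datastore of text chunks
# (RETRO's is on the order of 2 trillion tokens).
database = [
    "RETRO retrieves chunks from a large text database.",
    "GPT-3 stores world knowledge in 175B parameters.",
    "Retrieval lets a small model access external knowledge.",
]
db_vectors = np.stack([embed(chunk) for chunk in database])

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k nearest-neighbour chunks for a query."""
    sims = db_vectors @ embed(query)       # cosine similarity (unit vectors)
    top = np.argsort(sims)[::-1][:k]       # indices of the k best matches
    return [database[i] for i in top]

# A language model would then attend to these retrieved chunks
# (RETRO does this with chunked cross-attention) instead of having
# to memorise the underlying facts in its own parameters.
print(retrieve("How can small models match large ones?"))
```

Because the retrieval database can be grown or updated independently of the model, the parameter count no longer has to scale with the amount of knowledge the system can draw on.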

deep learning, deepmind, nlp, retrieval, transformers
