June 10, 2022, 4 p.m. | /u/No_Coffee_4638

machinelearningnews www.reddit.com

Pre-trained sequence-to-sequence (seq2seq) models such as BART and T5 have performed very well on a wide range of natural language processing tasks, including text summarization, machine translation, question answering, and information extraction. But these large-scale pre-trained language models carry hundreds of millions of parameters or more: BART was trained with roughly 400 million parameters, while T5 pushed the limit to 11 billion parameters.
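As a rough illustration of the scale quoted above, the parameter count of a public checkpoint can be verified directly. The sketch below assumes the Hugging Face transformers library and the facebook/bart-large checkpoint; it is not part of the original post.

```python
# Minimal sketch: count the parameters of a pre-trained seq2seq checkpoint.
# Assumes the Hugging Face `transformers` library and the public
# "facebook/bart-large" checkpoint (~400M parameters); a T5 checkpoint such
# as "t5-11b" sits at the 11B-parameter end of the scale.
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large")
num_params = sum(p.numel() for p in model.parameters())
print(f"facebook/bart-large: {num_params / 1e6:.0f}M parameters")
```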

👉 Empirical results show that, …

ai amazon bart compression machinelearningnews researchers
