March 29, 2022, 2:20 p.m. | Synced


In the new paper Token Dropping for Efficient BERT Pretraining, a research team from Google, New York University, and the University of Maryland proposes a simple but effective "token dropping" technique that cuts the pretraining cost of transformer models such as BERT by roughly 25 percent without hurting performance on downstream fine-tuning tasks.
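At a high level, the idea is to skip computation for less important tokens in the middle transformer layers while still training on every position. The sketch below illustrates that general pattern only; it assumes a hypothetical per-token importance score (for example, a running estimate of each token's masked-language-modeling loss) and a simple keep ratio, and it is not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class TokenDroppingEncoder(nn.Module):
    """Illustrative sketch of token dropping: early layers see all tokens,
    the middle layers process only the most 'important' tokens, and the
    dropped tokens are restored before the final layers so the full
    sequence is still trained."""

    def __init__(self, layers, keep_ratio=0.5, n_full_head=2, n_full_tail=1):
        super().__init__()
        self.layers = nn.ModuleList(layers)   # standard transformer blocks
        self.keep_ratio = keep_ratio          # fraction of tokens kept in middle layers
        self.n_full_head = n_full_head        # leading layers that see every token
        self.n_full_tail = n_full_tail        # trailing layers that see every token

    def forward(self, hidden, importance):
        # hidden:     (batch, seq_len, dim) token representations
        # importance: (batch, seq_len) hypothetical per-token score,
        #             e.g. a running MLM-loss estimate (assumption)
        B, L, D = hidden.shape
        n_keep = max(1, int(L * self.keep_ratio))

        # 1) Early layers process the full sequence.
        for layer in self.layers[: self.n_full_head]:
            hidden = layer(hidden)

        # 2) Middle layers process only the top-scoring tokens.
        keep_idx = importance.topk(n_keep, dim=1).indices            # (B, n_keep)
        gather_idx = keep_idx.unsqueeze(-1).expand(-1, -1, D)
        kept = hidden.gather(1, gather_idx)
        for layer in self.layers[self.n_full_head : len(self.layers) - self.n_full_tail]:
            kept = layer(kept)

        # 3) Scatter the updated tokens back (dropped tokens keep their
        #    early-layer states), then run the final layers on everything.
        hidden = hidden.scatter(1, gather_idx, kept)
        for layer in self.layers[len(self.layers) - self.n_full_tail :]:
            hidden = layer(hidden)
        return hidden
```

With keep_ratio=0.5, the attention and feed-forward work in the middle layers is done on roughly half the tokens, which is where the reported pretraining savings would come from in a setup like this.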


The post Google, NYU & Maryland U’s Token-Dropping Approach Reduces BERT Pretraining Time by 25% first appeared on Synced.
