Web: http://arxiv.org/abs/2201.12109

Jan. 31, 2022, 2:10 a.m. | Pan He, Yuxi Chen, Yan Wang, Yanru Zhang

cs.CL updates on arXiv.org

Recently, prompt tuning \cite{lester2021power} has gradually become a new
paradigm for NLP. It freezes the parameters of pre-trained language models
(PLMs) and relies only on the representations of the words, yet obtains
remarkable performance on downstream tasks. It stays consistent with the
Masked Language Model (MLM) \cite{devlin2018bert} task used in pre-training,
and avoids some issues that may happen during fine-tuning.
Naturally, we consider that the "[MASK]" tokens carry more useful information
than other tokens because the model combines …

