[D] Confusion about masking in BERT model
Jan. 27, 2022, 7:51 p.m. | /u/mrtac96
Machine Learning www.reddit.com
I am trying to understand the masking in the BERT model.
I am confused by the following lines taken from the paper:
The training data generator chooses 15% of the token positions at random for prediction. If the i-th token is chosen, we replace the i-th token with (1) the [MASK] token 80% of the time (2) a random token 10% of the time (3) the unchanged i-th token 10% of the time
At point 3 it says unchanged token (I think it …
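
To make the quoted procedure concrete, here is a minimal sketch of the 80/10/10 rule in plain Python. It is not the paper's actual data generator: the function name `mask_tokens`, the toy vocabulary, and the choice to select exactly `round(0.15 * n)` positions are assumptions made for illustration. The point it tries to show is that a position falling under case (3) keeps its original token in the input but is still recorded as a prediction target.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, rng, select_prob=0.15):
    """Hypothetical helper: return (corrupted_tokens, labels).

    labels[i] holds the original token for every selected position and
    None for positions that are not prediction targets.
    """
    # Assumption: select a fixed count of positions (at least one).
    n_select = max(1, round(len(tokens) * select_prob))
    chosen = set(rng.sample(range(len(tokens)), n_select))

    corrupted = list(tokens)
    labels = [None] * len(tokens)
    for i in chosen:
        labels[i] = tokens[i]  # every chosen position is a prediction target
        r = rng.random()
        if r < 0.8:
            corrupted[i] = MASK               # (1) [MASK] 80% of the time
        elif r < 0.9:
            corrupted[i] = rng.choice(vocab)  # (2) random token 10% of the time
        # (3) remaining 10%: input token left unchanged, label still set
    return corrupted, labels

if __name__ == "__main__":
    rng = random.Random(0)
    tokens = "my dog is hairy and likes to play in the park every day".split()
    vocab = sorted(set(tokens)) + ["apple", "run"]
    corrupted, labels = mask_tokens(tokens, vocab, rng)
    for original, seen, label in zip(tokens, corrupted, labels):
        print(original, seen, label)
```

Note that in case (2) the random token may happen to equal the original one; the sketch does not try to exclude that.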