May 19, 2022, 1:11 a.m. | Aloka Fernando, Surangika Ranathunga

cs.CL updates on arXiv.org arxiv.org

Out-of-Vocabulary (OOV) is a problem for Neural Machine Translation (NMT).
OOV refers to words with a low occurrence in the training data, or to those
that are absent from the training data. To alleviate this, word or phrase-based
Data Augmentation (DA) techniques have been used. However, existing DA
techniques have addressed only one of these OOV types and limit to considering
either syntactic constraints or semantic constraints. We present a word and
phrase replacement-based DA technique that consider both types …

arxiv augmentation data machine machine translation neural machine translation translation

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Data Analyst (Commercial Excellence)

@ Allegro | Poznan, Warsaw, Poland

Senior Machine Learning Engineer

@ Motive | Pakistan - Remote

Summernaut Customer Facing Data Engineer

@ Celonis | Raleigh, US, North Carolina

Data Engineer Mumbai

@ Nielsen | Mumbai, India