June 4, 2024, 4:54 a.m. | Wonkee Lee, Seong-Hwan Heo, Jong-Hyeok Lee

cs.CL updates on arXiv.org

arXiv:2204.03896v2 Announce Type: replace
Abstract: Semi-supervised learning that leverages synthetic data for training has been widely adopted for developing automatic post-editing (APE) models due to the lack of training data. With this aim, we focus on data-synthesis methods to create high-quality synthetic data. Given that APE takes as input a machine-translation result that might include errors, we present a data-synthesis method by which the resulting synthetic data mimic the translation errors found in actual data. We introduce a noising-based data-synthesis …
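The abstract is cut off before it describes the authors' noising scheme, so the sketch below is not their method. It is only a generic illustration of what noising-based synthesis for APE can look like, assuming the usual (source, MT output, post-edit) triplet format: a clean reference translation is corrupted with random deletions, substitutions, and insertions to stand in for an erroneous MT output, and the untouched reference serves as the post-edit. The function names `add_noise` and `make_triplet`, the noise rate `p`, and the uniform sampling from `vocab` are all assumptions made for illustration.

```python
import random


def add_noise(reference_tokens, vocab, p=0.15, seed=None):
    """Return a noisy copy of reference_tokens.

    Hypothetical generic noiser (not the paper's method): each token is,
    with total probability p, deleted, substituted, or followed by an
    inserted token, each edit type being equally likely.
    """
    rng = random.Random(seed)
    noisy = []
    for tok in reference_tokens:
        r = rng.random()
        if r < p / 3:
            # Deletion: drop the token.
            continue
        elif r < 2 * p / 3:
            # Substitution: replace the token with a random vocabulary item.
            noisy.append(rng.choice(vocab))
        elif r < p:
            # Insertion: keep the token and add a random one after it.
            noisy.append(tok)
            noisy.append(rng.choice(vocab))
        else:
            # No edit: keep the token unchanged.
            noisy.append(tok)
    return noisy


def make_triplet(source, reference, vocab, p=0.15, seed=None):
    """Build one synthetic APE triplet from a parallel sentence pair."""
    mt_tokens = add_noise(reference.split(), vocab, p=p, seed=seed)
    return {"src": source, "mt": " ".join(mt_tokens), "pe": reference}


if __name__ == "__main__":
    vocab = ["the", "a", "house", "red", "is", "small"]
    triplet = make_triplet("Das Haus ist klein .", "The house is small .", vocab, seed=0)
    print(triplet)
```

Uniform random edits like these do not resemble real translation errors, which is the gap the abstract says the authors' method aims to close.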
