Sept. 12, 2022, 1:14 a.m. | Xiaofei Sun, Yufei Tian, Yuxian Meng, Nanyun Peng, Fei Wu, Jiwei Li, Chun Fan

cs.CL updates on arXiv.org

In this paper, we propose a new paradigm for paraphrase generation that treats
the task as unsupervised machine translation (UMT), based on the assumption
that a large-scale unlabeled monolingual corpus must contain pairs of sentences
expressing the same meaning. The proposed paradigm first splits a large
unlabeled corpus into multiple clusters and trains multiple UMT models using
pairs of these clusters. Then, based on the paraphrase pairs produced by these
UMT models, a unified surrogate model can be …
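To make the described pipeline concrete, here is a minimal sketch of the cluster-then-UMT idea: split the corpus into clusters, train a model between each pair of clusters, and pool the generated pairs that would later supervise a unified surrogate model. The TF-IDF + k-means clustering and the train_umt_model / build_paraphrase_pairs helpers are assumptions for illustration, not the paper's actual implementation.

# Minimal sketch, assuming TF-IDF + k-means for the clustering step.
# train_umt_model is a hypothetical placeholder for a real unsupervised
# MT system; here it pairs sentences naively so the sketch stays runnable.

from itertools import combinations
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans


def cluster_corpus(sentences, n_clusters=4):
    # Split the unlabeled monolingual corpus into clusters of sentences.
    vectors = TfidfVectorizer().fit_transform(sentences)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(vectors)
    clusters = [[] for _ in range(n_clusters)]
    for sentence, label in zip(sentences, labels):
        clusters[label].append(sentence)
    return clusters


def train_umt_model(cluster_a, cluster_b):
    # Placeholder for training an unsupervised MT model between two
    # clusters and returning the sentence pairs it generates.
    return list(zip(cluster_a, cluster_b))


def build_paraphrase_pairs(sentences, n_clusters=4):
    # Train one UMT model per pair of clusters and pool their outputs;
    # these pooled pairs would then train a single surrogate model.
    clusters = cluster_corpus(sentences, n_clusters)
    pairs = []
    for a, b in combinations(range(n_clusters), 2):
        pairs.extend(train_umt_model(clusters[a], clusters[b]))
    return pairs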

Tags: arxiv, machine translation, unsupervised
