Web: http://arxiv.org/abs/2201.11367

Jan. 28, 2022, 2:10 a.m. | Yihe Wang, Yitong Li, Yasheng Wang, Fei Mi, Pingyi Zhou, Xin Wang, Jin Liu, Qun Liu, Xin Jiang

cs.CL updates on arXiv.org arxiv.org

Real human conversation data are complicated, heterogeneous, and noisy, which
makes building open-domain dialogue systems from them a challenging task. Such
dialogue data nevertheless contain a wealth of information and knowledge that
has not been fully explored. In this paper, we show that existing open-domain
dialogue generation methods, which memorize context-response paired data with
causal or encoder-decoder language models, underutilize the training data.
Unlike current approaches that rely on external knowledge, we explore a
retrieval-generation training framework that can increase …
