Aug. 30, 2022, 1:13 a.m. | Qingyu Zhang, Xiaoyu Shen, Ernie Chang, Jidong Ge, Pengke Chen

cs.CL updates on arXiv.org

Owing to the lack of corpora for low-resource languages, current work on
dialogue generation has mainly focused on English. In this paper, we present
mDIA, the first large-scale multilingual benchmark for dialogue generation
across low- to high-resource languages. It covers real-life conversations in 46
languages across 19 language families. We present baseline results obtained by
fine-tuning the multilingual, non-dialogue-focused pre-trained model mT5 as
well as the English-centric, dialogue-focused pre-trained chatbot DialoGPT. The
results show that mT5-based models perform better on sacreBLEU …

