AuGPT: Auxiliary Tasks and Data Augmentation for End-To-End Dialogue with Pre-Trained Language Models. (arXiv:2102.05126v3 [cs.CL] UPDATED) | allainews.com

Jan. 17, 2022, 2:10 a.m. | Jonáš Kulhánek, Vojtěch Hudeček, Tomáš Nekvinda, Ondřej Dušek

cs.LG updates on arXiv.org arxiv.org

Attention-based pre-trained language models such as GPT-2 brought
considerable progress to end-to-end dialogue modelling. However, they also
present considerable risks for task-oriented dialogue, such as lack of
knowledge grounding or diversity. To address these issues, we introduce
modified training objectives for language model finetuning, and we employ
massive data augmentation via back-translation to increase the diversity of the
training data. We further examine the possibilities of combining data from
multiples sources to improve performance on the target dataset. We carefully …

arxiv augmentation data language language models

More from arxiv.org / cs.LG updates on arXiv.org

PPNet: A Two-Stage Neural Network for End-to-end Path Planning 12 hours ago | arxiv.org

abstract arxiv cs.ai cs.lg +14

Tenplex: Dynamic Parallelism for Deep Learning using Parallelizable Tensor Collections 12 hours ago | arxiv.org

abstract arxiv cs.ai cs.dc +16

From Reactive to Proactive Volatility Modeling with Hemisphere Neural Networks 12 hours ago | arxiv.org

abstract architecture arxiv context +23

DGR: Tackling Drifted and Correlated Noise in Quantum Error Correction via Decoding Graph Re-weighting 12 hours ago | arxiv.org

abstract applications arxiv cs.ar +18

A Single-Loop Algorithm for Decentralized Bilevel Optimization 12 hours ago | arxiv.org

abstract algorithm applications arxiv +13

Watch Out! Simple Horizontal Class Backdoors Can Trivially Evade Defenses 12 hours ago | arxiv.org

abstract arxiv attacks backdoor +13

Mixtures of Gaussians are Privately Learnable with a Polynomial Number of Samples 12 hours ago | arxiv.org

abstract alpha arxiv cs.cr +16

CLEANing Cygnus A deep and fast with R2D2 12 hours ago | arxiv.org

abstract arxiv astronomy astro-ph.im +17

Feature Imitating Networks Enhance The Performance, Reliability And Speed Of Deep Learning On Biomedical Image … 12 hours ago | arxiv.org

abstract arxiv biomedical cs.cv +21

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

View on ai-jobs.net

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

View on ai-jobs.net

Senior Engineer - Data Science Operations

@ causaLens | London - Hybrid, England, United Kingdom

View on ai-jobs.net

F0138 - LLM Developer (AI NLP)

@ Ubiquiti Inc. | Taipei

View on ai-jobs.net

Staff Engineer, Database

@ Nagarro | Gurugram, India

View on ai-jobs.net

Artificial Intelligence Assurance Analyst

@ Booz Allen Hamilton | USA, VA, McLean (8251 Greensboro Dr)

View on ai-jobs.net