April 9, 2024, 4:42 a.m. | Mohamed El Amine Seddik, Suei-Wen Chen, Soufiane Hayou, Pierre Youssef, Merouane Debbah

cs.LG updates on arXiv.org

arXiv:2404.05090v1 Announce Type: new
Abstract: The phenomenon of model collapse, introduced by Shumailov et al. (2023), refers to the deterioration in performance that occurs when new models are trained on synthetic data generated by previously trained models. This recursive training loop erodes the tails of the original distribution, so that future-generation models forget the initial (real) distribution. With the aim of rigorously understanding model collapse in language models, we consider in this paper a statistical model that allows …
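As a toy illustration of the tail-erosion mechanism the abstract describes (this is a minimal sketch, not the paper's statistical model), one can repeatedly fit a Gaussian to samples drawn from the previous generation's fit. On average the fitted spread drifts toward zero, so probability mass in the tails of the original distribution vanishes over generations:

```python
# Minimal sketch of model collapse under recursive training on synthetic data.
# Assumptions: a 1-D Gaussian "model" refit by maximum likelihood each round;
# the real paper studies a richer statistical model for language models.
import numpy as np

rng = np.random.default_rng(0)

mu, sigma = 0.0, 1.0   # generation 0: the real data distribution
n = 50                 # synthetic training-set size per generation (small,
                       # so sampling error compounds quickly)

for gen in range(1, 31):
    synthetic = rng.normal(mu, sigma, n)           # data from the previous model
    mu, sigma = synthetic.mean(), synthetic.std()  # fit the next-generation model
    if gen % 5 == 0:
        # sigma random-walks with a downward drift, shrinking the tails
        print(f"generation {gen:2d}: fitted sigma = {sigma:.3f}")
```

Each refit loses a little variance to finite-sample estimation error, and because every generation trains only on its predecessor's output, the loss compounds rather than averaging out.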
