Feb. 16, 2024, 5:44 a.m. | Jiachuan Wang, Shimin Di, Lei Chen, Charles Wang Wai Ng

cs.LG updates on arXiv.org

arXiv:2312.11560v2 Announce Type: replace
Abstract: Recently, emergence has received widespread attention from the research community alongside the success of large language models. Departing from the existing literature, we hypothesize a key factor that strongly promotes performance as scale increases: the reduction of monosemantic neurons, which can only form one-to-one correlations with specific features. Monosemantic neurons tend to be sparser and to hurt performance in large models. Inspired by this insight, we propose an intuitive …
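To make the notion of a "monosemantic neuron" concrete, here is a minimal illustrative sketch: it flags neurons whose activation correlates strongly with exactly one candidate feature. The Pearson-correlation criterion and the 0.8 threshold are assumptions for illustration only, not the procedure proposed in the paper.

```python
# Illustrative sketch (assumed criterion, not the authors' method): flag
# approximately "monosemantic" neurons, i.e. neurons whose activation has a
# strong absolute Pearson correlation with exactly one candidate feature.
import numpy as np

def monosemantic_mask(activations: np.ndarray,
                      features: np.ndarray,
                      threshold: float = 0.8) -> np.ndarray:
    """activations: (n_samples, n_neurons); features: (n_samples, n_features).
    Returns a boolean mask over neurons correlating strongly with exactly one feature."""
    # Standardize columns so a simple dot product yields Pearson correlations.
    a = (activations - activations.mean(0)) / (activations.std(0) + 1e-8)
    f = (features - features.mean(0)) / (features.std(0) + 1e-8)
    corr = np.abs(a.T @ f) / activations.shape[0]   # (n_neurons, n_features)
    strong = corr >= threshold
    return strong.sum(axis=1) == 1                  # exactly one strong match

# Toy example: 1000 samples, 64 neurons, 10 hand-labelled features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(1000, 10))
acts = rng.normal(size=(1000, 64))
acts[:, 0] = feats[:, 3]            # neuron 0 copies feature 3 -> monosemantic
print(monosemantic_mask(acts, feats).nonzero()[0])  # -> [0]
```

Under this toy criterion, only the neuron that tracks a single feature is flagged; the paper's contribution concerns how the prevalence of such neurons changes (and should be reduced) as model scale grows.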
