April 4, 2024, 4:42 a.m. | Sherin Muckatira, Vijeta Deshpande, Vladislav Lialin, Anna Rumshisky

cs.LG updates on arXiv.org

arXiv:2404.02204v1 Announce Type: cross
Abstract: Large language models can solve new tasks without task-specific fine-tuning. This ability, also known as in-context learning (ICL), is considered an emergent ability and is primarily seen in large language models with billions of parameters. This study investigates if such emergent properties are strictly tied to model size or can be demonstrated by smaller models trained on reduced-scale data. To explore this, we simplify pre-training data and pre-train 36 causal language models with parameters varying …
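Concretely, in-context learning means the model is given a few task demonstrations in its prompt and asked to complete a new instance, with no gradient updates. A minimal sketch of such an evaluation, using Hugging Face transformers with the off-the-shelf "gpt2" checkpoint and a toy sentiment task as stand-ins (the paper instead pre-trains its own 36 models on simplified data), might look like:

```python
# Minimal sketch of few-shot in-context learning (ICL) evaluation with a small
# causal language model. Generic illustration, not the paper's actual setup:
# the "gpt2" checkpoint and the toy sentiment task are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder small causal LM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Few-shot prompt: task demonstrations appear in the context; the model's
# weights are never updated, which is what "in-context learning" refers to.
prompt = (
    "Review: The movie was wonderful. Sentiment: positive\n"
    "Review: I hated every minute. Sentiment: negative\n"
    "Review: A delightful surprise. Sentiment:"
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=1, do_sample=False)

# Decode only the newly generated token, i.e. the predicted label.
prediction = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:])
print(prediction.strip())
```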

