April 17, 2023, 8:22 p.m. | Yiqun Yao, Yequan Wang

cs.CL updates on arXiv.org arxiv.org

As language models scale up, it becomes increasingly expensive to verify
research ideas because conclusions on small models do not trivially transfer to
large ones. A possible solution is to establish a generic system that directly
predicts some metrics for large models solely based on the results and
hyperparameters from small models. Existing methods based on scaling laws
require hyperparameter search on the largest models, which is impractical with
limited resources. We address this issue by presenting our discoveries
indicating …

arxiv discoveries hyperparameter ideas language language models large models laws loss metrics prediction presenting research resources scale scaling search small solution transfer verify

Data Scientist (m/f/x/d)

@ Symanto Research GmbH & Co. KG | Spain, Germany

AI Scientist/Engineer

@ OKX | Singapore

Research Engineering/ Scientist Associate I

@ The University of Texas at Austin | AUSTIN, TX

Senior Data Engineer

@ Algolia | London, England

Fundamental Equities - Vice President, Equity Quant Research Analyst (Income & Value Investment Team)

@ BlackRock | NY7 - 50 Hudson Yards, New York

Snowflake Data Analytics

@ Devoteam | Madrid, Spain