MR-GSM8K: A Meta-Reasoning Revolution in Large Language Model Evaluation
Feb. 7, 2024, 5:48 a.m. | Zhongshen Zeng, Pengguang Chen, Shu Liu, Haiyun Jiang, Jiaya Jia
cs.CL updates on arXiv.org