Feb. 10, 2024, 6:14 p.m. | 1littlecoder

1littlecoder www.youtube.com

🔗 Links 🔗

When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards

https://arxiv.org/pdf/2402.01781.pdf


❤️ If you want to support the channel ❤️
Support here:
Patreon - https://www.patreon.com/1littlecoder/
Ko-Fi - https://ko-fi.com/1littlecoder

🧭 Follow me on 🧭
Twitter - https://twitter.com/1littlecoder
LinkedIn - https://www.linkedin.com/in/amrrs/

