Feb. 10, 2024, 6:14 p.m. | 1littlecoder

1littlecoder www.youtube.com

🔗 Links 🔗

When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards

https://arxiv.org/pdf/2402.01781.pdf


❤️ If you want to support the channel ❤️
Support here:
Patreon - https://www.patreon.com/1littlecoder/
Ko-Fi - https://ko-fi.com/1littlecoder

🧭 Follow me on 🧭
Twitter - https://twitter.com/1littlecoder
LinkedIn - https://www.linkedin.com/in/amrrs/
