Feb. 10, 2024, 6:14 p.m. | 1littlecoder

1littlecoder www.youtube.com

🔗 Links 🔗

When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards

https://arxiv.org/pdf/2402.01781.pdf


❤️ If you want to support the channel ❤️
Support here:
Patreon - https://www.patreon.com/1littlecoder/
Ko-Fi - https://ko-fi.com/1littlecoder

🧭 Follow me on 🧭
Twitter - https://twitter.com/1littlecoder
LinkedIn - https://www.linkedin.com/in/amrrs/
