Nov. 28, 2023, 8:54 p.m. | 1littlecoder

1littlecoder www.youtube.com

🏆 This leaderboard is based on the following three benchmarks.

Chatbot Arena - a crowdsourced, randomized battle platform. We use 100K+ user votes to compute Elo ratings.
MT-Bench - a set of challenging multi-turn questions. We use GPT-4 to grade the model responses.
MMLU (5-shot) - a test to measure a model's multitask accuracy on 57 tasks.

🔗 Links 🔗

ChatBOT Arena Leaderboard from Lmsys - https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard

Arena Leaderboard Elo Ranking Method - https://colab.research.google.com/drive/1RAWb22-PFNI-X1gPVzc927SGUdfr6nsR?usp=sharing

Play at the Arena - https://chat.lmsys.org/?arena …

arena benchmarks chatbot compute elo gpt gpt-4 kind leaderboard llm mmlu platform questions rankings responses set test

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Business Data Analyst

@ Alstom | Johannesburg, GT, ZA