Sorry "this beats GPT-4" - A new kind of LLM RANKINGS!!!
Nov. 28, 2023, 8:54 p.m. | 1littlecoder
Chatbot Arena - a crowdsourced, randomized battle platform. We use 100K+ user votes to compute Elo ratings.
MT-Bench - a set of challenging multi-turn questions. We use GPT-4 to grade the model responses.
MMLU (5-shot) - a test to measure a model's multitask accuracy on 57 tasks.
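To make the Elo idea concrete, here is a minimal sketch of how pairwise votes can be turned into ratings with the standard online Elo update. It is not the leaderboard's own code: the starting rating of 1000, the K-factor of 4, the 400-point scale, the function names, and the model names in the sample battles are all assumptions chosen for illustration.

```python
from typing import Dict, Iterable, Tuple

# Each battle is (model_a, model_b, winner), where winner is "a", "b", or "tie".
Battle = Tuple[str, str, str]


def expected_score(r_a: float, r_b: float, scale: float = 400.0) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))


def compute_elo(
    battles: Iterable[Battle],
    k: float = 4.0,        # assumed K-factor (update step size)
    base: float = 1000.0,  # assumed starting rating for every model
) -> Dict[str, float]:
    """Process votes in order and return the final rating per model."""
    ratings: Dict[str, float] = {}
    outcome = {"a": 1.0, "b": 0.0, "tie": 0.5}
    for model_a, model_b, winner in battles:
        r_a = ratings.setdefault(model_a, base)
        r_b = ratings.setdefault(model_b, base)
        e_a = expected_score(r_a, r_b)
        s_a = outcome[winner]
        # The winner gains what the loser gives up, so the total is conserved.
        ratings[model_a] = r_a + k * (s_a - e_a)
        ratings[model_b] = r_b + k * ((1.0 - s_a) - (1.0 - e_a))
    return ratings


if __name__ == "__main__":
    # Hypothetical votes between placeholder models.
    votes = [
        ("model-x", "model-y", "a"),
        ("model-y", "model-z", "tie"),
        ("model-x", "model-z", "b"),
    ]
    for name, rating in sorted(compute_elo(votes).items(), key=lambda kv: -kv[1]):
        print(f"{name}: {rating:.1f}")
```

Because online Elo depends on the order in which votes arrive, a sketch like this can give slightly different rankings when the same votes are shuffled; a small K-factor keeps that effect modest.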
🔗 Links 🔗
Chatbot Arena Leaderboard from LMSYS - https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard
Arena Leaderboard Elo Ranking Method - https://colab.research.google.com/drive/1RAWb22-PFNI-X1gPVzc927SGUdfr6nsR?usp=sharing
Play at the Arena - https://chat.lmsys.org/?arena …