The Best Performing Instruct LLMs (open-source)
June 16, 2023, noon | code_your_own_AI
Latest arXiv preprint on the evaluation of LLMs:
"INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models"
https://arxiv.org/pdf/2306.04757.pdf
Three other leaderboards, from Stanford, Hugging Face, and LMSYS:
Hugging Face leaderboard:
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
LMSYS leaderboard:
https://chat.lmsys.org/?leaderboard
HELM (Stanford):
https://crfm.stanford.edu/helm/latest/?
#benchmark #chatgpt #gpt4 #largelanguagemodels