My benchmark for large language models

Feb. 19, 2024

Nicholas Carlini nicholas.carlini.com

A benchmark of ~100 tests for language models, collected from actual questions I've asked of language models in the last year.
