My benchmark for large language models

Feb. 19, 2024

Nicholas Carlini nicholas.carlini.com

A benchmark of ~100 tests for language models, collected from actual questions I've asked of language models in the last year.

More from nicholas.carlini.com / Nicholas Carlini

My research idea logfile, 2016-2019 3 months ago | nicholas.carlini.com

Reading Data off an Apple ProFile Hard Drive with an Arduino 4 months, 3 weeks ago | nicholas.carlini.com

Playing chess with large language models 7 months ago | nicholas.carlini.com

Little Bobby <|endoftext|> 8 months, 3 weeks ago | nicholas.carlini.com

A ChatGPT clone, in 3000 bytes of C, backed by GPT-2 1 year ago | nicholas.carlini.com

Reflecting on “Towards Evaluating the Robustness of Neural Networks” 1 year, 8 months ago | nicholas.carlini.com

Rapid Iteration in Machine Learning Research 1 year, 10 months ago | nicholas.carlini.com

A Case of Plagarism in Machine Learning Research 2 years ago | nicholas.carlini.com
