Fact or Fiction? NOCHA: A New Benchmark for Evaluating Long-Context Reasoning in LLMs
June 28, 2024, 6:45 a.m. | /u/ai-lover
machinelearningnews www.reddit.com
The NOCHA methodology involves collecting narrative minimal pairs from recently published works of fiction. Annotators familiar with these books generate pairs of …