all AI news
Reading Subtext: Evaluating Large Language Models on Short Story Summarization with Writers
March 5, 2024, 2:51 p.m. | Melanie Subbiah, Sean Zhang, Lydia B. Chilton, Kathleen McKeown
cs.CL updates on arXiv.org arxiv.org
Abstract: We evaluate recent Large language Models (LLMs) on the challenging task of summarizing short stories, which can be lengthy, and include nuanced subtext or scrambled timelines. Importantly, we work directly with authors to ensure that the stories have not been shared online (and therefore are unseen by the models), and to obtain informed evaluations of summary quality using judgments from the authors themselves. Through quantitative and qualitative analysis grounded in narrative theory, we compare GPT-4, …
abstract arxiv authors cs.cl language language models large language large language models llms reading stories story summarization summarizing type work writers
More from arxiv.org / cs.CL updates on arXiv.org
Benchmarking LLMs via Uncertainty Quantification
2 days, 6 hours ago |
arxiv.org
CARE: Extracting Experimental Findings From Clinical Literature
2 days, 6 hours ago |
arxiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Software Engineering Manager, Generative AI - Characters
@ Meta | Bellevue, WA | Menlo Park, CA | Seattle, WA | New York City | San Francisco, CA
Senior Operations Research Analyst / Predictive Modeler
@ LinQuest | Colorado Springs, Colorado, United States