Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs
April 22, 2024, 4:47 a.m. | Clemencia Siro, Mohammad Aliannejadi, Maarten de Rijke
cs.CL updates on arXiv.org
Abstract: In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback. In a conversational setting, such signals are usually unavailable due to the nature of the interactions, and the evaluation instead often relies on crowdsourced evaluation labels. The role of user feedback in annotators' assessment of turns in a conversational perception has been little studied. We focus on how the evaluation of task-oriented dialogue systems (TDSs) is affected by considering user feedback, explicit or …