Automatic Answerability Evaluation for Question Generation
Feb. 27, 2024, 5:50 a.m. | Zifan Wang, Kotaro Funakoshi, Manabu Okumura
cs.CL updates on arXiv.org arxiv.org
Abstract: Conventional automatic evaluation metrics developed for natural language generation (NLG) tasks, such as BLEU and ROUGE, are based on measuring the n-gram overlap between the generated text and a reference text. These simple metrics may be insufficient for more complex tasks, such as question generation (QG), which requires generating questions that are answerable by the reference answers. Developing a more sophisticated automatic evaluation metric thus remains an urgent problem in QG research. This work proposes PMAN (Prompting-based …
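To make the abstract's criticism concrete, the following is a minimal sketch of the kind of n-gram overlap measurement that BLEU and ROUGE build on. It is not the official BLEU implementation (it omits the brevity penalty and the geometric mean over n-gram orders); the example sentences are invented for illustration. Note that the two sentences ask and state the same thing, yet the clipped unigram precision is well below 1.0 — exactly the insensitivity to meaning and answerability that the abstract argues makes such metrics insufficient for QG.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_precision(candidate, reference, n=1):
    """Clipped n-gram precision, the core quantity behind BLEU:
    the fraction of candidate n-grams that also appear in the reference,
    with each n-gram's count clipped to its frequency in the reference."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    total = sum(cand.values())
    return overlap / total if total else 0.0

# Hypothetical generated question vs. reference (illustrative only).
cand = "what is the capital of france".split()
ref = "what is france 's capital city".split()
print(round(ngram_precision(cand, ref, n=1), 2))  # → 0.67
print(round(ngram_precision(cand, ref, n=2), 2))  # → 0.2
```

Because only surface tokens are compared, a paraphrased but perfectly answerable question scores poorly, while a fluent question that cannot be answered by the reference answer can still score well — the gap that a metric like the proposed PMAN aims to close.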