June 6, 2024, 4:52 a.m. | Keun Soo Yim

cs.CL updates on arXiv.org

arXiv:2406.02943v1 Announce Type: cross
Abstract: Task-oriented queries (e.g., one-shot queries to play videos, order food, or call a taxi) are crucial for assessing the quality of virtual assistants, chatbots, and other large language model (LLM)-based services. However, a standard benchmark for task-oriented queries is not yet available, as existing benchmarks in the relevant NLP (Natural Language Processing) fields have primarily focused on task-oriented dialogues. Thus, we present a new methodology for efficiently generating the Task-oriented Queries Benchmark (ToQB) using existing …


