all AI news
CRUD-RAG: A Comprehensive Chinese Benchmark for Retrieval-Augmented Generation of Large Language Models. (arXiv:2401.17043v1 [cs.CL])
cs.CL updates on arXiv.org arxiv.org
Retrieval-Augmented Generation (RAG) is a technique that enhances the
capabilities of large language models (LLMs) by incorporating external
knowledge sources. This method addresses common LLM limitations, including
outdated information and the tendency to produce inaccurate "hallucinated"
content. However, the evaluation of RAG systems is challenging, as existing
benchmarks are limited in scope and diversity. Most of the current benchmarks
predominantly assess question-answering applications, overlooking the broader
spectrum of situations where RAG could prove advantageous. Moreover, they only
evaluate the performance …
arxiv benchmark capabilities chinese crud cs.cl evaluation information knowledge language language models large language large language models limitations llm llms rag retrieval retrieval-augmented systems