MedExpQA: Multilingual Benchmarking of Large Language Models for Medical Question Answering
April 9, 2024, 4:51 a.m. | Iñigo Alonso, Maite Oronoz, Rodrigo Agerri
cs.CL updates on arXiv.org
Abstract: Large Language Models (LLMs) have the potential of facilitating the development of Artificial Intelligence technology to assist medical experts for interactive decision support, which has been demonstrated by their competitive performances in Medical QA. However, while impressive, the required quality bar for medical applications remains far from being achieved. Currently, LLMs remain challenged by outdated knowledge and by their tendency to generate hallucinated content. Furthermore, most benchmarks to assess medical knowledge lack reference gold explanations …