all AI news
RAmBLA: A Framework for Evaluating the Reliability of LLMs as Assistants in the Biomedical Domain
March 22, 2024, 4:42 a.m. | William James Bolton, Rafael Poyiadzi, Edward R. Morrell, Gabriela van Bergen Gonzalez Bueno, Lea Goetz
cs.LG updates on arXiv.org arxiv.org
Abstract: Large Language Models (LLMs) increasingly support applications in a wide range of domains, some with potential high societal impact such as biomedicine, yet their reliability in realistic use cases is under-researched. In this work we introduce the Reliability AssesMent for Biomedical LLM Assistants (RAmBLA) framework and evaluate whether four state-of-the-art foundation LLMs can serve as reliable assistants in the biomedical domain. We identify prompt robustness, high recall, and a lack of hallucinations as necessary criteria …
abstract applications arxiv assistants biomedical biomedicine cases cs.ai cs.lg domain domains framework impact language language models large language large language models llms reliability support type use cases work
More from arxiv.org / cs.LG updates on arXiv.org
Digital Over-the-Air Federated Learning in Multi-Antenna Systems
2 days, 12 hours ago |
arxiv.org
Bagging Provides Assumption-free Stability
2 days, 12 hours ago |
arxiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Research Scientist, Demography and Survey Science, University Grad
@ Meta | Menlo Park, CA | New York City
Computer Vision Engineer, XR
@ Meta | Burlingame, CA