GPT-4's assessment of its performance in a USMLE-based case study

Feb. 16, 2024, 5:47 a.m. | Uttam Dhakal, Aniket Kumar Singh, Suman Devkota, Yogesh Sapkota, Bishal Lamichhane, Suprinsa Paudyal, Chandra Dhakal

cs.CL updates on arXiv.org

arXiv:2402.09654v1 Announce Type: cross
Abstract: This study investigates GPT-4's assessment of its performance in healthcare applications. Using a simple prompting technique, the LLM was given questions taken from the United States Medical Licensing Examination (USMLE) and asked to report its confidence score both before each question was posed and after it was asked. The questions were divided into two groups: questions with feedback (WF) and questions with no feedback (NF) after the question. The model was asked to provide absolute and …
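The protocol described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the prompt wording, the `ask` callable (a stand-in for whatever chat interface is used), the `Turn` record, and the choice to give feedback after the second confidence rating are all hypothetical, since the truncated abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# Hypothetical prompt wording; the abstract does not give the actual prompts.
PRE_PROMPT = (
    "Before seeing the next question, rate your confidence (0-100) "
    "that you will answer it correctly."
)
POST_PROMPT = (
    "Now rate your confidence (0-100) that the answer you just gave is correct."
)


@dataclass
class Turn:
    question: str
    pre_confidence: str   # raw model reply to PRE_PROMPT
    answer: str           # raw model reply to the question
    post_confidence: str  # raw model reply to POST_PROMPT
    feedback: Optional[str]  # None for the no-feedback (NF) group


def run_session(
    items: List[Tuple[str, str]],  # (question, reference answer) pairs
    ask: Callable[[str], str],     # assumed stand-in for a chat model call
    with_feedback: bool,           # True = WF group, False = NF group
) -> List[Turn]:
    """Ask for confidence before and after each question; optionally give feedback."""
    turns: List[Turn] = []
    for question, reference in items:
        pre = ask(PRE_PROMPT)
        answer = ask(question)
        post = ask(POST_PROMPT)
        feedback = None
        if with_feedback:
            # Assumption: feedback is a simple correct/incorrect message.
            is_correct = answer.strip() == reference.strip()
            verdict = "correct" if is_correct else "incorrect"
            feedback = f"Your previous answer was {verdict}."
            ask(feedback)  # the model's reply to feedback is ignored here
        turns.append(Turn(question, pre, answer, post, feedback))
    return turns
```

Passing the model call in as `ask` keeps the sketch independent of any particular API, and the same function covers both groups by toggling `with_feedback`.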


