You don't need a personality test to know these models are unreliable: Assessing the Reliability of Large Language Models on Psychometric Instruments

April 3, 2024, 4:47 a.m. | Bangzhao Shu, Lechen Zhang, Minje Choi, Lavinia Dunagan, Lajanugen Logeswaran, Moontae Lee, Dallas Card, David Jurgens

cs.CL updates on arXiv.org

arXiv:2311.09718v2 Announce Type: replace
Abstract: The versatility of Large Language Models (LLMs) on natural language understanding tasks has made them popular for research in social sciences. To properly understand the properties and innate personas of LLMs, researchers have performed studies that involve using prompts in the form of questions that ask LLMs about particular opinions. In this study, we take a cautionary step back and examine whether the current format of prompting LLMs elicits responses in a consistent and robust …


