Feb. 26, 2024, 5:42 a.m. | Grgur Kovač, Rémy Portelas, Masataka Sawayama, Peter Ford Dominey, Pierre-Yves Oudeyer

cs.LG updates on arXiv.org arxiv.org

arXiv:2402.14846v1 Announce Type: cross
Abstract: The standard way to study Large Language Models (LLMs) through benchmarks or psychology questionnaires is to provide many different queries from similar minimal contexts (e.g. multiple-choice questions). However, due to LLMs' highly context-dependent nature, conclusions from such minimal-context evaluations may provide little information about the model's behavior in deployment (where it will be exposed to many new contexts). We argue that context-dependence should be studied as another dimension of LLM comparison alongside others such …
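To make the contrast between minimal-context evaluation and context-dependence concrete, here is a minimal sketch, not the paper's actual protocol: the same multiple-choice item is embedded in several different conversational contexts and the spread of answers is compared. The `query_model` function, the contexts, and the question are hypothetical placeholders.

```python
# Minimal sketch (assumed, not the paper's method): probe context-dependence
# by asking the same questionnaire item under different preceding contexts.

from collections import Counter

QUESTION = (
    "Which value matters most to you?\n"
    "(A) Tradition  (B) Achievement  (C) Benevolence  (D) Stimulation\n"
    "Answer with a single letter."
)

# Varied contexts standing in for situations the model may face in deployment.
CONTEXTS = [
    "",  # minimal context, as in standard questionnaire-style evaluation
    "You are a helpful assistant chatting with a teenager about music.\n\n",
    "You are an assistant advising a CEO on quarterly strategy.\n\n",
    "The following is a casual conversation about gardening.\n\n",
]

def query_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM API call; replace with a real one."""
    return "A"  # placeholder answer

def answer_distribution(question: str, contexts: list[str]) -> Counter:
    """Collect the model's answers to the same item across contexts."""
    answers = Counter()
    for ctx in contexts:
        answers[query_model(ctx + question)] += 1
    return answers

if __name__ == "__main__":
    # All mass on one answer would indicate stability; a spread across
    # answers would indicate context-dependence of the expressed choice.
    print(answer_distribution(QUESTION, CONTEXTS))
```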

