April 16, 2024, 4:51 a.m. | Xintong Wang, Xiaoyu Li, Xingshan Li, Chris Biemann

cs.CL updates on arXiv.org

arXiv:2310.05216v2 Announce Type: replace
Abstract: Large Language Models (LLMs) have emerged as dominant foundational models in modern NLP. However, their prediction processes and internal mechanisms, such as feed-forward networks (FFN) and multi-head self-attention (MHSA), remain largely unexplored. In this work, we probe LLMs from a human behavioral perspective, correlating values from LLMs with eye-tracking measures, which are widely recognized as meaningful indicators of human reading patterns. Our findings reveal that LLMs exhibit a similar prediction pattern with …
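The abstract describes the method only at a high level: some per-token value extracted from the model's internals is correlated with human eye-tracking measures. Below is a minimal, hypothetical Python sketch of that general idea, assuming a HuggingFace causal LM (`gpt2` stands in for the LLMs studied), made-up per-word gaze durations, and attention received per token as the model-side value. The paper's actual choice of internal values, word alignment, and statistics may well differ.

```python
# Hypothetical sketch: correlate per-token attention values from an LLM
# with human eye-tracking measures (e.g., first-pass gaze duration).
# The model name, the gaze_durations data, and the aggregation choice
# are illustrative assumptions, not the authors' exact setup.
import torch
from scipy.stats import spearmanr
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # any causal LM that can return attention maps
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_attentions=True)
model.eval()

sentence = "The quick brown fox jumps over the lazy dog"
# Hypothetical per-word gaze durations (ms), aligned with the words above.
gaze_durations = [210.0, 180.0, 195.0, 240.0, 205.0, 160.0, 150.0, 230.0, 250.0]

inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# outputs.attentions: one (batch, heads, seq, seq) tensor per layer.
# Average over layers and heads, then sum the attention *received* by
# each token (column sum) as a crude per-token salience value. Note the
# causal mask biases this toward earlier tokens; it is only a sketch.
attn = torch.stack(outputs.attentions).mean(dim=(0, 2))  # (batch, seq, seq)
received = attn[0].sum(dim=0)  # (seq,)

# Map subword tokens back to words by summing salience per word.
word_ids = inputs.word_ids(batch_index=0)
per_word = [0.0] * len(gaze_durations)
for tok_idx, w_id in enumerate(word_ids):
    if w_id is not None:
        per_word[w_id] += received[tok_idx].item()

rho, p = spearmanr(per_word, gaze_durations)
print(f"Spearman rho = {rho:.3f} (p = {p:.3f})")
```

Summing salience over subword tokens handles the mismatch between the model's tokenization and word-level eye-tracking measures, and a rank correlation such as Spearman's is robust to the very different scales of attention mass and reading times.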
