Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval
April 8, 2024, 4:46 a.m. | João Coelho, Bruno Martins, João Magalhães, Jamie Callan, Chenyan Xiong
cs.CL updates on arXiv.org
Abstract: This study investigates the existence of positional biases in Transformer-based models for text representation learning, particularly in the context of web document retrieval. We build on previous research that demonstrated loss of information in the middle of input sequences for causal language models, extending it to the domain of representation learning. We examine positional biases at various stages of training for an encoder-decoder model, including language model pre-training, contrastive pre-training, and contrastive fine-tuning. Experiments with …
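The kind of positional bias the abstract describes can be probed with a needle-in-a-haystack test: place a relevant passage at increasing depths in a long document, embed the document, and watch query-document similarity change with position. The toy sketch below is not the paper's code; it simulates a biased encoder with a bag-of-words embedder that only reads the first `max_tokens` tokens (a stand-in for a model that over-weights the beginning of its input), and all names in it are hypothetical.

```python
# Toy probe for positional bias in document embeddings (illustrative only).
# A truncating bag-of-words "embedder" simulates a model that mostly
# encodes the start of its input; we slide a relevant passage (the needle)
# deeper into filler text and track query-document cosine similarity.
import math
from collections import Counter

def embed(text, max_tokens=64):
    """Bag-of-words embedding over only the first `max_tokens` tokens,
    mimicking an encoder biased toward the beginning of the document."""
    return Counter(text.lower().split()[:max_tokens])

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

filler = ("lorem ipsum dolor sit amet " * 40).split()  # background tokens
needle = "transformers learn contextual embeddings"    # relevant passage
query = embed("contextual embeddings transformers")

# Slide the needle from the start of the document toward the end.
scores = []
for pos in (0, 50, 100, 150, 200):
    doc = " ".join(filler[:pos] + needle.split() + filler[pos:])
    scores.append(cosine(query, embed(doc)))

# Similarity collapses once the needle falls past the truncation point,
# i.e. relevant content "in the middle" is lost from the representation.
print(scores)
```

A real experiment along the paper's lines would swap the toy `embed` for an actual long-document encoder and average over many query-needle pairs, but the shape of the measurement is the same.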