April 24, 2024, 4:47 a.m. | Hansi Zeng, Chen Luo, Hamed Zamani

cs.CL updates on arXiv.org

arXiv:2404.14600v1 Announce Type: cross
Abstract: This paper introduces PAG, a novel optimization and decoding approach that guides autoregressive generation of document identifiers in generative retrieval models through simultaneous decoding. To this end, PAG constructs both a set-based and a sequential identifier for each document. Motivated by the bag-of-words assumption in information retrieval, the set-based identifier is built on lexical tokens. The sequential identifier, on the other hand, is obtained by quantizing relevance-based representations of documents. Extensive experiments on MSMARCO and TREC Deep Learning …
