Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval. (arXiv:2201.05409v1 [cs.IR])
Jan. 17, 2022, 2:10 a.m. | Shitao Xiao, Zheng Liu, Weihao Han, Jianjin Zhang, Chaozhuo Li, Yingxia Shao, Defu Lian, Xing Xie, Hao Sun, Denvy Deng, Liangjie Zhang, Qi Zhang
cs.CL updates on arXiv.org arxiv.org
Ad-hoc search calls for selecting appropriate answers from a
massive-scale corpus. Embedding-based retrieval (EBR) has become a
promising solution, in which deep-learning-based document representations and
approximate nearest neighbor (ANN) search techniques are combined to handle
this task. However, a major challenge is that the ANN index can be too large
to fit into memory, given the considerable size of the answer corpus. In this
work, we tackle this problem with Bi-Granular Document Representation, where
the lightweight sparse embeddings are indexed …
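To make the memory concern concrete, here is a minimal sketch in Python. It is not the paper's method: the function names, the one-billion-document corpus, the 768-dimensional embeddings, and the float32 storage are all illustrative assumptions. The first function estimates the raw size of a flat dense index, and the second performs exact inner-product search, the operation that an ANN index approximates.

```python
from typing import List, Sequence


def dense_index_bytes(num_docs: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw size in bytes of a flat index holding one dense vector per document.

    bytes_per_value=4 assumes float32 storage (an assumption, not from the source).
    """
    return num_docs * dim * bytes_per_value


def top_k(query: Sequence[float], doc_vectors: Sequence[Sequence[float]], k: int) -> List[int]:
    """Exact (brute-force) inner-product search over all documents.

    Returns the ids of the k highest-scoring documents; ties go to the lower id.
    """
    scored = [
        (sum(q * d for q, d in zip(query, vec)), doc_id)
        for doc_id, vec in enumerate(doc_vectors)
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical corpus: 1 billion documents, 768-dimensional float32 embeddings.
size_tb = dense_index_bytes(1_000_000_000, 768) / 1e12
print(f"flat index size: {size_tb:.3f} TB")

# Toy retrieval over three 2-dimensional document vectors.
docs = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
print(top_k([1.0, 0.0], docs, k=2))
```

Under these assumptions the uncompressed vectors alone occupy about 3 TB, which illustrates why an index at this scale may not fit into memory.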