all AI news
Transforming LLMs into Cross-modal and Cross-lingual RetrievalSystems
April 3, 2024, 4:46 a.m. | Frank Palma Gomez, Ramon Sanabria, Yun-hsuan Sung, Daniel Cer, Siddharth Dalmia, Gustavo Hernandez Abrego
cs.CL updates on arXiv.org arxiv.org
Abstract: Large language models (LLMs) are trained on text-only data that go far beyond the languages with paired speech and text data. At the same time, Dual Encoder (DE) based retrieval systems project queries and documents into the same embedding space and have demonstrated their success in retrieval and bi-text mining. To match speech and text in many languages, we propose using LLMs to initialize multi-modal DE retrieval systems. Unlike traditional methods, our system doesn't require …
abstract arxiv beyond cross-lingual cs.cl cs.ir cs.sd data documents eess.as embedding encoder language language models languages large language large language models llms modal project queries retrieval space speech success systems text type
More from arxiv.org / cs.CL updates on arXiv.org
Jobs in AI, ML, Big Data
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Consultant - Artificial Intelligence & Data (Google Cloud Data Engineer) - MY / TH
@ Deloitte | Kuala Lumpur, MY