Efficient Document Embeddings via Self-Contrastive Bregman Divergence Learning
March 27, 2024, 4:48 a.m. | Daniel Saggau, Mina Rezaei, Bernd Bischl, Ilias Chalkidis
cs.CL updates on arXiv.org | arxiv.org
Abstract: Learning quality document embeddings is a fundamental problem in natural language processing (NLP), information retrieval (IR), recommendation systems, and search engines. Despite recent advances in the development of transformer-based models that produce sentence embeddings with self-contrastive learning, the encoding of long documents (thousands of words) is still challenging with respect to both efficiency and quality. Therefore, we train Longformer-based document encoders using a state-of-the-art unsupervised contrastive learning method (SimCSE). Furthermore, we complement the …
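For readers unfamiliar with the SimCSE objective named in the abstract, the following is a minimal sketch of an in-batch contrastive (InfoNCE-style) loss of the kind unsupervised SimCSE is generally described as using. It is not the authors' implementation: the function name `simcse_loss`, the temperature value of 0.05, and the use of plain NumPy arrays in place of Longformer outputs are all assumptions made for illustration. The sketch also leaves out the encoder and the dropout that would produce the two views, and it does not cover the Bregman divergence component mentioned in the title.

```python
import numpy as np

def simcse_loss(z1, z2, temperature=0.05):
    """Illustrative in-batch contrastive loss (hypothetical helper).

    z1, z2: arrays of shape (batch, dim) holding two views of the same
    documents, e.g. two encoder passes with different dropout masks.
    Row i of z1 and row i of z2 form the positive pair; every other row
    of z2 serves as an in-batch negative for row i of z1.
    """
    # L2-normalize so the dot product equals cosine similarity.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)

    # (batch, batch) matrix of temperature-scaled cosine similarities.
    sim = z1 @ z2.T / temperature

    # Subtract the row-wise maximum for numerical stability.
    sim = sim - sim.max(axis=1, keepdims=True)

    # Row-wise log-softmax; the diagonal holds the positive pairs.
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

# Toy usage: random vectors stand in for document embeddings.
rng = np.random.default_rng(0)
view_a = rng.normal(size=(8, 16))
view_b = view_a + 0.01 * rng.normal(size=(8, 16))  # slightly perturbed view
loss = simcse_loss(view_a, view_b)
```

The loss is small when each embedding is closest to its own second view and grows when the views are mismatched, which is the behaviour a contrastive objective of this kind is meant to encourage.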