mMARCO: A Multilingual Version of the MS MARCO Passage Ranking Dataset. (arXiv:2108.13897v5 [cs.CL] UPDATED)

Aug. 18, 2022, 1:11 a.m. | Luiz Bonifacio, Vitor Jeronymo, Hugo Queiroz Abonizio, Israel Campiotti, Marzieh Fadaee, Roberto Lotufo, Rodrigo Nogueira

cs.CL updates on arXiv.org arxiv.org

The MS MARCO ranking dataset has been widely used for training deep learning
models for IR tasks, achieving considerable effectiveness on diverse zero-shot
scenarios. However, this type of resource is scarce in languages other than
English. In this work, we present mMARCO, a multilingual version of the MS
MARCO passage ranking dataset comprising 13 languages that was created using
machine translation. We evaluated mMARCO by finetuning monolingual and
multilingual reranking models, as well as a multilingual dense retrieval model
on …


