April 15, 2024, 4:42 a.m. | Rita González-Márquez, Dmitry Kobak

cs.LG updates on arXiv.org

arXiv:2404.08403v1 Announce Type: cross
Abstract: The ICLR conference is unique among the top machine learning conferences in that all submitted papers are openly available. Here we present the ICLR dataset consisting of abstracts of all 24 thousand ICLR submissions from 2017-2024 with meta-data, decision scores, and custom keyword-based labels. We find that on this dataset, bag-of-words representation outperforms most dedicated sentence transformer models in terms of $k$NN classification accuracy, and the top performing language models barely outperform TF-IDF. We see …
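The comparison the abstract describes (scoring a text representation by $k$NN classification accuracy) can be illustrated with a minimal sketch. It uses only the Python standard library and a smoothed TF-IDF weighting with cosine similarity. The four toy abstracts, their labels, the choice of k=1, and the leave-one-out protocol are all assumptions made for illustration; none of them comes from the ICLR dataset or from the authors' evaluation setup.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse, L2-normalised TF-IDF vector (a dict) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF (an illustrative choice, not the paper's stated formula).
        vec = {
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({term: w / norm for term, w in vec.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two L2-normalised sparse vectors."""
    return sum(w * v.get(term, 0.0) for term, w in u.items())


def knn_loo_accuracy(vectors, labels, k=1):
    """Leave-one-out kNN accuracy using a majority vote over the k nearest items."""
    correct = 0
    for i, query in enumerate(vectors):
        sims = sorted(
            ((cosine(query, other), j) for j, other in enumerate(vectors) if j != i),
            reverse=True,
        )
        votes = Counter(labels[j] for _, j in sims[:k])
        if votes.most_common(1)[0][0] == labels[i]:
            correct += 1
    return correct / len(vectors)


# Hypothetical toy data, NOT taken from the ICLR dataset.
docs = [
    "graph neural networks for node classification",
    "message passing in graph neural networks",
    "policy gradient methods for reinforcement learning",
    "reward shaping in reinforcement learning agents",
]
labels = ["graphs", "graphs", "rl", "rl"]

accuracy = knn_loo_accuracy(tfidf_vectors(docs), labels, k=1)
print(f"leave-one-out kNN accuracy: {accuracy:.2f}")
```

Swapping `tfidf_vectors` for a function that returns embeddings from a sentence transformer would give the other side of the comparison under the same scoring function, which is the kind of like-for-like evaluation the abstract reports.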

