all AI news
Laying Anchors: Semantically Priming Numerals in Language Modeling
April 3, 2024, 4:42 a.m. | Mandar Sharma, Rutuja Murlidhar Taware, Pravesh Koirala, Nikhil Muralidhar, Naren Ramakrishnan
cs.LG updates on arXiv.org arxiv.org
Abstract: Off-the-shelf pre-trained language models have become the de facto standard in NLP pipelines for a multitude of downstream tasks. However, the inability of these models to properly encode numerals limits their performance on tasks requiring numeric comprehension. We introduce strategies to semantically prime numerals in any corpus by generating anchors governed by the distribution of numerals in said corpus, thereby enabling mathematically grounded representations of these numeral tokens. We establish the superiority of our proposed …
abstract anchors arxiv become cs.ai cs.cl cs.lg encode however language language models modeling nlp performance pipelines prime standard strategies tasks type
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Software Engineer, Machine Learning (Tel Aviv)
@ Meta | Tel Aviv, Israel
Senior Data Scientist- Digital Government
@ Oracle | CASABLANCA, Morocco