April 17, 2024, 4:43 a.m. | Philippe Gervais, Asya Fadeeva, Andrii Maksai

cs.LG updates on arXiv.org arxiv.org

arXiv:2404.10690v1 Announce Type: cross
Abstract: We introduce MathWriting, the largest online handwritten mathematical expression dataset to date. It consists of 230k human-written samples and an additional 400k synthetic ones. MathWriting can also be used for offline HME recognition and is larger than all existing offline HME datasets like IM2LATEX-100K. We introduce a benchmark based on MathWriting data in order to advance research on both online and offline HME recognition.

abstract arxiv cs.cv cs.hc cs.lg dataset datasets human offline ones recognition samples synthetic type

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

Senior Data Engineer

@ Quantexa | Sydney, New South Wales, Australia

Staff Analytics Engineer

@ Warner Bros. Discovery | NY New York 230 Park Avenue South