May 1, 2024, 4:45 a.m. | Thomas Constum, Lucas Preel, Th\'eo Larcher, Pierrick Tranouez, Thierry Paquet, Sandra Br\'ee

cs.CV updates on arXiv.org arxiv.org

arXiv:2404.19329v1 Announce Type: new
Abstract: The EXO-POPP project aims to establish a comprehensive database comprising 300,000 marriage records from Paris and its suburbs, spanning the years 1880 to 1940, which are preserved in over 130,000 scans of double pages. Each marriage record may encompass up to 118 distinct types of information that require extraction from plain text. In this paper, we introduce the M-POPP dataset, a subset of the M-POPP database with annotations for full-page text recognition and information extraction …

abstract arxiv cs.cv database documents extraction information information extraction marriage paris project records scans type understanding

Software Engineer for AI Training Data (School Specific)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Python)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Tier 2)

@ G2i Inc | Remote

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US