Making Old Kurdish Publications Processable by Augmenting Available Optical Character Recognition Engines
April 10, 2024, 4:47 a.m. | Blnd Yaseen, Hossein Hassani
cs.CL updates on arXiv.org arxiv.org
Abstract: Kurdish libraries hold many historical publications that were printed in the early days when printing devices were brought to Kurdistan. A good Optical Character Recognition (OCR) system would help process these publications and contribute to Kurdish language resources, which is crucial as Kurdish is considered a low-resource language. Current OCR systems are unable to extract text from historical documents because the documents have many issues, including being damaged, very fragile, having many marks left …