An End-to-End OCR Framework for Robust Arabic-Handwriting Recognition using a Novel Transformers-based Model and an Innovative 270 Million-Words Multi-Font Corpus of Classical Arabic with Diacritics. (arXiv:2208.11484v2 [cs.CV] UPDATED)

Aug. 30, 2022, 1:13 a.m. | Aly Mostafa, Omar Mohamed, Ali Ashraf, Ahmed Elbehery, Salma Jamal, Anas Salah, Amr S. Ghoneim

cs.CL updates on arXiv.org arxiv.org

This research is the second phase in a series of investigations into developing
an Optical Character Recognition (OCR) system for Arabic historical documents
and examining how different modeling procedures interact with the problem. The
first study examined the effect of Transformers on our custom-built Arabic
dataset. One of its downsides was the size of the training data, a mere 15,000
images out of our 30 million, due to a lack of resources. Also, we add an image
enhancement …
