Nougat: Neural Optical Understanding for Academic Documents - Meta AI 2023
Aug. 28, 2023, 9:20 p.m. | /u/Singularian2501
machinelearningnews www.reddit.com
Paper: [https://arxiv.org/abs/2308.13418](https://arxiv.org/abs/2308.13418)
GitHub: [https://github.com/facebookresearch/nougat](https://github.com/facebookresearch/nougat)
Abstract:
>Scientific knowledge is predominantly stored in books and scientific journals, often in the form of PDFs. However, the PDF format leads to a loss of semantic information, particularly for mathematical expressions. We propose Nougat (Neural Optical Understanding for Academic Documents), a Visual Transformer model that performs an Optical Character Recognition (OCR) task for processing scientific documents into a markup language, and demonstrate the effectiveness of our model on …