[R] The Manga Whisperer: Automatically Generating Transcriptions for Comics
Jan. 20, 2024, 2:44 p.m. | /u/ragavsachdeva
Machine Learning www.reddit.com
Github: [https://github.com/ragavsachdeva/magi](https://github.com/ragavsachdeva/magi)
Try it yourself: [https://huggingface.co/spaces/ragavsachdeva/the-manga-whisperer/](https://huggingface.co/spaces/ragavsachdeva/the-manga-whisperer/)
TLDR: Given a high-resolution manga page as input, Magi (our model) can (i) detect panels, characters, and text blocks, (ii) cluster characters (without making any assumptions about the number of ground-truth clusters), (iii) match text blocks to their speakers, (iv) perform OCR, and (v) generate a transcript of who said what and when (by sorting the panels and text boxes in reading order). See the figure below for an example. Wanted to share …
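To make step (v) concrete, here is a minimal sketch of how a transcript could be assembled once detections, OCR strings, and speaker assignments are available. It is not Magi's implementation: the `Box` type, the `reading_order` and `build_transcript` helpers, and the row-grouping heuristic (top-to-bottom rows, right-to-left within a row) are all hypothetical simplifications chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box (hypothetical container, not Magi's type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def center(self):
        return ((self.x1 + self.x2) / 2, (self.y1 + self.y2) / 2)

    def contains(self, point):
        px, py = point
        return self.x1 <= px <= self.x2 and self.y1 <= py <= self.y2


def reading_order(boxes, row_tolerance=50.0):
    """Return box indices top-to-bottom, then right-to-left within a row.

    Naive heuristic (an assumption): a box joins the current row if its
    top edge is within `row_tolerance` pixels of the row's first box.
    """
    by_top = sorted(range(len(boxes)), key=lambda i: boxes[i].y1)
    rows = []
    for i in by_top:
        if rows and abs(boxes[i].y1 - boxes[rows[-1][0]].y1) <= row_tolerance:
            rows[-1].append(i)
        else:
            rows.append([i])
    ordered = []
    for row in rows:
        # Manga is read right to left, so the largest right edge comes first.
        ordered.extend(sorted(row, key=lambda i: -boxes[i].x2))
    return ordered


def build_transcript(panels, texts, ocr, speaker_of):
    """Emit one line per text box, visiting panels in reading order.

    panels, texts: lists of Box; ocr: list of strings aligned with `texts`;
    speaker_of: dict mapping a text index to a character-cluster label.
    """
    lines = []
    for p in reading_order(panels):
        # Assign a text box to a panel if its center falls inside the panel.
        inside = [t for t in range(len(texts))
                  if panels[p].contains(texts[t].center)]
        local = [texts[t] for t in inside]
        for k in reading_order(local, row_tolerance=20.0):
            t = inside[k]
            speaker = speaker_of.get(t, "unknown")
            lines.append(f"<{speaker}>: {ocr[t]}")
    return lines
```

Real pages have irregular layouts (overlapping or diagonal panels, text that spills across panel borders), so a fixed row tolerance like this would break down quickly; the sketch only illustrates the idea of ordering panels first and text boxes within each panel second.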