textract-cli
March 30, 2024, 7:01 p.m.
Simon Willison's Weblog simonwillison.net
This is my other OCR project from yesterday: I built the thinnest possible CLI wrapper around Amazon Textract, out of frustration at how hard that tool is to use on an ad-hoc basis.
It only works with JPEGs and PNGs (not PDFs) up to 5MB in size, reflecting limitations in Textract's synchronous API: it can handle PDFs amazingly well but you have to upload them to an S3 bucket first, and I decided to keep the scope tight for …
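To make those limits concrete, here is a rough sketch of the kind of pre-flight check and synchronous call such a thin wrapper might make. This is not the actual textract-cli source: the names `validate` and `ocr_image` are hypothetical, the 5MB figure is taken from the post and assumed here to mean 5 * 1024 * 1024 bytes, and the only library call used is boto3's `detect_document_text`.

```python
from pathlib import Path

# Limits as described in the post; the exact byte count is an assumption.
MAX_BYTES = 5 * 1024 * 1024
ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def validate(filename: str, size_bytes: int) -> None:
    """Raise ValueError if a file falls outside the limits described above."""
    suffix = Path(filename).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"{filename}: only JPEG and PNG images are supported")
    if size_bytes > MAX_BYTES:
        raise ValueError(f"{filename}: {size_bytes} bytes exceeds the 5MB limit")


def ocr_image(path: Path) -> str:
    """Send the image bytes to Textract's synchronous API and return its text lines."""
    import boto3  # imported lazily so validate() works without boto3 installed

    validate(path.name, path.stat().st_size)
    client = boto3.client("textract")
    response = client.detect_document_text(Document={"Bytes": path.read_bytes()})
    # Keep only LINE blocks; WORD and PAGE blocks are also present in the response.
    return "\n".join(
        block["Text"] for block in response["Blocks"] if block["BlockType"] == "LINE"
    )
```

Calling `ocr_image(Path("scan.png"))` would need AWS credentials configured; `validate` on its own runs anywhere and rejects PDFs and oversized files before any request is sent.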