Advancements in extracting tabular data from PDFs?
Oct. 10, 2023, 4:33 p.m. | /u/data_scallion
Data Science www.reddit.com
Is there a simple and robust method for extracting highly tabular data from a PDF without resorting to rule-based regex parsing? I'm currently using pdfminer, pdfplumber, and regex to build extraction templates for each type of PDF, but it's very time-consuming and tedious. Is there a better way?
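For context, pdfplumber has a layout-based table finder (`page.extract_tables()`), so ruled or well-aligned tables can sometimes be pulled without any regex. Below is a minimal sketch of that approach, not a general solution. It assumes the first row of each detected table is a header row, and `rows_to_records` and `extract_records` are hypothetical helper names, not part of pdfplumber.

```python
from typing import Dict, List, Optional

# pdfplumber returns each table as a list of rows; each cell is a str or None.
Row = List[Optional[str]]


def rows_to_records(rows: List[Row]) -> List[Dict[str, str]]:
    """Map table rows to dicts, treating the first row as the header (an assumption)."""
    if not rows:
        return []
    header = [(cell or "").strip() for cell in rows[0]]
    records = []
    for row in rows[1:]:
        cells = [(cell or "").strip() for cell in row]
        if not any(cells):
            continue  # skip rows where every cell is empty
        records.append(dict(zip(header, cells)))
    return records


def extract_records(pdf_path: str) -> List[Dict[str, str]]:
    """Collect records from every table pdfplumber detects in the file."""
    import pdfplumber  # third-party dependency, imported lazily

    records: List[Dict[str, str]] = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            # Default settings detect tables from ruling lines on the page.
            for table in page.extract_tables():
                records.extend(rows_to_records(table))
    return records
```

Tables without ruling lines usually need tuning through the `table_settings` argument of `extract_tables()`, and scanned PDFs need OCR first, so this only removes the regex step for PDFs whose layout pdfplumber can already detect.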
I've used LangChain and OpenAI to build "Chat with your document" apps, which work great for uploading a PDF of a whitepaper and asking it to …