April 10, 2024, 4:21 a.m. | Robert Martin-Short

Towards Data Science - Medium towardsdatascience.com

DALLE-2’s interpretation of “A futuristic industrial document scanning facility”

Use LangChain and OpenAI tools to extract structured information from images of receipts stored in Google Drive

This article details how we can use open-source Python packages such as LangChain, pytesseract and PyPDF, along with gpt-4-vision and gpt-3.5-turbo, to identify and extract key information from images of receipts. The resulting dataset could be used for a “chat to receipts” application. Check out the full code here.
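To make the idea of structured extraction concrete, here is a minimal sketch of the final step: turning a model's JSON reply into a typed record. It is not the article's code. The field names (`vendor`, `date`, `total`, `items`), the prompt wording and the `call_llm` callable are all assumptions made for illustration, and the model call is injected as a plain function so the sketch runs without any API key or third-party package.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ReceiptItem:
    name: str
    price: float


@dataclass
class Receipt:
    # Hypothetical schema: these field names are illustrative assumptions.
    vendor: Optional[str] = None
    date: Optional[str] = None
    total: Optional[float] = None
    items: List[ReceiptItem] = field(default_factory=list)


# Hypothetical prompt; the wording used in the article may differ.
PROMPT_TEMPLATE = (
    "Extract the vendor, date, total and line items from this receipt text. "
    "Reply with JSON only, using the keys vendor, date, total, items "
    "(each item has name and price).\n\nReceipt text:\n{text}"
)


def parse_receipt_json(raw: str) -> Receipt:
    """Convert the model's JSON reply into a Receipt, coercing numeric fields."""
    data = json.loads(raw)
    items = [
        ReceiptItem(name=str(item["name"]), price=float(item["price"]))
        for item in data.get("items", [])
    ]
    total = data.get("total")
    return Receipt(
        vendor=data.get("vendor"),
        date=data.get("date"),
        total=float(total) if total is not None else None,
        items=items,
    )


def extract_receipt(ocr_text: str, call_llm: Callable[[str], str]) -> Receipt:
    """Build the prompt from OCR text, call the injected model, parse the reply."""
    prompt = PROMPT_TEMPLATE.format(text=ocr_text)
    return parse_receipt_json(call_llm(prompt))
```

In a real pipeline, `ocr_text` could come from something like `pytesseract.image_to_string(Image.open(path))`, and `call_llm` could wrap a chat model. Keeping both outside `extract_receipt` means the parsing logic can be tested with a stub that returns canned JSON.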

Paper receipts come …

information extraction langchain llm openai python
