Nov. 27, 2023, 12:42 p.m. | Andrej Baranovskij


I explain how to implement a pipeline that processes invoice data from PDF documents. The data is loaded into a Chroma vector store; through the LangChain API, it can then be retrieved from the store and consumed by an LLM as part of a RAG setup.

GitHub repo:
https://github.com/katanaml/llm-ollama-invoice-cpu
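At its core, the ingestion step splits the extracted PDF text into overlapping chunks before embedding them into Chroma. A minimal sketch of that chunking logic in plain Python (the function name `chunk_text` and its parameters are my own illustration, not the repo's API — the repo itself relies on LangChain's text splitters):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into fixed-size overlapping chunks.

    Toy stand-in for LangChain's text splitters: overlap keeps context
    that straddles a chunk boundary available in both neighbors.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

Each chunk would then be embedded and written to the vector store; the overlap parameter is the usual knob for trading storage against retrieval quality.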

0:00 Intro
1:19 Libs
1:54 Ingest data with ChromaDB
6:17 Main script
6:59 Pipeline with LangChain
9:00 Testing and Summary
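The retrieval side of the pipeline (the "Pipeline with LangChain" chapter) amounts to: embed the question, rank stored chunks by similarity, and place the top matches into the LLM prompt. A self-contained sketch of that flow — `embed`, `retrieve`, and the bag-of-words scoring are toy stand-ins I made up to illustrate the idea, not the Chroma/LangChain machinery the video actually uses:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; real pipelines use a learned embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank stored chunks by similarity to the query; return the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    # Stuff retrieved chunks into the prompt so the LLM answers from the data.
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

In the real pipeline the vector store does the similarity search and a LangChain chain does the prompt assembly, but the shape of the computation is the same.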

CONNECT:
- Subscribe to this YouTube channel
- Twitter: https://twitter.com/andrejusb
- LinkedIn: …

