Nov. 27, 2023, 12:42 p.m. | Andrej Baranovskij

I explain how to implement a pipeline that processes invoice data from PDF documents. The data is loaded into a ChromaDB vector store. Through the LangChain API, the stored data can then be retrieved and consumed by an LLM as part of a RAG (retrieval-augmented generation) infrastructure.

GitHub repo:

0:00 Intro
1:19 Libs
1:54 Ingest data with ChromaDB
6:17 Main script
6:59 Pipeline with LangChain
9:00 Testing and Summary
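
The ingest-then-retrieve flow described above can be sketched roughly as follows. This is not the code from the video or the repo: it is a minimal, standard-library-only illustration in which a toy in-memory store stands in for ChromaDB, a bag-of-words counter stands in for a real embedding model, and the prompt builder stands in for the LangChain step. All names (`ToyVectorStore`, `embed`, `build_prompt`) and the sample invoice text are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """In-memory stand-in for a vector store such as ChromaDB (assumption, not its API)."""

    def __init__(self):
        self._items = []  # list of (chunk_text, embedding) pairs

    def add(self, chunks):
        """Ingest step: embed each text chunk and keep it alongside its vector."""
        for chunk in chunks:
            self._items.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Retrieval step: return the k chunks most similar to the query."""
        query_vec = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(query_vec, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to an LLM in a RAG setup."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical text chunks, as if already extracted from an invoice PDF.
invoice_chunks = [
    "Invoice number 1001 issued to Example Corp",
    "Total amount due 250.00 EUR",
    "Payment terms net 30 days",
]

store = ToyVectorStore()
store.add(invoice_chunks)

question = "What is the total amount due?"
top_chunks = store.search(question, k=1)
prompt = build_prompt(question, top_chunks)
print(prompt)
```

In a real pipeline, the PDF text extraction, the embedding model, the vector store, and the LLM call would each be supplied by the actual libraries; the sketch only shows how the stages hand data to one another.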
