Nov. 27, 2023, 12:42 p.m. | Andrej Baranovskij

I explain how to implement a pipeline that processes invoice data from PDF documents. The data is loaded into a Chroma DB vector store. Through the LangChain API, the data in the vector store is then available to an LLM as part of the RAG infrastructure.

GitHub repo:

0:00 Intro
1:19 Libs
1:54 Ingest data with ChromaDB
6:17 Main script
6:59 Pipeline with LangChain
9:00 Testing and Summary
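
The ingest-then-retrieve flow described above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code from the video: `ToyVectorStore` is a hypothetical stand-in for the Chroma DB vector store, the bag-of-words `embed` function replaces a real embedding model, `build_prompt` stands in for the LangChain retrieval step, and the sample invoice lines are invented for the example.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """Hypothetical in-memory stand-in for a vector store such as Chroma DB."""

    def __init__(self):
        self._items = []  # list of (chunk_text, vector) pairs

    def add(self, chunks):
        # Ingest step: embed each chunk and keep it alongside its text.
        for chunk in chunks:
            self._items.append((chunk, embed(chunk)))

    def query(self, question, k=2):
        # Retrieval step: rank stored chunks by similarity to the question.
        q = embed(question)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble the text an LLM would receive in a RAG setup."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Invented sample text, standing in for text extracted from an invoice PDF.
invoice_chunks = [
    "Invoice number 1001 issued to Acme Ltd",
    "Total amount due is 250 EUR",
    "Payment terms are 30 days from issue date",
]

store = ToyVectorStore()
store.add(invoice_chunks)

question = "What is the total amount due?"
top = store.query(question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real pipeline the PDF text extraction, the embedding model, and the vector store would come from the libraries covered in the video; the sketch only shows how the ingest and retrieval stages fit together.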

