April 15, 2024, 11:52 a.m. | Andrej Baranovskij

Andrej Baranovskij www.youtube.com

The unstructured library pre-processes PDF document content into a cleaner format, which helps the LLM produce a more accurate response. The JSON response is generated by the Nous Hermes 2 Pro LLM without any additional post-processing. A dynamic Pydantic class validates the response to make sure it matches the request.
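
The validation step can be illustrated with a minimal sketch. It is not taken from the Sparrow code base: the helper names `build_response_model` and `validate_llm_json` and the sample field names are hypothetical. It assumes only that the requested fields are known at runtime and that the LLM returns a JSON object as text. Pydantic's `create_model` builds the class dynamically.

```python
import json
from typing import Dict, Optional, Type

from pydantic import BaseModel, ValidationError, create_model


def build_response_model(fields: Dict[str, type]) -> Type[BaseModel]:
    """Build a Pydantic model at runtime from a field-name -> type mapping.

    Hypothetical helper for illustration; not part of Sparrow.
    """
    # A (type, ...) tuple tells create_model that the field is required.
    definitions = {name: (ftype, ...) for name, ftype in fields.items()}
    return create_model("DynamicResponse", **definitions)


def validate_llm_json(raw: str, model: Type[BaseModel]) -> Optional[BaseModel]:
    """Return a validated model instance, or None if the text does not match."""
    try:
        # Parse the raw LLM output, then let Pydantic check names and types.
        return model(**json.loads(raw))
    except (json.JSONDecodeError, ValidationError, TypeError):
        # Invalid JSON, missing or mistyped fields, or a non-object payload.
        return None


if __name__ == "__main__":
    # Assumed example request: two fields to extract from an invoice.
    Invoice = build_response_model({"invoice_number": str, "total": float})
    reply = '{"invoice_number": "INV-1", "total": 12.5}'
    print(validate_llm_json(reply, Invoice))
```

A response that omits a requested field, or that is not valid JSON, yields `None` here, so the caller can decide whether to retry the request or report the failure.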

Sparrow GitHub repo:
https://github.com/katanaml/sparrow

0:00 Intro
0:44 Example
2:35 Code - Requirements
3:06 Code - Config
4:26 Code - HTML text
5:10 Code - Agent
8:03 Summary


