April 20, 2024, 5:54 p.m. | /u/Ghulam_Nabi

Machine Learning www.reddit.com

https://preview.redd.it/q93ittqnaovc1.png?width=777&format=png&auto=webp&s=aec5c1767690bd3269ba9e601623e4d85378fd37

This is an image I captured from the PDF file. Note that the PDF text is selectable: I can select all the text in the headings and in the tables as well. I have tried a couple of techniques:



1: Used the MultiModalVectorStoreIndex with llama-index-multi-modal-llms-openai (GPT-4 API), first converting the PDF into images using OCR and then retrieving the tables from the PDF, but one issue is that I need to define the number of pages from …

