April 27, 2023, 3:55 p.m. | /u/madredditscientist

Data Science www.reddit.com

I got frustrated with the time and effort required to code and maintain custom web scrapers, so me and my friends built a generic LLM-based solution for data extraction from websites. AI should automate tedious and un-creative work, and web scraping definitely fits this description.

We're leveraging LLMs to semantically understand websites and generate the DOM selectors for it. Using GPT for every data extraction, as most comparable tools do, would be way too expensive and very slow, but using …

automate code creative data data extraction datascience dom extraction gpt llm llms scraping solution tool tools web web scraping website websites work

Founding AI Engineer, Agents

@ Occam AI | New York

AI Engineer Intern, Agents

@ Occam AI | US

AI Research Scientist

@ Vara | Berlin, Germany and Remote

Data Architect

@ University of Texas at Austin | Austin, TX

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

Alternance DATA/AI Engineer (H/F)

@ SQLI | Le Grand-Quevilly, France