March 12, 2024, 8:19 a.m. | /u/Aggravating-Floor-38

Natural Language Processing www.reddit.com

I'm working on a RAG system that doesn't have a pre-built document corpus and instead scrapes the internet for information in real time. It seemed like a simple task, but I'm having trouble with the web-scraping part. I'm new to scraping, so I'd like to get a sense of the difficulty: is it fairly easy to scrape Google search, e.g. the top 5 links for each of 10 different search queries? I feel …
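For reference, here is a minimal sketch of the fan-out the post describes: several search queries, the top few links from each, merged into one de-duplicated list to fetch for retrieval. It does not scrape Google itself. `search_fn` is a hypothetical placeholder for whatever backend returns ranked URLs for a query (a search API client, a scraper, and so on), and `fake_search` and the `example.com` URLs are stand-ins so the snippet runs offline.

```python
from itertools import islice
from typing import Callable, Iterable, List


def collect_top_links(
    queries: Iterable[str],
    search_fn: Callable[[str], Iterable[str]],
    k: int = 5,
) -> List[str]:
    """Return the top-k URLs per query, de-duplicated, in first-seen order.

    search_fn is an assumption: any callable that maps a query string to
    an iterable of result URLs, best match first.
    """
    seen = set()
    links = []
    for query in queries:
        # Keep only the first k results for this query.
        for url in islice(search_fn(query), k):
            if url not in seen:
                seen.add(url)
                links.append(url)
    return links


def fake_search(query: str) -> List[str]:
    """Offline stand-in for a real search backend (hypothetical)."""
    slug = query.replace(" ", "-")
    # One URL shared by every query, followed by query-specific results.
    return ["https://example.com/shared"] + [
        f"https://example.com/{slug}/{i}" for i in range(8)
    ]


if __name__ == "__main__":
    queries = [f"topic {n}" for n in range(10)]
    for url in collect_top_links(queries, fake_search, k=5):
        print(url)
```

Keeping the search backend behind a plain callable means the rest of the pipeline (fetching pages, chunking, embedding) does not change if the source of links does.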
