Internet Based RAG - Scraping | allainews.com

March 12, 2024, 8:19 a.m. | /u/Aggravating-Floor-38

Natural Language Processing www.reddit.com

I'm working on a RAG system that has no pre-built document corpus and instead scrapes the internet for information in real time. It seemed like a simple task, but I'm having trouble with the web-scraping part. I'm new to scraping, so I'd like a sense of how hard this is: is it easy to scrape Google search results, for example the top 5 links for each of 10 different search queries? I feel …
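To make the "top 5 links for each of 10 queries" step concrete, here is a minimal Python sketch of the link-selection part only, using the standard library. It assumes you already have the HTML of a results page as a string. The names `LinkCollector`, `top_links`, `collect_sources`, and the `fetch_results_page` callable are hypothetical, invented for this illustration. How the page is actually fetched (an official search API, a headless browser, or plain HTTP) is left to the caller, since real result pages differ in markup and may restrict automated access.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects absolute http(s) hrefs from <a> tags, in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Keep only absolute web links; relative and javascript: links are skipped.
        if href and urlparse(href).scheme in ("http", "https"):
            self.links.append(href)


def top_links(html, k=5, skip_hosts=()):
    """Return the first k distinct links whose host is not in skip_hosts."""
    parser = LinkCollector()
    parser.feed(html)
    seen, out = set(), []
    for url in parser.links:
        if len(out) >= k:
            break
        if urlparse(url).netloc in skip_hosts or url in seen:
            continue
        seen.add(url)
        out.append(url)
    return out


def collect_sources(queries, fetch_results_page, k=5):
    """Map each query to its top-k links.

    fetch_results_page is a caller-supplied function (an assumption of this
    sketch) that takes a query string and returns the results page as HTML.
    """
    return {q: top_links(fetch_results_page(q), k=k) for q in queries}
```

With 10 queries and `k=5`, `collect_sources` yields at most 50 URLs to download and chunk for retrieval. Passing the search engine's own hostnames in `skip_hosts` filters out its internal navigation links.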


