March 7, 2024, 6:35 p.m. | Kaggle


About this project: The Yoruba-RAG project aims to improve the performance of large language models, such as GPT-3, on questions posed in low-resource languages like Yoruba. The project scrapes a Yoruba blog with Beautiful Soup, stores the data in a text file, and splits it into smaller chunks. To process Yoruba text effectively, the Language-agnostic BERT Sentence Embedding (LaBSE) model is used to embed the chunks, and the results are stored in a Chroma database. This enriched database significantly improves GPT's ability …
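The pipeline described above (scrape → chunk → embed with LaBSE → store in Chroma) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the blog URL, chunk sizes, and collection name are hypothetical, and the `build_index` function assumes the `requests`, `beautifulsoup4`, `sentence-transformers`, and `chromadb` packages are installed.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into overlapping character-based chunks for embedding.

    Overlap helps keep sentences that straddle a chunk boundary
    retrievable from both neighboring chunks.
    """
    chunks = []
    start = 0
    step = chunk_size - overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks


def build_index(url: str) -> None:
    """Hedged sketch of the full pipeline; names and parameters are assumptions."""
    # Third-party imports kept local so the chunker above stays self-contained.
    import requests
    from bs4 import BeautifulSoup
    from sentence_transformers import SentenceTransformer
    import chromadb

    # 1. Scrape the blog page and strip the HTML down to plain text.
    html = requests.get(url).text
    text = BeautifulSoup(html, "html.parser").get_text(separator=" ", strip=True)

    # 2. Split the text into overlapping chunks.
    chunks = chunk_text(text)

    # 3. Embed each chunk with the multilingual LaBSE model.
    model = SentenceTransformer("sentence-transformers/LaBSE")
    embeddings = model.encode(chunks)

    # 4. Store chunks and embeddings in a Chroma collection for retrieval.
    client = chromadb.Client()
    collection = client.create_collection(name="yoruba_blog")  # hypothetical name
    collection.add(
        documents=chunks,
        embeddings=[e.tolist() for e in embeddings],
        ids=[f"chunk-{i}" for i in range(len(chunks))],
    )
```

At query time, the user's question would be embedded with the same LaBSE model and the nearest chunks retrieved from Chroma and passed to GPT as context.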

