Jan. 1, 2024, 6:21 p.m. | /u/RuairiSpain

Natural Language Processing www.reddit.com

Hi

I work for Wikimedia/Wikipedia. I'm looking for suggestions and recommendations (and what to avoid) in ML for an NLP project.

Task:

Make Wikipedia search easier and add better structured data for Wikipedia.

Phase 1 - Build a simple multi-class classification system

The initial idea is to build a hybrid of ML and heuristic rules: define a set of 30-80 top-level topic classes, then tag each article with a set of class labels (with confidence scores).
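
To make the Phase 1 idea concrete, here is a minimal sketch of one way to blend model probabilities with keyword rules and emit a set of labels with confidence scores. Everything in it is an assumption for illustration: the `RULES` table, the topic names, the `rule_weight` and `threshold` values, and the `stub_model` function, which stands in for whatever trained classifier is actually used. Since each article gets a set of labels, this is treated as a multi-label problem (each class is scored independently) rather than picking a single winner.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical keyword rules: topic -> trigger words (illustrative only).
RULES: Dict[str, List[str]] = {
    "Biology": ["species", "genus", "habitat"],
    "Sports": ["league", "championship", "goalkeeper"],
    "Geography": ["river", "municipality", "population"],
}


def rule_scores(text: str) -> Dict[str, float]:
    """Score each topic by the fraction of its trigger words found in the text."""
    # Naive whitespace tokenization; real text would need punctuation handling.
    words = set(text.lower().split())
    return {
        topic: sum(kw in words for kw in keywords) / len(keywords)
        for topic, keywords in RULES.items()
    }


def tag_article(
    text: str,
    model: Callable[[str], Dict[str, float]],
    rule_weight: float = 0.4,
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Blend model and rule scores; return (label, confidence) pairs above threshold."""
    ml = model(text)
    rules = rule_scores(text)
    combined = {
        topic: (1 - rule_weight) * ml.get(topic, 0.0)
        + rule_weight * rules.get(topic, 0.0)
        for topic in set(ml) | set(rules)
    }
    labels = [(t, round(s, 3)) for t, s in combined.items() if s >= threshold]
    # Highest confidence first.
    return sorted(labels, key=lambda pair: pair[1], reverse=True)


def stub_model(text: str) -> Dict[str, float]:
    """Stand-in for a trained classifier's per-class probabilities (made-up values)."""
    return {"Biology": 0.8, "Geography": 0.5, "Sports": 0.05}


if __name__ == "__main__":
    print(tag_article("The species lives in a river habitat", stub_model))
```

Lowering `threshold` yields more labels per article at the cost of precision, and `rule_weight` controls how far a rule hit can move a label the model is unsure about. Both would need tuning against labeled articles.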

Phase 2 - Add …
