Jan. 20, 2022, 2:10 a.m. | Hang Jiang, Yining Hua, Doug Beeferman, Deb Roy

cs.CL updates on arXiv.org

Social media data such as Twitter messages ("tweets") pose a particular
challenge to NLP systems because of their short, noisy, and colloquial nature.
Tasks such as Named Entity Recognition (NER) and syntactic parsing require
highly domain-matched training data for good performance. While there are some
publicly available annotated datasets of tweets, they are all purpose-built for
solving one task at a time. As yet, there is no complete training corpus for
both syntactic analysis (e.g., part-of-speech tagging, dependency …
