Comparing Performance of Different Linguistically-Backed Word Embeddings for Cyberbullying Detection. (arXiv:2206.01950v1 [cs.CL])
June 7, 2022, 1:12 a.m. | Juuso Eronen, Michal Ptaszynski, Fumito Masui
cs.CL updates on arXiv.org
In most cases, word embeddings are learned only from raw tokens or, in some
cases, lemmas. This includes pre-trained language models such as BERT. To
investigate the potential of capturing deeper relations between lexical
items and structures, and to filter out redundant information, we propose to
preserve morphological, syntactic, and other types of linguistic information
by combining them with the raw tokens or lemmas. This means, for example,
including part-of-speech or dependency information in the lexical features
used. The …
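
The idea of combining tokens with linguistic annotations can be illustrated with a minimal sketch. The abstract is truncated here and does not specify how the combination is encoded, so everything below is an assumption for illustration: the helper name `combine_features`, the underscore separator, and the hand-written Universal Dependencies style tags are hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def combine_features(
    tokens: Sequence[str], *annotations: Sequence[str], sep: str = "_"
) -> List[str]:
    """Join each token with its aligned annotations into one composite feature.

    Hypothetical helper: the separator and the feature format are assumptions,
    not the encoding used by the paper's authors.
    """
    for layer in annotations:
        if len(layer) != len(tokens):
            raise ValueError("each annotation layer must align one-to-one with the tokens")
    # zip pairs the i-th token with the i-th tag of every annotation layer
    return [sep.join(parts) for parts in zip(tokens, *annotations)]


# Toy sentence annotated by hand (not produced by a tagger or parser)
tokens = ["nobody", "likes", "you"]
pos_tags = ["PRON", "VERB", "PRON"]
dep_labels = ["nsubj", "root", "obj"]

token_pos = combine_features(tokens, pos_tags)
token_pos_dep = combine_features(tokens, pos_tags, dep_labels)

print(token_pos)      # ['nobody_PRON', 'likes_VERB', 'you_PRON']
print(token_pos_dep)  # ['nobody_PRON_nsubj', 'likes_VERB_root', 'you_PRON_obj']
```

Composite strings like these could then be passed to an embedding trainer in place of raw tokens, so that the same word form receives a separate vector for each grammatical role. In practice the tags would come from a tagger or dependency parser; they are typed in by hand here to keep the sketch self-contained.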