all AI news
Korean Online Hate Speech Dataset for Multilabel Classification: How Can Social Science Aid Developing Better Hate Speech Dataset?. (arXiv:2204.03262v1 [cs.CL])
April 8, 2022, 1:11 a.m. | TaeYoung Kang, Eunrang Kwon, Junbum Lee, Youngeun Nam, Junmo Song, JeongKyu Suh
cs.CL updates on arXiv.org arxiv.org
We suggest a multilabel Korean online hate speech dataset that covers seven
categories of hate speech: (1) Race and Nationality, (2) Religion, (3)
Regionalism, (4) Ageism, (5) Misogyny, (6) Sexual Minorities, and (7) Male. Our
35K dataset consists of 24K online comments with Krippendorff's Alpha label
accordance of .713, 2.2K neutral sentences from Wikipedia, 1.7K additionally
labeled sentences generated by the Human-in-the-Loop procedure and
rule-generated 7.1K neutral sentences. The base model with 24K initial dataset
achieved the accuracy of LRAP …
arxiv classification dataset hate speech science social social science speech
More from arxiv.org / cs.CL updates on arXiv.org
Jobs in AI, ML, Big Data
Senior Machine Learning Engineer
@ GPTZero | Toronto, Canada
ML/AI Engineer / NLP Expert - Custom LLM Development (x/f/m)
@ HelloBetter | Remote
Doctoral Researcher (m/f/div) in Automated Processing of Bioimages
@ Leibniz Institute for Natural Product Research and Infection Biology (Leibniz-HKI) | Jena
Seeking Developers and Engineers for AI T-Shirt Generator Project
@ Chevon Hicks | Remote
Senior Machine Learning Engineer
@ BlackStone eIT | Egypt - Remote
Machine Learning Engineer - 2
@ Parspec | Bengaluru, India