Clustering Vietnamese Conversations From Facebook Page To Build Training Dataset For Chatbot. (arXiv:2112.15338v2 [cs.CL] UPDATED)
Jan. 4, 2022, 9:10 p.m. | Trieu Hai Nguyen, Thi-Kim-Ngoan Pham, Thi-Hong-Minh Bui, Thanh-Quynh-Chau Nguyen
cs.CL updates on arXiv.org arxiv.org
The biggest challenge in building chatbots is obtaining training data. The
data must be realistic and large enough to train a chatbot. We create a tool to
collect real training data from the Facebook Messenger conversations of a
Facebook page. After text preprocessing, the collected data yields the FVnC and
Sample datasets. We use PhoBERT, a BERT model retrained for Vietnamese, to
extract features from the text. The K-Means and DBSCAN clustering algorithms
are then applied to the output embeddings …
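
The clustering step named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give parameters or tooling, the 2-D toy vectors from the hypothetical `toy_embeddings` helper merely stand in for real PhoBERT sentence embeddings, and the plain-Python `kmeans` and `dbscan` functions are simplified textbook versions, not the authors' implementation.

```python
import math
import random


def toy_embeddings(seed=0):
    """Hypothetical stand-in for PhoBERT embeddings: two tight groups plus one outlier."""
    rng = random.Random(seed)
    group_a = [[rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(10)]
    group_b = [[rng.gauss(5, 0.1), rng.gauss(5, 0.1)] for _ in range(10)]
    return group_a + group_b + [[20.0, -20.0]]


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: returns one cluster label (0..k-1) per point."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def dbscan(points, eps, min_samples):
    """Minimal DBSCAN: returns cluster labels, with -1 marking noise."""
    n = len(points)
    # Neighborhoods include the point itself.
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels


if __name__ == "__main__":
    embeddings = toy_embeddings()
    print("K-Means:", kmeans(embeddings[:20], k=2))
    print("DBSCAN: ", dbscan(embeddings, eps=1.0, min_samples=3))
```

The two algorithms differ in what they need up front: K-Means requires the number of clusters `k`, while DBSCAN takes a radius `eps` and a density threshold `min_samples` and can label sparse points as noise. On real embeddings one would more likely use a library implementation such as scikit-learn's `KMeans` and `DBSCAN` rather than these hand-written versions.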