May 22, 2024, 4:47 a.m. | Hanlei Zhang, Hua Xu, Fei Long, Xin Wang, Kai Gao

cs.CL updates on arXiv.org arxiv.org

arXiv:2405.12775v1 Announce Type: cross
Abstract: Discovering the semantics of multimodal utterances is essential for understanding human language and enhancing human-machine interactions. Existing methods manifest limitations in leveraging nonverbal information for discerning complex semantics in unsupervised scenarios. This paper introduces a novel unsupervised multimodal clustering method (UMC), making a pioneering contribution to this field. UMC introduces a unique approach to constructing augmentation views for multimodal data, which are then used to perform pre-training to establish well-initialized representations for subsequent clustering. An …

arxiv clustering cs.ai cs.cl cs.mm discovery multimodal semantics type unsupervised

Senior Data Engineer

@ Displate | Warsaw

Senior Principal Software Engineer

@ Oracle | Columbia, MD, United States

Software Engineer for Manta Systems

@ PXGEO | Linköping, Östergötland County, Sweden

DevOps Engineer

@ Teradyne | Odense, DK

LIDAR System Engineer Trainee

@ Valeo | PRAGUE - PRA2

Business Applications Administrator

@ Allegro | Poznań, Poland