May 22, 2024, 4:47 a.m. | Hanlei Zhang, Hua Xu, Fei Long, Xin Wang, Kai Gao

cs.CL updates on arXiv.org arxiv.org

arXiv:2405.12775v1 Announce Type: cross
Abstract: Discovering the semantics of multimodal utterances is essential for understanding human language and enhancing human-machine interactions. Existing methods manifest limitations in leveraging nonverbal information for discerning complex semantics in unsupervised scenarios. This paper introduces a novel unsupervised multimodal clustering method (UMC), making a pioneering contribution to this field. UMC introduces a unique approach to constructing augmentation views for multimodal data, which are then used to perform pre-training to establish well-initialized representations for subsequent clustering. An …

arxiv clustering cs.ai cs.cl cs.mm discovery multimodal semantics type unsupervised

Senior Data Engineer

@ Displate | Warsaw

Analyst, Data Analytics

@ T. Rowe Price | Owings Mills, MD - Building 4

Regulatory Data Analyst

@ Federal Reserve System | San Francisco, CA

Sr. Data Analyst

@ Bank of America | Charlotte

Data Analyst- Tech Refresh

@ CACI International Inc | 1J5 WASHINGTON DC (BOLLING AFB)

Senior AML/CFT & Data Analyst

@ Ocorian | Ebène, Mauritius