Web: http://arxiv.org/abs/2206.07981

June 17, 2022, 1:13 a.m. | Lianyang Ma, Yu Yao, Tao Liang, Tongliang Liu

cs.CV updates on arXiv.org arxiv.org

Multimodal sentiment analysis in videos is a key task in many real-world
applications and usually requires integrating multimodal streams, including
visual, verbal and acoustic behaviors. To improve the robustness of multimodal
fusion, some existing methods let different modalities communicate with
each other and model the crossmodal interaction via transformers. However,
these methods use only single-scale representations during the interaction
and fail to exploit multi-scale representations that contain different levels
of semantic information. As a result, the representations …
