Egocentric Deep Multi-Channel Audio-Visual Active Speaker Localization. (arXiv:2201.01928v1 [cs.CV])
Jan. 7, 2022, 2:10 a.m. | Hao Jiang, Calvin Murdock, Vamsi Krishna Ithapu
cs.CV updates on arXiv.org arxiv.org
Augmented reality devices have the potential to enhance human perception and
enable other assistive functionalities in complex conversational environments.
Effectively capturing the audio-visual context necessary for understanding
these social interactions first requires detecting and localizing the voice
activities of the device wearer and the surrounding people. These tasks are
challenging due to their egocentric nature: the wearer's head motion may cause
motion blur, surrounding people may appear in difficult viewing angles, and
there may be occlusions, visual clutter, audio noise, …