Multimodal active speaker detection and virtual cinematography for video conferencing. (arXiv:2002.03977v3 [eess.AS] UPDATED)
May 26, 2022, 1:11 a.m. | Ross Cutler, Ramin Mehran, Sam Johnson, Cha Zhang, Adam Kirk, Oliver Whyte, Adarsh Kowdle
stat.ML updates on arXiv.org arxiv.org
Active speaker detection (ASD) and virtual cinematography (VC) can
significantly improve the remote user experience of a video conference by
automatically panning, tilting, and zooming a video conferencing camera:
users subjectively rate an expert video cinematographer's video significantly
higher than unedited video. We describe a new automated ASD and VC system that
performs within 0.3 MOS (mean opinion score) of an expert cinematographer,
based on subjective ratings on a 1-5 scale. This system uses a 4K wide-FOV camera, a depth
camera, and a …