all AI news
REWIND Dataset: Privacy-preserving Speaking Status Segmentation from Multimodal Body Movement Signals in the Wild
March 5, 2024, 2:43 p.m. | Jose Vargas Quiros, Chirag Raman, Stephanie Tan, Ekin Gedik, Laura Cabrera-Quiros, Hayley Hung
cs.LG updates on arXiv.org arxiv.org
Abstract: Recognizing speaking in humans is a central task towards understanding social interactions. Ideally, speaking would be detected from individual voice recordings, as done previously for meeting scenarios. However, individual voice recordings are hard to obtain in the wild, especially in crowded mingling scenarios due to cost, logistics, and privacy concerns. As an alternative, machine learning models trained on video and wearable sensor data make it possible to recognize speech by detecting its related gestures in …
abstract arxiv cs.ai cs.cv cs.lg dataset eess.sp humans interactions multimodal privacy segmentation social speaking type understanding voice
More from arxiv.org / cs.LG updates on arXiv.org
Jobs in AI, ML, Big Data
Lead Developer (AI)
@ Cere Network | San Francisco, US
Research Engineer
@ Allora Labs | Remote
Ecosystem Manager
@ Allora Labs | Remote
Founding AI Engineer, Agents
@ Occam AI | New York
AI Engineer Intern, Agents
@ Occam AI | US
AI Research Scientist
@ Vara | Berlin, Germany and Remote