Driver Activity Classification Using Generalizable Representations from Vision-Language Models
April 24, 2024, 4:42 a.m. | Ross Greer, Mathias Viborg Andersen, Andreas Møgelmose, Mohan Trivedi
cs.LG updates on arXiv.org
Abstract: Driver activity classification is crucial for ensuring road safety, with applications ranging from driver assistance systems to autonomous vehicle control transitions. In this paper, we present a novel approach leveraging generalizable representations from vision-language models for driver activity classification. Our method employs a Semantic Representation Late Fusion Neural Network (SRLF-Net) to process synchronized video frames from multiple perspectives. Each frame is encoded using a pretrained vision-language encoder, and the resulting embeddings are fused to generate …
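The late-fusion idea in the abstract can be illustrated with a minimal sketch. This is not the authors' SRLF-Net: the abstract is truncated and does not say how the embeddings are fused or what the network produces, so the concatenation step, the linear softmax head, the dimensions, and the `fake_encode` stand-in for a pretrained vision-language encoder are all assumptions made for illustration.

```python
import math
import random

NUM_VIEWS = 3      # assumed number of synchronized camera views
EMBED_DIM = 8      # assumed embedding size (real encoders use far larger vectors)
NUM_CLASSES = 4    # assumed number of driver activity classes


def fake_encode(frame_id: int, dim: int = EMBED_DIM) -> list[float]:
    """Hypothetical stand-in for a pretrained vision-language image encoder.

    Returns a deterministic, unit-length pseudo-embedding for a frame id.
    """
    rng = random.Random(frame_id)
    vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def late_fuse(embeddings: list[list[float]]) -> list[float]:
    """Fuse per-view embeddings by concatenation (an assumed fusion rule)."""
    return [value for emb in embeddings for value in emb]


def softmax(logits: list[float]) -> list[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused: list[float], weights: list[list[float]], bias: list[float]) -> list[float]:
    """Apply a single linear layer followed by softmax to the fused vector."""
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Untrained, randomly initialized classifier head, for shape illustration only.
rng = random.Random(0)
weights = [
    [rng.uniform(-0.1, 0.1) for _ in range(NUM_VIEWS * EMBED_DIM)]
    for _ in range(NUM_CLASSES)
]
bias = [0.0] * NUM_CLASSES

# One synchronized time step: one frame id per camera view (ids are made up).
view_frames = [101, 202, 303]
embeddings = [fake_encode(frame_id) for frame_id in view_frames]
fused = late_fuse(embeddings)
probs = classify(fused, weights, bias)
```

In this sketch each view is encoded independently and the views are combined only after encoding, which is what distinguishes late fusion from combining raw frames before the encoder.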