ChimpVLM: Ethogram-Enhanced Chimpanzee Behaviour Recognition
April 16, 2024, 4:43 a.m. | Otto Brookes, Majid Mirmehdi, Hjalmar Kuhl, Tilo Burghardt
Source: cs.LG updates on arXiv.org
Abstract: We show that chimpanzee behaviour understanding from camera traps can be enhanced by providing visual architectures with access to an embedding of text descriptions that detail species behaviours. In particular, we present a vision-language model which employs multi-modal decoding of visual features extracted directly from camera trap videos to process query tokens representing behaviours and output class predictions. Query tokens are initialised using a standardised ethogram of chimpanzee behaviour, rather than using random or name-based …
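The idea in the abstract, behaviour query tokens that are initialised from text embeddings and then decoded against visual features, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the single-head cross-attention, the linear output head `w_out`, the function name `decode_behaviours`, the behaviour names, and the random vectors standing in for ethogram text embeddings and video features are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def decode_behaviours(visual_feats, query_tokens, w_out):
    """Hypothetical single-head cross-attention decoder.

    visual_feats: (T, D) features for T video tokens.
    query_tokens: (C, D) one query per behaviour class.
    w_out:        (D,)   linear head mapping each decoded query to a logit.
    Returns per-class logits (C,) and attention weights (C, T).
    """
    d = query_tokens.shape[1]
    # Each behaviour query attends over all visual tokens.
    attn = softmax(query_tokens @ visual_feats.T / np.sqrt(d), axis=-1)
    decoded = attn @ visual_feats   # (C, D) query-specific visual summary
    logits = decoded @ w_out        # (C,) one score per behaviour
    return logits, attn


rng = np.random.default_rng(0)
T, D = 16, 32

# Hypothetical behaviour names; the real ethogram is not reproduced here.
behaviours = ["climbing", "feeding", "travelling"]

# Stand-in for embeddings of ethogram text descriptions (assumption:
# in practice these would come from a text encoder, not random noise).
query_tokens = rng.normal(size=(len(behaviours), D))

# Stand-in for features extracted from a camera trap video.
visual_feats = rng.normal(size=(T, D))
w_out = rng.normal(size=D)

logits, attn = decode_behaviours(visual_feats, query_tokens, w_out)
for name, score in zip(behaviours, logits):
    print(f"{name}: {score:.3f}")
```

The only change between the initialisation strategies the abstract contrasts would be how `query_tokens` is filled: random values, embeddings of class names, or embeddings of fuller behaviour descriptions. The decoding step itself stays the same in this sketch.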