ChimpVLM: Ethogram-Enhanced Chimpanzee Behaviour Recognition
April 16, 2024, 4:43 a.m. | Otto Brookes, Majid Mirmehdi, Hjalmar Kuhl, Tilo Burghardt
cs.LG updates on arXiv.org
Abstract: We show that chimpanzee behaviour understanding from camera traps can be enhanced by providing visual architectures with access to an embedding of text descriptions that detail species behaviours. In particular, we present a vision-language model which employs multi-modal decoding of visual features extracted directly from camera trap videos to process query tokens representing behaviours and output class predictions. Query tokens are initialised using a standardised ethogram of chimpanzee behaviour, rather than using random or name-based …
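The query-token idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: `embed_text` is a toy hashed bag-of-words stand-in for a real text encoder, the three ethogram entries are invented placeholders rather than the standardised ethogram used in the paper, the visual features are made-up numbers, and the single cross-attention step with similarity scoring is a simplification of multi-modal decoding.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def embed_text(description, dim=8):
    """Hypothetical toy text encoder: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for word in description.lower().split():
        vec[sum(ord(c) for c in word) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def decode_queries(query_tokens, visual_features):
    """One cross-attention step: each behaviour query attends over the visual features."""
    dim = len(query_tokens[0])
    decoded = []
    for q in query_tokens:
        weights = softmax([dot(q, v) / math.sqrt(dim) for v in visual_features])
        decoded.append(
            [sum(w * v[i] for w, v in zip(weights, visual_features)) for i in range(dim)]
        )
    return decoded


def class_scores(query_tokens, visual_features):
    """Assumed scoring rule: similarity between each query and its attended features."""
    decoded = decode_queries(query_tokens, visual_features)
    return [dot(q, d) for q, d in zip(query_tokens, decoded)]


# Placeholder ethogram entries (invented for this sketch, not taken from the paper).
ethogram = {
    "feeding": "the chimpanzee puts food into its mouth and chews",
    "travelling": "the chimpanzee walks or runs across the scene",
    "resting": "the chimpanzee sits or lies still on the ground",
}

# Query tokens are initialised from the behaviour descriptions,
# rather than from random vectors or from the class names alone.
query_tokens = [embed_text(text) for text in ethogram.values()]

# Made-up visual features: 4 spatio-temporal tokens of dimension 8.
visual_features = [[0.1 * ((i + j) % 5) for j in range(8)] for i in range(4)]

probs = softmax(class_scores(query_tokens, visual_features))
for name, p in zip(ethogram, probs):
    print(f"{name}: {p:.3f}")
```

In a trained model the text encoder, the attention projections and the classification head would all be learned; the sketch only shows where the ethogram descriptions enter the pipeline.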