March 26, 2024, 4:46 a.m. | Lanxin Xu, Shuo Wang

cs.CV updates on arXiv.org

arXiv:2403.15693v1 Announce Type: new
Abstract: In this report, we introduce a novel self-supervised learning method for extracting latent embeddings from behaviors of larval zebrafish. Drawing inspiration from Masked Modeling techniquesutilized in image processing with Masked Autoencoders (MAE) \cite{he2022masked} and in natural language processing with Generative Pre-trained Transformer (GPT) \cite{radford2018improving}, we treat behavior sequences as a blend of images and language. For the skeletal sequences of swimming zebrafish, we propose a pioneering Transformer-CNN architecture, the Sequence Spatial-Temporal Transformer (SSTFormer), designed to …

