all AI news
TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking
April 11, 2024, 4:45 a.m. | Raghav Goyal, Wan-Cyuan Fan, Mennatullah Siam, Leonid Sigal
cs.CV updates on arXiv.org arxiv.org
Abstract: Video Object Segmentation (VOS) has emerged as an increasingly important problem with availability of larger datasets and more complex and realistic settings, which involve long videos with global motion (e.g, in egocentric settings), depicting small objects undergoing both rigid and non-rigid (including state) deformations. While a number of recent approaches have been explored for this task, these data characteristics still present challenges. In this work we propose a novel, clip-based DETR-style encoder-decoder architecture, which focuses …
abstract arxiv availability cs.cv datasets global object objects scale segmentation small state tracking transformation transformer type video videos
More from arxiv.org / cs.CV updates on arXiv.org
Jobs in AI, ML, Big Data
Data Architect
@ University of Texas at Austin | Austin, TX
Data ETL Engineer
@ University of Texas at Austin | Austin, TX
Lead GNSS Data Scientist
@ Lurra Systems | Melbourne
Senior Machine Learning Engineer (MLOps)
@ Promaton | Remote, Europe
Senior Principal, Product Strategy Operations, Cloud Data Analytics
@ Google | Sunnyvale, CA, USA; Austin, TX, USA
Data Scientist - HR BU
@ ServiceNow | Hyderabad, India