March 6, 2024, 5:45 a.m. | Yuxin Guo, Shijie Ma, Yuhao Zhao, Hu Su, Wei Zou

cs.CV updates on arXiv.org arxiv.org

arXiv:2403.03095v1 Announce Type: new
Abstract: Audio-Visual Source Localization (AVSL) is the task of identifying specific sounding objects in the scene given audio cues. In our work, we focus on semi-supervised AVSL with pseudo-labeling. To address the issues with vanilla hard pseudo-labels including bias accumulation, noise sensitivity, and instability, we propose a novel method named Cross Pseudo-Labeling (XPL), wherein two models learn from each other with the cross-refine mechanism to avoid bias accumulation. We equip XPL with two effective components. Firstly, …

abstract arxiv audio bias cs.cv cs.mm cs.sd eess.as focus labeling labels localization noise novel objects semi-supervised sensitivity type visual work

Software Engineer for AI Training Data (School Specific)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Python)

@ G2i Inc | Remote

Software Engineer for AI Training Data (Tier 2)

@ G2i Inc | Remote

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US