Aug. 11, 2023, 6:52 a.m. | Yuanhong Chen, Yuyuan Liu, Hu Wang, Fengbei Liu, Chong Wang, Gustavo Carneiro

cs.CV updates on arXiv.org arxiv.org

Audio-visual segmentation (AVS) is a complex task that involves accurately
segmenting the corresponding sounding object based on audio-visual queries.
Successful audio-visual learning requires two essential components: 1) an
unbiased dataset with high-quality pixel-level multi-class labels, and 2) a
model capable of effectively linking audio information with its corresponding
visual object. However, these two requirements are only partially addressed by
current methods, with training sets containing biased audio-visual data, and
models that generalise poorly beyond this biased training set. In this …

arxiv audio closer look components dataset information labels look pixel quality segmentation semantic unbiased

Data Engineer

@ Lemon.io | Remote: Europe, LATAM, Canada, UK, Asia, Oceania

Artificial Intelligence – Bioinformatic Expert

@ University of Texas Medical Branch | Galveston, TX

Lead Developer (AI)

@ Cere Network | San Francisco, US

Research Engineer

@ Allora Labs | Remote

Ecosystem Manager

@ Allora Labs | Remote

Founding AI Engineer, Agents

@ Occam AI | New York