Hear to Segment: Unmixing the Audio to Guide the Semantic Segmentation (arXiv:2305.07223v1 [cs.SD])

May 15, 2023, 12:47 a.m. | Yuhang Ling, Yuxi Li, Zhenye Gan, Jiangning Zhang, Mingmin Chi, Yabiao Wang

cs.CV updates on arXiv.org

In this paper, we focus on a recently proposed task called Audio-Visual
Segmentation (AVS), which requires establishing a fine-grained correspondence
between the audio stream and image pixels. However, learning such a
correspondence faces two key challenges: (1) audio signals inherently exhibit a
high degree of information density, as sounds produced by multiple objects are
entangled within the same audio stream; (2) the frequencies of audio signals from
objects of the same category tend to be similar, which hampers …
