Exploiting Transformation Invariance and Equivariance for Self-supervised Sound Localisation. (arXiv:2206.12772v2 [cs.CV] UPDATED) | allainews.com

Aug. 16, 2022, 1:13 a.m. | Jinxiang Liu, Chen Ju, Weidi Xie, Ya Zhang

cs.CV updates on arXiv.org arxiv.org

We present a simple yet effective self-supervised framework for audio-visual
representation learning, to localize the sound source in videos. To understand
what enables to learn useful representations, we systematically investigate the
effects of data augmentations, and reveal that (1) composition of data
augmentations plays a critical role, i.e. explicitly encouraging the
audio-visual representations to be invariant to various transformations~({\em
transformation invariance}); (2) enforcing geometric consistency substantially
improves the quality of learned representations, i.e. the detected sound source
should follow the …

arxiv cv sound transformation

More from arxiv.org / cs.CV updates on arXiv.org

Attention-Map Augmentation for Hypercomplex Breast Cancer Classification 18 hours ago | arxiv.org

arxiv attention augmentation cancer +5

Hidden Flaws Behind Expert-Level Accuracy of GPT-4 Vision in Medicine 18 hours ago | arxiv.org

abstract accuracy analysis arxiv +26

A Survey on Autonomous Driving Datasets: Statistics, Annotation Quality, and a Future Outlook 18 hours ago | arxiv.org

abstract advances algorithms annotation +20

Towards Effective Multi-Moving-Camera Tracking: A New Dataset and Lightweight Link Model 18 hours ago | arxiv.org

arxiv cs.cv dataset moving +2

Holodeck: Language Guided Generation of 3D Embodied AI Environments 18 hours ago | arxiv.org

abstract arxiv cs.ai cs.cl +12

Weakly Supervised 3D Object Detection via Multi-Level Visual Guidance 18 hours ago | arxiv.org

3d object 3d object detection arxiv cs.cv +6

Fine-tuning vision foundation model for crack segmentation in civil infrastructures 18 hours ago | arxiv.org

abstract adapter ai models arxiv +15

FG-MDM: Towards Zero-Shot Human Motion Generation via Fine-Grained Descriptions 18 hours ago | arxiv.org

abstract arxiv beyond cs.cv +16

X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model 18 hours ago | arxiv.org

abstract adapter arxiv control +20

Data Architect

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Data ETL Engineer

@ University of Texas at Austin | Austin, TX

View on ai-jobs.net

Lead GNSS Data Scientist

@ Lurra Systems | Melbourne

View on ai-jobs.net

Senior Machine Learning Engineer (MLOps)

@ Promaton | Remote, Europe

View on ai-jobs.net

Social Insights & Data Analyst (Freelance)

@ Media.Monks | Jakarta

View on ai-jobs.net

Cloud Data Engineer

@ Arkatechture | Portland, ME, USA

View on ai-jobs.net