June 7, 2024, 4:52 a.m. | Sreyan Ghosh, Sonal Kumar, Chandra Kiran Reddy Evuru, Ramani Duraiswami, Dinesh Manocha

cs.CL updates on arXiv.org

arXiv:2309.09836v2 Announce Type: replace-cross
Abstract: We present RECAP (REtrieval-Augmented Audio CAPtioning), a novel and effective audio captioning system that generates captions conditioned on an input audio clip and on captions, retrieved from a datastore, that are similar to that audio. Additionally, our proposed method can transfer to any domain without any additional fine-tuning. To generate a caption for an audio sample, we use the audio-text model CLAP to retrieve captions similar to it from a replaceable datastore, which are then …
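
The retrieval step the abstract describes (finding captions in a datastore that are similar to an input audio clip) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine` and `retrieve_captions` are hypothetical, and the three-dimensional vectors are made-up placeholders standing in for embeddings that a joint audio-text model such as CLAP would produce. It assumes only that audio and captions share one embedding space, so that similarity can be measured with cosine similarity.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_captions(audio_emb, datastore, k=2):
    """Return the k captions whose embeddings are closest to audio_emb.

    datastore is a list of (caption, embedding) pairs. Swapping in a
    different list changes the target domain without retraining anything.
    """
    ranked = sorted(
        datastore,
        key=lambda item: cosine(audio_emb, item[1]),
        reverse=True,
    )
    return [caption for caption, _ in ranked[:k]]


# Toy datastore: captions paired with placeholder embeddings (assumed values,
# not real model outputs).
datastore = [
    ("a dog barks in the distance", [0.9, 0.1, 0.0]),
    ("rain falls on a tin roof", [0.0, 0.2, 0.9]),
    ("a puppy yelps and whines", [0.8, 0.3, 0.1]),
]

# Placeholder embedding for the input audio clip.
audio_emb = [1.0, 0.2, 0.0]

retrieved = retrieve_captions(audio_emb, datastore, k=2)
print(retrieved)
```

In this toy setup the two dog-related captions rank above the rain caption. How the retrieved captions are then used to condition caption generation is cut off in the excerpt above, so the sketch stops at retrieval.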

