Feb. 2, 2024, 3:41 p.m. | Huan Liao, Haonan Han, Kai Yang, Tianjiao Du, Rui Yang, Zunnan Xu, Qinmei Xu, Jingquan Liu, Ji

cs.CL updates on arXiv.org

With the development of AI-Generated Content (AIGC), text-to-audio models are gaining widespread attention. However, it is challenging for these models to generate audio aligned with human preference, owing to the inherent information density of natural language and the models' limited understanding ability. To alleviate this issue, we propose BATON, a framework designed to enhance the alignment between generated audio and the text prompt using human preference feedback. BATON comprises three key stages: first, we curated a dataset containing both prompts …

