Aug. 29, 2022, 1:10 a.m. | Shrutina Agarwal, Sriram Ganapathy, Naoya Takahashi

cs.LG updates on arXiv.org

In this paper, we propose a model to perform style transfer of speech to
singing voice. In contrast to previous signal-processing-based methods, which
require high-quality singing templates or phoneme synchronization, we explore a
data-driven approach to the problem of converting natural speech to singing
voice. We develop a novel neural network architecture, called SymNet, which
models the alignment of the input speech with the target melody while
preserving the speaker identity and naturalness. The proposed SymNet model is
comprised …
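The abstract centers on aligning input speech frames to a target melody. SymNet learns this alignment inside the network and its details are not given here, but the classical baseline for this kind of frame-level alignment is dynamic time warping (DTW). The sketch below is only an illustration of that general idea, not the paper's method; the function name and feature shapes are assumptions.

```python
import numpy as np

def dtw_align(speech_feats, melody_feats):
    """Toy frame-level alignment of speech features to melody features via DTW.

    Illustration only: SymNet learns its alignment; this classical DTW merely
    shows what "aligning speech frames to a target melody" means. Inputs are
    (T, D) feature matrices; returns a warping path of (speech_idx, melody_idx).
    """
    n, m = len(speech_feats), len(melody_feats)
    # Accumulated-cost matrix with an extra boundary row/column.
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(speech_feats[i - 1] - melody_feats[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # skip a melody frame
                                 cost[i, j - 1],      # stretch a speech frame
                                 cost[i - 1, j - 1])  # advance both
    # Backtrack from the end to recover the optimal warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        i, j = min([(i - 1, j - 1), (i - 1, j), (i, j - 1)],
                   key=lambda s: cost[s])
        path.append((i - 1, j - 1))
    path.reverse()
    return path
```

For identical inputs the path is the diagonal, i.e. each speech frame maps to the matching melody frame; for speech and melody of different lengths, the path stretches or compresses speech frames to follow the melody's timing.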

arxiv networks speech style transfer transformer voice
