Studio quality speech enhancement with estimating stochastic interpolants

Affiliation

MARG

Presenter

박재현

Subject

Speech Enhancement

Time

Poster Session II - 13:30~15:00

Abstract

Studio-quality speech enhancement aims to improve the quality of degraded speech and singing signals. Previous studies addressed this task with conditional diffusion models. However, the stochastic process of such a model matches the prior distribution only in the infinite-time limit, so in practice it yields only approximate solutions. This approach also requires careful selection of hyperparameters, both for defining the stochastic differential equation (SDE) and for training the model. To overcome these limitations, we propose a more versatile framework based on stochastic interpolants.
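The finite-time argument can be illustrated with a minimal sketch. It is not the presented method: the interpolant form x_t = (1 - t) x0 + t x1 + sigma sqrt(t (1 - t)) z is one common choice from the stochastic interpolant literature, and the function names, the `sigma` and `theta` parameters, the direction from degraded to clean, and the random stand-in signals are all assumptions made for illustration. The sketch shows that such an interpolant hits both endpoints exactly at t = 0 and t = 1, whereas an Ornstein-Uhlenbeck-style decay, used here as a generic stand-in for an SDE-based forward process, never fully forgets its initial state at any finite time.

```python
import math
import random


def interpolant(x0, x1, t, z, sigma=1.0):
    """Hypothetical interpolant: (1 - t) * x0 + t * x1 + sigma * sqrt(t * (1 - t)) * z."""
    gamma = sigma * math.sqrt(t * (1.0 - t))  # noise scale, zero at t = 0 and t = 1
    return [(1.0 - t) * a + t * b + gamma * n for a, b, n in zip(x0, x1, z)]


def ou_mean_scale(t, theta=1.0):
    """Weight of the initial state in the mean of an OU-style process: exp(-theta * t)."""
    return math.exp(-theta * t)


rng = random.Random(0)
# Random vectors standing in for degraded and clean signals (assumption, not real audio).
degraded = [rng.gauss(0.0, 1.0) for _ in range(8)]
clean = [rng.gauss(0.0, 1.0) for _ in range(8)]
noise = [rng.gauss(0.0, 1.0) for _ in range(8)]

start = interpolant(degraded, clean, 0.0, noise)  # equals `degraded` exactly
mid = interpolant(degraded, clean, 0.5, noise)    # noisy mixture of both endpoints
end = interpolant(degraded, clean, 1.0, noise)    # equals `clean` exactly

# The OU-style weight on the initial state is small but still positive at finite t.
residual = ou_mean_scale(5.0)
```

In this sketch the only free choice for the interpolant is the noise scale `sigma`, while the endpoints are fixed by construction rather than reached asymptotically.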