Model Integration for HMM- and DNN-Based Speech Synthesis Using Product-of-Experts Framework

K Tachibana, T Toda, Y Shiga, H Kawai - INTERSPEECH, 2016 - isca-archive.org
Abstract
In this paper, we propose a model integration method for hidden Markov model (HMM) and deep neural network (DNN) based acoustic models using a product-of-experts (PoE) framework in statistical parametric speech synthesis. In speech parameter generation, the DNN predicts the mean vector of the probability density function of the speech parameters frame by frame while keeping its covariance matrix constant over all frames. The HMM, on the other hand, predicts the covariance matrix as well as the mean vector, but both are fixed within each HMM state, i.e., they can vary only from state to state. To predict a better probability density function by leveraging the advantages of the individual models, the proposed method integrates the DNN and the HMM as a PoE, generating a new probability density function that satisfies the conditions of both the DNN and the HMM. Furthermore, we propose a joint optimization method for the DNN and the HMM within the PoE framework that makes effective use of additional latent variables. We conducted objective and subjective evaluations, which demonstrate that the proposed method significantly outperforms both DNN-based and HMM-based speech synthesis.
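
The core idea of combining two Gaussian experts can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes diagonal covariances, uses made-up toy values, and leaves out the joint optimization and latent variables mentioned in the abstract. It relies only on the standard identity that the normalized product of two Gaussians is Gaussian, with precisions that add and a precision-weighted mean. The names `product_of_gaussians`, `dnn_mean`, `hmm_state_mean`, and so on are hypothetical.

```python
import numpy as np


def product_of_gaussians(mu_a, var_a, mu_b, var_b):
    """Normalized product of two diagonal-covariance Gaussians.

    Precisions (inverse variances) add, and the resulting mean is the
    precision-weighted average of the two input means.
    """
    prec_a = 1.0 / np.asarray(var_a, dtype=float)
    prec_b = 1.0 / np.asarray(var_b, dtype=float)
    var = 1.0 / (prec_a + prec_b)
    mu = var * (prec_a * np.asarray(mu_a, dtype=float)
                + prec_b * np.asarray(mu_b, dtype=float))
    return mu, var


# Toy data (assumed for illustration): 4 frames, 1-dimensional parameter.
# DNN expert: mean changes every frame, variance is constant over frames.
dnn_mean = np.array([[1.0], [1.2], [1.4], [1.6]])
dnn_var = np.full((4, 1), 0.5)

# HMM expert: mean and variance are fixed within a state, vary across states.
state_of_frame = np.array([0, 0, 1, 1])      # frame-to-state alignment
hmm_state_mean = np.array([[1.0], [2.0]])
hmm_state_var = np.array([[0.5], [0.1]])
hmm_mean = hmm_state_mean[state_of_frame]    # expand to frame level
hmm_var = hmm_state_var[state_of_frame]

poe_mean, poe_var = product_of_gaussians(dnn_mean, dnn_var, hmm_mean, hmm_var)
print(poe_mean.ravel())
print(poe_var.ravel())
```

In this sketch the combined variance is never larger than either expert's variance, and the combined mean is pulled toward whichever expert is more confident in a given frame, so the HMM's state-dependent variance modulates how strongly the frame-wise DNN mean is trusted.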