Model Integration for HMM- and DNN-Based Speech Synthesis Using Product-of-Experts Framework

K Tachibana, T Toda, Y Shiga, H Kawai - INTERSPEECH, 2016 - isca-archive.org

Abstract

In this paper, we propose a model integration method for hidden Markov model (HMM) and deep neural network (DNN) based acoustic models using a product-of-experts (PoE) framework in statistical parametric speech synthesis. In speech parameter generation, the DNN predicts the mean vector of the probability density function of the speech parameters frame by frame, while its covariance matrix is held constant over all frames. The HMM, on the other hand, predicts the covariance matrix as well as the mean vector, but both are fixed within each HMM state, i.e., they can vary only from state to state. To predict a better probability density function by leveraging the advantages of the individual models, the proposed method integrates the DNN and the HMM as a PoE, generating a new probability density function that satisfies the conditions of both. Furthermore, we propose a joint optimization method for the DNN and the HMM within the PoE framework that makes effective use of additional latent variables. Objective and subjective evaluations demonstrate that the proposed method significantly outperforms both DNN-based and HMM-based speech synthesis.
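The core idea of combining two Gaussian experts can be illustrated with a small sketch. It is not the paper's method: it assumes diagonal covariances, ignores dynamic features and the latent variables used for joint optimization, and uses made-up numbers. The function name `product_of_gaussians` and all values are hypothetical. It relies only on the standard result that the normalized product of two Gaussians is a Gaussian whose precision is the sum of the two precisions and whose mean is the precision-weighted average of the two means.

```python
import numpy as np

def product_of_gaussians(mu_a, var_a, mu_b, var_b):
    """Normalized product of two diagonal Gaussians (per dimension)."""
    prec_a = 1.0 / np.asarray(var_a, dtype=float)
    prec_b = 1.0 / np.asarray(var_b, dtype=float)
    var = 1.0 / (prec_a + prec_b)  # precisions add
    mu = var * (prec_a * np.asarray(mu_a) + prec_b * np.asarray(mu_b))
    return mu, var

# Hypothetical DNN expert: a mean per frame, one variance shared by all frames.
dnn_mean = np.array([[0.0, 1.0], [0.2, 1.1], [0.4, 1.2]])
dnn_var = np.array([1.0, 1.0])

# Hypothetical HMM expert: one mean and variance for the state
# that spans these three frames.
hmm_mean = np.array([0.3, 0.9])
hmm_var = np.array([1.0, 0.25])

poe_mean, poe_var = product_of_gaussians(dnn_mean, dnn_var, hmm_mean, hmm_var)
print(poe_mean)
print(poe_var)
```

In this toy setting the combined mean still changes frame by frame, following the DNN expert, while the combined variance reflects the state-level variance of the HMM expert. The second dimension is pulled more strongly toward the HMM mean because the HMM variance there is smaller.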
