A Fast Speaker Adaptation Approach for Large Vocabulary Continuous Speech Recognition

Yaohui Qi, Jiaming Liu, Yanrong Ge, Hongxia Bu, Li Wang

Abstract


Acoustic model adaptation is an efficient way to compensate for the acoustic mismatch caused by speaker variability. We propose a speaker adaptation approach that applies linear projection (LP) to transformation matrices in order to reduce the amount of adaptation data required. The method uses an LP function to transform multiple sets of speaker-adapted (SA) acoustic models into a single new set of models. To save storage space, speaker-specific information is represented by the transformation matrix obtained by maximum likelihood linear regression (MLLR). The maximum likelihood (ML) criterion is used to select, from the candidate models, the SA models that carry the most important information, which reduces the number of transformation parameters. Supervised and unsupervised speaker adaptation experiments are carried out on a large vocabulary continuous speech recognition task to evaluate the effectiveness of the approach. Experimental results show that the proposed approach achieves fast adaptation.
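As a rough illustration of the pipeline outlined in the abstract, the Python sketch below (NumPy only) stores each training speaker as an MLLR mean transform, ranks the candidate transforms by the likelihood of the adaptation frames, and linearly combines the selected ones into a single transform. It is not the authors' implementation. The function names, the equal-weight diagonal-covariance Gaussian mixture used for scoring, and the externally supplied combination weights are all assumptions made for illustration; the abstract does not say how the LP weights are estimated.

```python
import numpy as np


def shift_transform(d, b):
    """Hypothetical helper: MLLR transform W = [b, A] with A = I and a constant bias b."""
    return np.hstack([np.full((d, 1), b), np.eye(d)])


def apply_mllr(W, means):
    """Apply an MLLR mean transform W (d x (d+1)) to Gaussian means (n x d)."""
    ext = np.hstack([np.ones((means.shape[0], 1)), means])  # extended means [1, mu]
    return ext @ W.T  # each row becomes b + A @ mu


def avg_log_likelihood(frames, means, var):
    """Average per-frame log-likelihood under an equal-weight diagonal GMM (assumed model)."""
    diff = frames[:, None, :] - means[None, :, :]  # shape (T, n, d)
    log_comp = -0.5 * np.sum(diff ** 2 / var + np.log(2 * np.pi * var), axis=2)
    peak = log_comp.max(axis=1, keepdims=True)  # log-sum-exp for numerical stability
    ll = peak[:, 0] + np.log(np.exp(log_comp - peak).sum(axis=1))
    return float(np.mean(ll - np.log(means.shape[0])))


def select_transforms(frames, si_means, var, transforms, n_keep):
    """Return indices of the n_keep candidate transforms with the highest likelihood."""
    scores = [avg_log_likelihood(frames, apply_mllr(W, si_means), var)
              for W in transforms]
    order = np.argsort(scores)[::-1][:n_keep]
    return [int(i) for i in order]


def project(transforms, weights):
    """Linearly combine the selected transforms into one transform."""
    return np.tensordot(weights, np.stack(transforms), axes=1)
```

In this sketch only the d x (d+1) matrices are kept per training speaker, which mirrors the storage argument in the abstract, and discarding low-likelihood candidates before combination leaves fewer weights to estimate from the adaptation data.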

Keywords


speech recognition; speaker adaptation; MLLR; LP


DOI
10.12783/dtetr/iceta2016/6975
