Improved Speaker Markov Modelling for Unsupervised Speaker Normalization

Fung, Pascale; Kawahara, Tatsuya; Doshita, Shuji; Adda, Martine

http://hdl.handle.net/2433/52475

Files in this item:

File	Description	Size	Format
soa025_049.pdf		611.85 kB	Adobe PDF	View/Open

Full metadata record

DC Field	Value	Language
dc.contributor.author	Fung, Pascale	en
dc.contributor.author	Kawahara, Tatsuya	en
dc.contributor.author	Doshita, Shuji	en
dc.contributor.author	Adda, Martine	en
dc.contributor.alternative	カワハラ, タツヤ	ja
dc.contributor.alternative	ドウシタ, シュウジ	ja
dc.contributor.transcription	カワハラ, タツヤ	ja-Kana
dc.contributor.transcription	ドウシタ, シュウジ	ja-Kana
dc.date.accessioned	2008-04-22T05:23:38Z	-
dc.date.available	2008-04-22T05:23:38Z	-
dc.date.issued	1991	-
dc.identifier.issn	0300-1067	-
dc.identifier.uri	http://hdl.handle.net/2433/52475	-
dc.description.abstract	We propose new methods for improving speech recognition with speaker-variable information. Hidden Markov Model (HMM)-based recognizers trained on reference speaker(s) (RS) are normalized by two different approaches to give a better speaker-independent recognition rate. Both normalization methods are based on the same principle of inter-speaker Markov mapping. This mapping gives inter-speaker parameters, which are used differently in the two approaches. In the first approach, the Speaker Markov Model Converter (SMMC) converts new-speaker spectral data into label data similar to that of the reference speaker's utterance, which is passed directly to the recognizer. In the second, the Integrated Markov Model (IMM) approach, inter-speaker emission probabilities (ISE) are integrated as weights on the HMM emission probabilities. The recognizer in this case is modified according to inter-speaker variable information, and the normalization is done in context. The inter-speaker mapping in both cases is unsupervised, to save new speaker (NS) effort. HMM score thresholding, template matching and DP thresholding techniques are applied to select suitable data for the unsupervised mapping of NS and RS data. This mapping is done in parallel with the recognition process. Iterations are performed to improve the unsupervised mapping.	en
dc.language.iso	eng	-
dc.publisher	INSTITUTION FOR PHONETIC SCIENCES UNIVERSITY OF KYOTO	en
dc.subject.ndc	801.1	-
dc.title	Improved Speaker Markov Modelling for Unsupervised Speaker Normalization	en
dc.type	departmental bulletin paper	-
dc.type.niitype	Departmental Bulletin Paper	-
dc.identifier.ncid	AN00034779	-
dc.identifier.jtitle	音声科学研究	ja
dc.identifier.volume	25	-
dc.identifier.spage	49	-
dc.identifier.epage	58	-
dc.textversion	publisher	-
dc.sortkey	08	-
dcterms.accessRights	open access	-
dc.identifier.pissn	0300-1067	-
dc.identifier.jtitle-alternative	Studia phonologica	en
Appears in Collections:	Vol.25
