Downloads: 513
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
TASLP.2019.2955858.pdf | | 1.14 MB | Adobe PDF | View/Open |
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Duan, Richeng | en |
dc.contributor.author | Kawahara, Tatsuya | en |
dc.contributor.author | Dantsuji, Masatake | en |
dc.contributor.author | Nanjo, Hiroaki | en |
dc.contributor.alternative | 河原, 達也 | ja |
dc.contributor.alternative | 壇辻, 正剛 | ja |
dc.contributor.alternative | 南條, 浩輝 | ja |
dc.date.accessioned | 2020-03-24T07:26:34Z | - |
dc.date.available | 2020-03-24T07:26:34Z | - |
dc.date.issued | 2020 | - |
dc.identifier.issn | 2329-9290 | - |
dc.identifier.uri | http://hdl.handle.net/2433/246413 | - |
dc.description.abstract | In computer-assisted pronunciation training (CAPT), the scarcity of large-scale non-native corpora and of human expert annotations poses two fundamental challenges to non-native acoustic modeling. Most existing approaches to acoustic modeling in CAPT rely on non-native corpora, but given the many living languages in the world, it is impractical to collect and annotate a non-native speech corpus for every language pair. In this work, we address non-native acoustic modeling (at both the phonetic and the articulatory level) based on transfer learning. To train acoustic models of non-native speech effectively without such data, we propose to exploit two large native speech corpora — one of the learner's native language (L1) and one of the target language (L2) — to model cross-lingual phenomena. This kind of transfer learning provides a better feature representation of non-native speech. Experimental evaluations are carried out for Japanese speakers learning English. We first demonstrate that the proposed acoustic-phone model achieves a lower word error rate in non-native speech recognition. It also improves pronunciation error detection based on the goodness of pronunciation (GOP) score. For the diagnosis of pronunciation errors, the proposed acoustic-articulatory modeling method is effective in providing detailed feedback at the articulation level. | en |
dc.format.mimetype | application/pdf | - |
dc.language.iso | eng | - |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en |
dc.rights | © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | en |
dc.rights | This is not the published version. Please cite only the published version. | en |
dc.rights | This is not the publisher's version. Please check and cite the publisher's version. | ja |
dc.subject | Speech and Hearing | en |
dc.subject | Media Technology | en |
dc.subject | Linguistics and Language | en |
dc.subject | Signal Processing | en |
dc.subject | Acoustics and Ultrasonics | en |
dc.subject | Instrumentation | en |
dc.subject | Electrical and Electronic Engineering | en |
dc.title | Cross-Lingual Transfer Learning of Non-Native Acoustic Modeling for Pronunciation Error Detection and Diagnosis | en |
dc.type | journal article | - |
dc.type.niitype | Journal Article | - |
dc.identifier.jtitle | IEEE/ACM Transactions on Audio, Speech, and Language Processing | en |
dc.identifier.volume | 28 | - |
dc.identifier.spage | 391 | - |
dc.identifier.epage | 401 | - |
dc.relation.doi | 10.1109/TASLP.2019.2955858 | - |
dc.textversion | author | - |
dc.address | Graduate School of Informatics, Kyoto University | en |
dc.address | Graduate School of Informatics, Kyoto University | en |
dc.address | Academic Center for Computing and Media Studies, Kyoto University | en |
dc.address | Academic Center for Computing and Media Studies, Kyoto University | en |
dcterms.accessRights | open access | - |
Appears in Collections: | Journal Articles |
All items in this repository are protected by copyright.