
Files in This Item:
File: TASLP.2019.2955858.pdf
Size: 1.14 MB
Format: Adobe PDF
Title: Cross-Lingual Transfer Learning of Non-Native Acoustic Modeling for Pronunciation Error Detection and Diagnosis
Authors: Duan, Richeng
Kawahara, Tatsuya
Dantsuji, Masatake
Nanjo, Hiroaki
Author's alias: 河原, 達也
壇辻, 正剛
南條, 浩輝
Keywords: Speech and Hearing
Media Technology
Linguistics and Language
Signal Processing
Acoustics and Ultrasonics
Electrical and Electronic Engineering
Issue Date: 2020
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 28
Start page: 391
End page: 401
Abstract: In computer-assisted pronunciation training (CAPT), the scarcity of large-scale non-native corpora and of human expert annotations are two fundamental challenges to non-native acoustic modeling. Most existing approaches to acoustic modeling in CAPT rely on non-native corpora, yet given the number of living languages in the world, it is impractical to collect and annotate a non-native speech corpus for every language pair. In this work, we address non-native acoustic modeling (at both the phonetic and articulatory levels) based on transfer learning. In order to effectively train acoustic models of non-native speech without using such data, we propose to exploit two large native speech corpora, one of the learner's native language (L1) and one of the target language (L2), to model cross-lingual phenomena. This kind of transfer learning can provide a better feature representation of non-native speech. Experimental evaluations are carried out for Japanese speakers learning English. We first demonstrate that the proposed acoustic-phone model achieves a lower word error rate in non-native speech recognition. It also improves pronunciation error detection based on the goodness of pronunciation (GOP) score. For diagnosis of pronunciation errors, the proposed acoustic-articulatory modeling method is effective for providing detailed feedback at the articulation level.
Rights: © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
This is not the published version. Please cite only the published version.
DOI(Published Version): 10.1109/TASLP.2019.2955858
Appears in Collections: Journal Articles

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.