Downloads: 231

Files in this item:
File | Description | Size | Format
j.csl.2017.11.001.pdf | | 758.4 kB | Adobe PDF
Full metadata record
DC Field | Value | Language
dc.contributor.author | Mirzaei, Maryam Sadat | en
dc.contributor.author | Meshgi, Kourosh | en
dc.contributor.author | Kawahara, Tatsuya | en
dc.contributor.alternative | 河原, 達也 | ja
dc.date.accessioned | 2019-04-16T00:41:22Z | -
dc.date.available | 2019-04-16T00:41:22Z | -
dc.date.issued | 2018-05 | -
dc.identifier.issn | 0885-2308 | -
dc.identifier.uri | http://hdl.handle.net/2433/240843 | -
dc.description.abstract | This paper addresses the viability of using Automatic Speech Recognition (ASR) errors as the predictor of difficulties in speech segments, thereby exploiting them to improve Partial and Synchronized Caption (PSC), which we have proposed to train second language (L2) listening skill by encouraging listening over reading. The system uses ASR technology to make word-level text-to-speech synchronization and generates a partial caption. The baseline system determines difficult words based on three features: speech rate, word frequency and specificity. While it encompasses most of the difficult words, it does not cover a wide range of features that hinder L2 listening. Therefore, we propose the use of ASR systems as a model of L2 listeners and hypothesize that ASR errors can predict challenging speech segments for these learners. Among different cases of ASR errors, annotation results suggest the usefulness of four categories of homophones, minimal pairs, negatives, and breached boundaries for L2 listeners. A preliminary experiment with L2 learners focusing on these four categories of the ASR errors revealed that these cases highlight the problematic speech regions for L2 listeners. Based on the findings, the PSC system is enhanced to incorporate these kinds of useful ASR errors. An experiment with L2 learners demonstrated that the enhanced version of PSC is not only preferable, but also more helpful to facilitate the L2 listening process. | en
dc.format.mimetype | application/pdf | -
dc.language.iso | eng | -
dc.publisher | Elsevier BV | en
dc.rights | © 2018. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/. | en
dc.rights | The full-text file will be made open to the public on 1 May 2020 in accordance with the publisher's 'Terms and Conditions for Self-Archiving'. | en
dc.rights | This is not the published version. Please cite only the published version. | en
dc.rights | This is not the published version. Please check and cite the published version. | ja
dc.subject | Computer-assisted language learning | en
dc.subject | Second language listening skill | en
dc.subject | Automatic speech recognition | en
dc.subject | Partial and synchronized caption | en
dc.title | Exploiting Automatic Speech Recognition Errors to Enhance Partial and Synchronized Caption for Facilitating Second Language Listening | en
dc.type | journal article | -
dc.type.niitype | Journal Article | -
dc.identifier.jtitle | Computer Speech and Language | -
dc.identifier.volume | 49 | -
dc.identifier.spage | 17 | -
dc.identifier.epage | 36 | -
dc.relation.doi | 10.1016/j.csl.2017.11.001 | -
dc.textversion | author | -
dc.address | Graduate School of Informatics, Kyoto University | en
dc.address | Graduate School of Informatics, Kyoto University | en
dc.address | Graduate School of Informatics, Kyoto University | en
dcterms.accessRights | open access | -
datacite.date.available | 2020-05-01 | -
Appears in Collections: Journal Articles

All items in this repository are protected by copyright.