Downloads: 254
Files in this item:
File | Description | Size | Format | |
---|---|---|---|---|
TASLP.2019.2907015.pdf | | 4.82 MB | Adobe PDF | View/Open |
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Shimada, Kazuki | en |
dc.contributor.author | Bando, Yoshiaki | en |
dc.contributor.author | Mimura, Masato | en |
dc.contributor.author | Itoyama, Katsutoshi | en |
dc.contributor.author | Yoshii, Kazuyoshi | en |
dc.contributor.author | Kawahara, Tatsuya | en |
dc.contributor.alternative | 吉井, 和佳 | ja |
dc.contributor.alternative | 河原, 達也 | ja |
dc.date.accessioned | 2019-04-23T07:47:24Z | - |
dc.date.available | 2019-04-23T07:47:24Z | - |
dc.date.issued | 2019-05 | - |
dc.identifier.issn | 2329-9290 | - |
dc.identifier.issn | 2329-9304 | - |
dc.identifier.uri | http://hdl.handle.net/2433/240994 | - |
dc.description.abstract | This paper describes multichannel speech enhancement for improving automatic speech recognition (ASR) in noisy environments. Recently, minimum variance distortionless response (MVDR) beamforming has been widely used because it works well if the steering vector of speech and the spatial covariance matrix (SCM) of noise are given. To estimate such spatial information, conventional studies take a supervised approach that classifies each time-frequency (TF) bin as noise or speech by training a deep neural network (DNN). The performance of ASR, however, is degraded in an unknown noisy environment. To solve this problem, we take an unsupervised approach that decomposes each TF bin into the sum of speech and noise by using multichannel nonnegative matrix factorization (MNMF). This enables us to accurately estimate the SCMs of speech and noise not from observed noisy mixtures but from separated speech and noise components. In this paper, we propose online MVDR beamforming by effectively initializing and incrementally updating the parameters of MNMF. Another main contribution is a comprehensive investigation of the ASR performance obtained by various types of spatial filters, i.e., time-invariant and time-variant versions of MVDR beamformers and those of rank-1 and full-rank multichannel Wiener filters, in combination with MNMF. The experimental results showed that the proposed method outperformed the state-of-the-art DNN-based beamforming method in unknown environments that did not match the training data. | en |
dc.format.mimetype | application/pdf | - |
dc.language.iso | eng | - |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | en |
dc.rights | © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | en |
dc.rights | The full-text file will be made open to the public on 25 March 2021 in accordance with publisher's 'Terms and Conditions for Self-Archiving'. | en |
dc.rights | This paper is not the publisher's version. When citing, please check and use the publisher's version. | ja |
dc.rights | This is not the published version. Please cite only the published version. | en |
dc.subject | Noisy speech recognition | en |
dc.subject | speech enhancement | en |
dc.subject | multichannel nonnegative matrix factorization | en |
dc.subject | beamforming | en |
dc.title | Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition | en |
dc.type | journal article | - |
dc.type.niitype | Journal Article | - |
dc.identifier.ncid | AA12669539 | - |
dc.identifier.jtitle | IEEE/ACM Transactions on Audio, Speech, and Language Processing | en |
dc.identifier.volume | 27 | - |
dc.identifier.issue | 5 | - |
dc.identifier.spage | 960 | - |
dc.identifier.epage | 971 | - |
dc.relation.doi | 10.1109/TASLP.2019.2907015 | - |
dc.textversion | author | - |
dc.address | Graduate School of Informatics, Kyoto University | en |
dc.address | Graduate School of Informatics, Kyoto University | en |
dc.address | Graduate School of Informatics, Kyoto University | en |
dc.address | Graduate School of Informatics, Kyoto University | en |
dc.address | Graduate School of Informatics, Kyoto University | en |
dc.address | Graduate School of Informatics, Kyoto University | en |
dcterms.accessRights | open access | - |
datacite.date.available | 2021-03-25 | - |
dc.identifier.pissn | 2329-9290 | - |
dc.identifier.eissn | 2329-9304 | - |
Appears in Collections: | Journal Articles, etc. |
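
The abstract states that MVDR beamforming works well when the steering vector of speech and the SCM of noise are given. As a rough illustration only, and not the authors' implementation, the textbook MVDR weight formula for a single frequency bin, w = Φ⁻¹h / (hᴴΦ⁻¹h), can be sketched in Python with NumPy. The function name `mvdr_weights`, the four-microphone setup, and the random test data are assumptions made for this sketch; the MNMF-based estimation of these quantities described in the abstract is not reproduced here.

```python
import numpy as np


def mvdr_weights(steering, noise_scm):
    """Textbook MVDR weights for one frequency bin (hypothetical helper).

    steering  : complex vector h of shape (n_mics,)
    noise_scm : Hermitian positive-definite matrix of shape (n_mics, n_mics)
    """
    # Solve noise_scm @ x = steering instead of forming an explicit inverse.
    numerator = np.linalg.solve(noise_scm, steering)
    # np.vdot conjugates its first argument, giving h^H (SCM^-1 h).
    denominator = np.vdot(steering, numerator)
    return numerator / denominator


# Synthetic data for illustration only (assumed, not from the paper).
rng = np.random.default_rng(0)
n_mics = 4
steering = rng.standard_normal(n_mics) + 1j * rng.standard_normal(n_mics)

# Build a Hermitian positive-definite noise SCM from random noise snapshots.
snapshots = rng.standard_normal((n_mics, 16)) + 1j * rng.standard_normal((n_mics, 16))
noise_scm = snapshots @ snapshots.conj().T / 16 + 1e-3 * np.eye(n_mics)

weights = mvdr_weights(steering, noise_scm)

# Distortionless constraint: w^H h should be (numerically) 1.
distortionless_gain = np.vdot(weights, steering)

# Applying the filter to one observed TF bin x yields the scalar w^H x.
observed_bin = steering * (0.5 + 0.2j) + snapshots[:, 0]
enhanced_bin = np.vdot(weights, observed_bin)
```

In this sketch the filter passes the component along `steering` with unit gain while minimizing the output power of noise with covariance `noise_scm`; in practice one such weight vector would be computed per frequency bin.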
All items stored in this repository are protected by copyright.