Files in this item:
File: zenodo.7316742.pdf (485.9 kB, Adobe PDF)
Title: End-to-End Lyrics Transcription Informed by Pitch and Onset Estimation
Authors: Deng, Tengyu
Nakamura, Eita (ORCID: https://orcid.org/0000-0003-4097-6027, unconfirmed)
Yoshii, Kazuyoshi (ORCID: https://orcid.org/0000-0001-8387-8609, unconfirmed)
Alternative author names: 鄧, 腾煜
中村, 栄太
吉井, 和佳
Keywords: ismir
ismir2022
Issue date: 2022
Publisher: ISMIR
Publication title: Proceedings of the 23rd International Society for Music Information Retrieval Conference
Start page: 633
End page: 639
Abstract: This paper presents an automatic lyrics transcription (ALT) method for music recordings that leverages the framewise semitone-level sung pitches estimated in a multi-task learning framework. Compared to automatic speech recognition (ASR), ALT is challenging due to the insufficiency of training data and the variation and contamination of acoustic features caused by singing expressions and accompaniment sounds. The domain adaptation approach has thus recently been taken for updating an ASR model pre-trained from sufficient speech data. In the naive application of the end-to-end approach to ALT, the internal audio-to-lyrics alignment often fails due to the time-stretching nature of singing features. To stabilize the alignment, we make use of the semi-synchronous relationships between notes and characters. Specifically, a convolutional recurrent neural network (CRNN) is used for estimating the semitone-level pitches with note onset times while eliminating the intra- and inter-note pitch variations. This estimate helps an end-to-end ALT model based on connectionist temporal classification (CTC) learn correct audio-to-character alignment and mapping, where the ALT model is trained jointly with the pitch and onset estimation model. The experimental results show the usefulness of the pitch and onset information in ALT.
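Note: as background for the abstract's mention of CTC-based audio-to-character alignment, the following is a minimal sketch of the standard CTC collapsing rule (merge repeated frame labels, then drop blanks). It is not taken from the paper; the function name `ctc_collapse`, the blank symbol `_`, and the example frame sequence are all hypothetical and chosen only for illustration. It shows how a sung character stretched over many frames still maps to a single output character.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol chosen for this illustration


def ctc_collapse(frame_labels, blank=BLANK):
    """Apply the standard CTC collapsing rule to framewise labels.

    Step 1: merge consecutive repeated labels into one.
    Step 2: remove blank symbols.
    """
    merged = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in merged if label != blank)


# Made-up framewise labels: the sustained "e" and "o" occupy many frames,
# and a blank between the two "l" runs keeps them as separate characters.
frames = list("__hhh_eeeeeeee__l_ll_oooooo__")
print(ctc_collapse(frames))
```

Because the rule only merges adjacent repeats, a doubled character survives only when a blank separates its two runs, which is one reason long sustained notes can make the frame-to-character alignment hard to learn.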
Description: International Society for Music Information Retrieval Conference (ISMIR 2022), Bengaluru, India, December 4-8, 2022
Rights: © T. Deng, E. Nakamura, and K. Yoshii.
Creative Commons Attribution 4.0 International
URI: http://hdl.handle.net/2433/287439
DOI (publisher version): 10.5281/zenodo.7316742
Appears in collections: Journal articles, etc.

This item is licensed under a Creative Commons license.