Views of this item: 233

Files in this item:
File | Description | Size | Format
j.neucom.2021.12.076.pdf | | 1.84 MB | Adobe PDF
Full metadata record
DC Field | Value | Language
dc.contributor.author | Zhao, Yuting | en
dc.contributor.author | Komachi, Mamoru | en
dc.contributor.author | Kajiwara, Tomoyuki | en
dc.contributor.author | Chu, Chenhui | en
dc.date.accessioned | 2022-01-11T08:06:47Z | -
dc.date.available | 2022-01-11T08:06:47Z | -
dc.date.issued | 2022-03 | -
dc.identifier.uri | http://hdl.handle.net/2433/267428 | -
dc.description.abstract | We propose a multimodal neural machine translation (MNMT) method with semantic image regions called region-attentive multimodal neural machine translation (RA-NMT). Existing studies on MNMT have mainly focused on employing global visual features or equally sized grid local visual features extracted by convolutional neural networks (CNNs) to improve translation performance. However, they neglect the effect of semantic information captured inside the visual features. This study utilizes semantic image regions extracted by object detection for MNMT and integrates visual and textual features using two modality-dependent attention mechanisms. The proposed method was implemented and verified on two neural architectures of neural machine translation (NMT): recurrent neural network (RNN) and self-attention network (SAN). Experimental results on different language pairs of the Multi30k dataset show that our proposed method improves over baselines and outperforms most of the state-of-the-art MNMT methods. Further analysis demonstrates that the proposed method can achieve better translation performance because of its better visual feature use. | en
dc.language.iso | eng | -
dc.publisher | Elsevier BV | en
dc.rights | © 2022 The Authors. Published by Elsevier B.V. | en
dc.rights | This is an open access article under the Creative Commons Attribution 4.0 International license. | en
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | -
dc.subject | Multimodal neural machine translation | en
dc.subject | Recurrent neural network | en
dc.subject | Self-attention network | en
dc.subject | Object detection | en
dc.subject | Semantic image regions | en
dc.title | Region-Attentive Multimodal Neural Machine Translation | en
dc.type | journal article | -
dc.type.niitype | Journal Article | -
dc.identifier.jtitle | Neurocomputing | en
dc.identifier.volume | 476 | -
dc.identifier.spage | 1 | -
dc.identifier.epage | 13 | -
dc.relation.doi | 10.1016/j.neucom.2021.12.076 | -
dc.textversion | publisher | -
dcterms.accessRights | open access | -
datacite.awardNumber | 19K20343 | -
datacite.awardNumber | 18H06465 | -
datacite.awardNumber.uri | https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-19K20343/ | -
datacite.awardNumber.uri | https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-19K21533/ | -
dc.identifier.pissn | 0925-2312 | -
jpcoar.funderName | Japan Society for the Promotion of Science | ja
jpcoar.funderName | Japan Society for the Promotion of Science | ja
jpcoar.awardTitle | Neural machine translation with parallel resources extracted from multimodal data | ja
jpcoar.awardTitle | Enhancement of machine translation models based on multimodal quality estimation | ja
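The abstract above describes integrating object-detected semantic image regions with text via two modality-dependent attention mechanisms. Below is a minimal PyTorch sketch of what such region-attentive fusion could look like, not the authors' released implementation: it assumes dot-product attention, Faster R-CNN-style 2048-dimensional region features, and a learned sigmoid gate for mixing the two contexts; the class name RegionAttentiveFusion and all dimensions are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class RegionAttentiveFusion(nn.Module):
    """Sketch: separate attention over text states and object-region features."""

    def __init__(self, hidden_dim: int, region_dim: int):
        super().__init__()
        # Modality-dependent attention projections: one per modality.
        self.text_attn = nn.Linear(hidden_dim, hidden_dim, bias=False)
        self.region_proj = nn.Linear(region_dim, hidden_dim, bias=False)
        self.region_attn = nn.Linear(hidden_dim, hidden_dim, bias=False)
        # Gate deciding how much visual context to mix in at each step.
        self.gate = nn.Linear(2 * hidden_dim, 1)

    def forward(self, dec_state, text_states, region_feats):
        # dec_state:    (B, H)          current decoder state
        # text_states:  (B, src_len, H) text encoder outputs
        # region_feats: (B, R, region_dim) object-detector region features
        # Attention over the text encoder states.
        q_t = self.text_attn(dec_state).unsqueeze(2)                  # (B, H, 1)
        text_scores = torch.bmm(text_states, q_t).squeeze(2)          # (B, src_len)
        text_ctx = torch.bmm(F.softmax(text_scores, dim=1).unsqueeze(1),
                             text_states).squeeze(1)                  # (B, H)
        # Attention over projected semantic image regions.
        regions = self.region_proj(region_feats)                      # (B, R, H)
        q_r = self.region_attn(dec_state).unsqueeze(2)                # (B, H, 1)
        region_scores = torch.bmm(regions, q_r).squeeze(2)            # (B, R)
        region_ctx = torch.bmm(F.softmax(region_scores, dim=1).unsqueeze(1),
                               regions).squeeze(1)                    # (B, H)
        # Gated mix of textual and visual contexts.
        g = torch.sigmoid(self.gate(torch.cat([text_ctx, region_ctx], dim=1)))
        return text_ctx + g * region_ctx

# Toy usage: batch of 2, 7 source tokens, 36 detected regions.
fusion = RegionAttentiveFusion(hidden_dim=512, region_dim=2048)
ctx = fusion(torch.randn(2, 512), torch.randn(2, 7, 512), torch.randn(2, 36, 2048))
print(ctx.shape)  # torch.Size([2, 512])

The gate lets the decoder down-weight the visual context for target tokens that are not visually grounded, which is one plausible reading of "modality-dependent" attention; the paper itself should be consulted for the exact formulation used in the RNN and SAN variants.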
Appears in Collections: Journal Articles, etc.

This item is licensed under a Creative Commons License.