Files in this item:
File | Description | Size | Format
j.cviu.2021.103333.pdf | | 2.01 MB | Adobe PDF
Full metadata record
DC Field | Value | Language
dc.contributor.author | Chu, Chenhui | en
dc.contributor.author | Oliveira, Vinicius | en
dc.contributor.author | Virgo, Giovanni, Felix | en
dc.contributor.author | Otani, Mayu | en
dc.contributor.author | Garcia, Noa | en
dc.contributor.author | Nakashima, Yuta | en
dc.date.accessioned | 2021-12-23T09:37:34Z | -
dc.date.available | 2021-12-23T09:37:34Z | -
dc.date.issued | 2022-01 | -
dc.identifier.uri | http://hdl.handle.net/2433/266704 | -
dc.description.abstract | Visually grounded paraphrases (VGPs) are different phrasal expressions describing the same visual concept in an image. Previous studies treat VGP identification as a binary classification task, which ignores various phenomena behind VGPs (i.e., different linguistic interpretation of the same visual concept) such as linguistic paraphrases and VGPs from different aspects. In this paper, we propose semantic typology for VGPs, aiming to elucidate the VGP phenomena and deepen the understanding about how human beings interpret vision with language. We construct a large VGP dataset that annotates the class to which each VGP pair belongs according to our typology. In addition, we present a classification model that fuses language and visual features for VGP classification on our dataset. Experiments indicate that joint language and vision representation learning is important for VGP classification. We further demonstrate that our VGP typology can boost the performance of visually grounded textual entailment. | en
dc.language.iso | eng | -
dc.publisher | Elsevier | en
dc.rights | © 2021 The Author(s). Published by Elsevier Inc. | en
dc.rights | This is an open access article under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International license. | en
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | -
dc.subject | Vision and language | en
dc.subject | Image interpretation | en
dc.subject | Visual grounded paraphrases | en
dc.subject | Semantic typology | en
dc.subject | Dataset | en
dc.title | The Semantic Typology of Visually Grounded Paraphrases | en
dc.type | journal article | -
dc.type.niitype | Journal Article | -
dc.identifier.jtitle | Computer Vision and Image Understanding | en
dc.identifier.volume | 215 | -
dc.relation.doi | 10.1016/j.cviu.2021.103333 | -
dc.textversion | publisher | -
dc.identifier.artnum | 103333 | -
dcterms.accessRights | open access | -
datacite.awardNumber | 18H03264 | -
datacite.awardNumber.uri | https://kaken.nii.ac.jp/grant/KAKENHI-PROJECT-18H03264/ | -
dc.identifier.pissn | 1077-3142 | -
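jpcoar.funderName | Japan Society for the Promotion of Science (日本学術振興会) | ja
jpcoar.awardTitle | 知識ベースを活用した視覚情報に関する質疑応答システムの実現 (Realization of a question answering system for visual information using knowledge bases) | ja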
Appears in collections: Journal Articles

This item is licensed under a Creative Commons License.