
Files in This Item:
File: 2523057.2523059.pdf
Size: 439.97 kB
Format: Adobe PDF
Title: Chinese-Japanese Machine Translation Exploiting Chinese Characters
Authors: Chu, Chenhui
Nakazawa, Toshiaki
Kawahara, Daisuke
Kurohashi, Sadao
Author's alias: 褚, 晨翚
中澤, 敏明
河原, 大輔
黒橋, 禎夫
Issue Date: Oct-2013
Publisher: Association for Computing Machinery
Journal title: ACM Transactions on Asian Language Information Processing
Volume: 12
Issue: 4
Article number: 16
Abstract: The Chinese and Japanese languages share Chinese characters. Because the Chinese characters in Japanese originated in ancient China, the two languages have many Chinese characters in common. Because Chinese characters carry significant semantic information and common Chinese characters share the same meaning in the two languages, they can be quite useful in Chinese-Japanese machine translation (MT). We therefore propose a method for creating a Chinese character mapping table for Japanese, traditional Chinese, and simplified Chinese, with the aim of constructing a complete resource of common Chinese characters. Furthermore, we point out two main problems in Chinese word segmentation for Chinese-Japanese MT, namely unknown words and word segmentation granularity, and propose an approach that exploits common Chinese characters to solve them. We also propose a statistical method for detecting semantically equivalent Chinese characters other than the common ones, and a method for exploiting shared Chinese characters in phrase alignment. Experiments on a state-of-the-art phrase-based statistical MT system and an example-based MT system show that the proposed approaches can improve MT performance significantly, verifying the effectiveness of shared Chinese characters for Chinese-Japanese MT.
Rights: © 2013 ACM, Inc.
This is not the published version. Please cite only the published version.
DOI(Published Version): 10.1145/2523057.2523059
Appears in Collections: Journal Articles

