Files in This Item:
File: IPSJ-JNL5902009.pdf
Size: 1.41 MB
Format: Adobe PDF
Title: 古典中国語(漢文)の形態素解析とその応用
Other Titles: Morphological Analysis of Classical Chinese Texts and Its Application
Authors: 安岡, 孝一
ウィッテルン, クリスティアン
守岡, 知彦
池田, 巧
山崎, 直樹
二階堂, 善弘
鈴木, 慎吾
師, 茂樹
Author's alias: Yasuoka, Koichi
Wittern, Christian
Morioka, Tomohiko
Ikeda, Takumi
Yamazaki, Naoki
Nikaido, Yoshihiro
Suzuki, Shingo
Moro, Shigeki
Keywords: 漢文コーパス
classical Chinese corpus
linked data
named entity extraction
Issue Date: 15-Feb-2018
Publisher: 情報処理学会 (Information Processing Society of Japan)
Journal title: 情報処理学会論文誌 (IPSJ Journal)
Volume: 59
Issue: 2
Start page: 323
End page: 331
Abstract: We propose a method for analyzing classical Chinese (漢文) texts, using a morphological analyzer based on MeCab. To represent the predicate-object structure of classical Chinese, we constructed a new word-class system made up of four levels of "parts of speech", and designed a MeCab corpus for classical Chinese on top of it. To enter data into this corpus, we developed a dedicated corpus editor based on XEmacs CHISE. To manage the corpus effectively, and to refactor the four-level word-class system, we converted the corpus into Linked Data and published it on the WWW. As an application of our MeCab-based morphological analysis, we attempted to automatically extract named entities from classical Chinese texts: place names, official (job) titles, and personal names. Place names could be extracted with high accuracy, but the extraction of official titles and of personal names each left unresolved difficulties.
Rights: The copyright of this material is retained by the Information Processing Society of Japan (IPSJ). This material is published on this web site with the agreement of the author(s) and the IPSJ. Users who wish to reproduce, make derivative works of, distribute, or make available to the public any part or the whole of this material must comply with the Copyright Law of Japan and the Code of Ethics of the IPSJ. All Rights Reserved, Copyright (C) Information Processing Society of Japan.
Appears in Collections: Journal Articles
