ダウンロード数: 682

このアイテムのファイル:
ファイル 記述 サイズフォーマット 
D_Kiyota_Yoji.pdfDissertation_全文734.91 kBAdobe PDF見る/開く
yjohk00135.pdfAbstract_要旨174.34 kBAdobe PDF見る/開く
タイトル: Dialog navigator : A navigation system from vague questions to specific answers based on real-world text collections
その他のタイトル: ダイアログナビ : 実世界テキスト集合に基づく漠然とした質問から具体的な回答へのナビゲーションシステム
著者: Kiyota, Yoji
著者名の別形: 清田, 陽司
発行日: 24-Nov-2004
出版者: 京都大学 (Kyoto University)
抄録: As computers and their networks continue to be developed, our day-to-day lives are being surrounded by increasingly more complex instruments, and we often have to ask questions about using them. At the same time, large collections of texts to answer these questions are being gathered. Therefore, there are potential answers to many of our questions that exist as texts somewhere. However, there are various gaps between our various questions and the texts, and these prevent us from accessing appropriate texts to answer our questions. The gaps are mainly composed of both expression and vagueness gaps. When we seek texts for answers using conventional keyword-based text retrieval systems, we often have trouble locating them. In contrast, when we ask experts on instruments or operators of call centers, they can resolve the various gaps, by interpreting our questions flexibly, and by producing some ask-backs. The problem with experts and call centers is that they are not always available. Two approaches have been studied to resolve the various gaps: the extension of keyword-based text retrieval systems, and the application of artificial intelligence techniques. However, these approaches have their respective limitations. The former uses texts or keywords as methods for ask-back questions, but these methods are not always suitable. The latter requires a specialized knowledge base described in formal languages, so it cannot be applied to existing collections with large amount of texts. This thesis targets real-world the large text collections provided by Microsoft Corporation, and addresses a novel methodology to resolve the gaps between various user questions and the texts. The methodology consists of two key solutions: precise and flexible methods of matching user questions with texts based on NLP (natural language processing) techniques, and ask-back methods using the matching methods. First, the matching methods, including sentence structure analysis and expression gap resolution, are described. In addition, these methods are extended into matching through metonymy, which is frequently observed in natural languages. After that, a solution to make ask backs based on these matching methods, by using two kinds of ask-backs that complement each other, is proposed. Both ask-backs navigate users from vague questions to specific answers. Finally, our methodology is evaluated through the real-world operation of a dialog system, Dialog Navigator, in which all the proposed methods are implemented. Chapter 1 discusses issues on information retrieval, and present which issues are to be solved. That is, it examines the question logs from a real-world natural-language-based text retrieval system, and organizes types and factors of the gaps. The examination indicates that some gaps between user questions and texts cannot be resolved well by methods used in previous studies, and suggests that both interactions with users and applicability to real-world text collections are needed. Based on the discussion, a solution to deal with these gaps is proposed, by advancing an approach employed in open-domain question-answering systems, i.e., utilization of recent NLP techniques, into resolving the various gaps. Chapter 2 proposes several methods of matching user questions with texts, based on the NLP techniques. Of these techniques, sentence structure analysis through fullparsing is essential for two reasons: first, it enables expression gaps to be resolved beyond the keyword level; second, it is indispensable in resolving vagueness gaps by providing ask-backs. Our methods include: sentence structure analysis using a Japanese parser KNP, expression-gap resolution based on two kinds of dictionaries, text-collection selection through question-type estimates, and score calculations based on sentence structures. An experimental evaluation on testsets shows significant improvements of performance by our methods. Chapter 3 proposes a novel method of processing metonymy, as an extension of the matching methods proposed in Chapter 2. Metonymy is a figure of speech in which the name of one thing is substituted for that of something else to which it is related, and this frequently occurs in both user questions and texts. Namely, this chapter addresses the automatic acquisition of pairs of metonymic expressions and their interpretative expressions from large corpora, and applies the acquired pairs to resolving structural gaps caused by metonymy. Unlike previous studies on metonymy, the method targets both recognition and interpretation process of metonymy. The method acquired 1, 126 pairs from corpora, and over 80% of the pairs were correct as interpretations of metonymy. Furthermore, an experimental evaluation on the testsets demonstrated that introducing the acquired pairs significantly improves matching. Chapter 4 presents a strategy of navigating users from vague questions to specific texts based on the previously discussed matching methods. Of course, it is necessary to make some use of ask-backs to achieve this, and this strategy involves two approaches: description extraction as a bottom-up approach, and dialog cards as a top-down approach. The former extracts the neighborhoods of the part that matches the user question in each text through matching methods. Such neighborhoods are mostly suitable for ask-backs that clarify vague user questions. However, if a user’s question is too vague, this approach often fails. The latter covers vague questions based on the know-how of the call center; dialog cards systematize procedures for ask-backs to clarify frequently asked questions that are vague. Matching methods are also applied to match user questions with the cards. Finally, a comparison of the approaches with those used in other related work demonstrates the novelty of the approaches. Chapter 5 describes the architecture for Dialog Navigator, a dialog system in which all the proposed methods are implemented. The system uses the real-world large text collections provided by Microsoft Corporation, and it has been open to the public on a website from April 2002. The methods were evaluated based on the real-world operational results of the system, because the various gaps to be resolved should reflect those in the real-world. The evaluation proved the effectiveness of the methods: more than 70% of all user questions were answered with relevant texts, the behaviors of both users and the system were reasonable with most dialogs, and most of the extracted descriptions for ask-backs were suitably matched. Chapter 6 concludes the thesis.
学位授与大学: 京都大学
学位の種類: 新制・課程博士
取得分野: 博士(情報学)
報告番号: 甲第11209号
学位記番号: 情博第135号
学位授与年月日: 2004-11-24
請求記号: 新制||情||31(附属図書館)
研究科・専攻: 京都大学大学院情報学研究科知能情報学専攻
論文調査委員: (主査)教授 松山 隆司, 教授 河原 達也, 助教授 佐藤 理史
学位授与の要件: 学位規則第4条第1項該当
DOI: 10.14989/doctor.k11209
URI: http://hdl.handle.net/2433/84999
出現コレクション:140 博士(情報学)

アイテムの詳細レコードを表示する

Export to RefWorks


出力フォーマット 


このリポジトリに保管されているアイテムはすべて著作権により保護されています。