Title: 問答系統之查詢擴充-以網路資源為本的非監督式學習策略
Web-Based Unsupervised Learning to Query Formulation for Question Answering
Authors: 王怡嘉
Yi-Chia Wang
梁婷
張俊盛
Tyne Liang
Jason S. Chang
Institute of Computer Science and Engineering
Keywords: question answering; question type extraction; query expansion
Issue Date: 2004
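Abstract: This thesis proposes a Web-based learning mechanism designed for question answering systems, with a focus on query expansion. After analyzing the natural language question entered by the user and identifying its question pattern, we use the learned results to transform the question pattern into a set of query terms that improves the retrieval performance of a search engine. In the training process, we first automatically collect the required training data from the Web, then use linguistic knowledge and statistical techniques to extract from the training questions the question patterns that clearly express the answer category, and finally compute the most suitable set of expansion query terms for each question pattern from association statistics between questions and answer passages. At runtime, the question pattern of an input question is converted into its corresponding expansion query terms to increase the chance that the search engine retrieves the answer. We implemented the proposed idea as a prototype. Experimental results on data independent of the training data show that our method performs better than ordinary keyword-based querying and noticeably reduces the number of documents a user must browse to find an answer. In sum, this thesis proposes an effective and simple solution to query expansion, the most critical step in question answering systems.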
This thesis investigates ways of learning how to formulate and expand a query to find the answer on the Web for a given natural language question. In our approach, the question pattern extracted from a given question is transformed into a set of query terms to improve the performance of an underlying search engine. In the training phase, the method involves crawling the Web for passages relevant to many question-answer pairs, extracting question patterns for fine-grained answer classification based on linguistic and statistical information, and aligning question patterns and keywords with n-grams in the answer passages. At runtime, any given question is converted into a question pattern, which is then transformed into its top-ranking alignment counterparts to formulate an expanded query and thereby increase the likelihood of retrieving passages that contain the answers. We also describe Atlas (Automatic Transform Learning by Aligning Sentences of question and answer), a prototype implementation of the proposed method. Independent evaluation on a set of questions shows that Atlas performs better than a naive keyword-based approach. The method also noticeably reduces the human effort of seeking answers, since our system has higher recall rates when only a handful of summaries are examined. Our straightforward method improves the most critical stage in question answering systems and also sheds new light on the long-standing problems of query expansion and relevance feedback.
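The two phases described in the abstract can be illustrated with a minimal Python sketch. It is not the Atlas implementation: this record does not give the alignment scoring, so plain co-occurrence counts stand in for it, and the pattern extractor (the first two words of the question) is a placeholder for the thesis's linguistic and statistical extraction. All names (`question_pattern`, `train`, `expand_query`) and the toy question-passage pairs are hypothetical.

```python
from collections import Counter, defaultdict


def ngrams(tokens, max_n=2):
    """Yield every word n-gram of length 1..max_n as a space-joined string."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])


def split_question(question):
    """Lowercase the question, drop a trailing '?', and split on whitespace."""
    return question.lower().rstrip("?").split()


def question_pattern(question):
    """Placeholder pattern extractor (assumption): the first two words."""
    return " ".join(split_question(question)[:2])


def train(pairs, max_n=2):
    """Count, per question pattern, the n-grams seen in answer passages."""
    table = defaultdict(Counter)
    for question, passage in pairs:
        tokens = passage.lower().split()
        # Count each n-gram once per passage so long passages do not dominate.
        table[question_pattern(question)].update(set(ngrams(tokens, max_n)))
    return table


def expand_query(question, table, top_k=2):
    """Return the question keywords plus the top-ranked n-grams for its pattern."""
    keywords = split_question(question)[2:]
    counts = table.get(question_pattern(question), Counter())
    # Rank by count, then prefer longer n-grams, then break ties alphabetically.
    ranked = sorted(counts, key=lambda g: (-counts[g], -len(g.split()), g))
    return keywords + ranked[:top_k]


# Toy training data, invented for illustration only.
pairs = [
    ("Who wrote Hamlet?", "hamlet was written by shakespeare"),
    ("Who wrote Ulysses?", "ulysses was written by joyce"),
    ("Who wrote Emma?", "emma was written by austen"),
]
table = train(pairs)
print(expand_query("Who wrote Dracula?", table))
# -> ['dracula', 'was written', 'written by']
```

In this sketch the expanded query combines the original keyword with phrases that tend to appear in answer passages for the same question pattern, which is the effect the abstract attributes to the learned transformations.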
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009223570
http://hdl.handle.net/11536/76620
Appears in Collections: Thesis


Files in This Item:

  1. 357001.pdf