A Study of Knowledge Representation and Transformation for Case-based Expert Systems
|Keywords:||Case-based Reasoning;Data Mining;Knowledge Representation;Clustering;Expert Systems|
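To verify the performance of the overall development process, we use an e-mail application as the example for data type transformation: once the first-phase data type transformation is done, users can choose a data mining method based only on the characteristics of the problem, without having to consider transformation issues. To validate the clustering process, we ran experiments on the Iris data (IRIS), sugar-cane breeding data, and an e-mail log; the results show that the two-phase clustering process finds outlier clusters more effectively than traditional methods. Finally, we built a consultation system for personnel regulations; a comparison with the traditional approach shows that users can indeed find appropriate cases more easily.|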
Expert systems are now widely used in business and industrial environments across a variety of applications. A case-based expert system (CBES) is a kind of expert system whose knowledge base is represented by cases. However, constructing the case base so that users can effectively retrieve appropriate cases remains a bottleneck in building a CBES. In this work, we first propose a new knowledge representation, named Structural Cases (SC), to describe the relationships among all the cases. As soon as the case base is constructed, the hierarchy of all cases is constructed as well. Compared with the traditional flat structure of a case base, this case hierarchy may improve the efficiency of case retrieval. Based on the case hierarchy, algorithms are proposed to find the most similar case for an arbitrary case, and a finite automaton is used to illustrate the efficiency of these algorithms. Moreover, algorithms for an interaction process based on SC are proposed to retrieve more suitable results. Based on the proposed representation and algorithms, a CBES development process comprising case base construction, case retrieval, and case adaptation is also proposed. For case base construction, we first propose a two-phase data type transformation framework consisting of a merging phase and a transforming phase. Preprocessing for data type transformation is often tedious or complex because many data types exist in the real world. With the two-phase framework, the preprocessing work is finished in the first phase, so users only need to decide which kinds of mining algorithms to use in the following phase. After the raw data have been transformed into suitable data types, a two-phase clustering-based approach is used to construct the structural cases.
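The idea of retrieving a case through a hierarchy rather than a flat list can be illustrated with a minimal Python sketch. The abstract does not specify how Structural Cases are stored or how similarity is measured, so everything here is an assumption for illustration: the `CaseNode` class, the numeric feature vectors, the Euclidean distance, and the greedy top-down descent are hypothetical and are not the thesis's algorithms.

```python
from dataclasses import dataclass, field
from math import dist
from typing import List, Sequence


@dataclass
class CaseNode:
    """A node in a hypothetical case hierarchy (not the thesis's SC format)."""
    name: str
    features: Sequence[float]
    children: List["CaseNode"] = field(default_factory=list)


def retrieve(root: CaseNode, query: Sequence[float]) -> CaseNode:
    """Descend from the root, following the child closest to the query.

    Assumption: Euclidean distance stands in for the similarity measure.
    Each step compares the query only with the children of one node,
    so far fewer cases are examined than in a flat scan.
    """
    node = root
    while node.children:
        node = min(node.children, key=lambda c: dist(c.features, query))
    return node


# Toy hierarchy with made-up feature vectors.
demo_root = CaseNode("root", (5.0, 5.0), [
    CaseNode("leave", (0.0, 0.0), [
        CaseNode("sick_leave", (0.0, 1.0)),
        CaseNode("annual_leave", (1.0, 0.0)),
    ]),
    CaseNode("pay", (10.0, 10.0), [
        CaseNode("bonus", (10.0, 11.0)),
        CaseNode("salary", (11.0, 10.0)),
    ]),
])

best = retrieve(demo_root, (0.2, 0.9))
print(best.name)
```

A greedy descent like this trades accuracy for speed: it can miss the globally nearest case when the query lies near the boundary between two branches, which is one reason an interactive refinement step can be useful.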
In the clustering step, we want to find the outlier cases before constructing the case base, in order to reduce their influence. Our idea is first to partition the data points into several clusters, each of which may consist entirely of outliers or entirely of non-outliers. Once the data points are partitioned, the time complexity of finding the outlier clusters may be reduced. To verify the practicability and performance of our CBES development process, several experiments were carried out. First, for the data type transformation process, an e-mail management system was implemented to help users manage their e-mail by discovering rules about their interests. The experimental results show that the data type transformation process is practicable. Second, three experimental data sets, namely two-dimensional data from the Iris flower data, a four-dimensional sugar-cane breeding data set, and e-mail log data, were used to compare our two-phase clustering process with traditional clustering algorithms. The experimental results show that our method generally works better than traditional clustering algorithms. Finally, an application system for Taiwan personnel regulations was readily developed based on the proposed representation and algorithms. Comparing the accuracy of retrieval results with and without our system, we found that the results obtained with our system are better than those of traditional approaches and that the users' query process is simplified.
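The partition-then-inspect idea can be sketched in a few lines of Python. The abstract does not name the clustering algorithms used in either phase, so this sketch swaps in simple stand-ins that are assumptions, not the thesis's method: phase 1 uses leader clustering with a hypothetical `radius` parameter, and phase 2 flags any cluster smaller than a hypothetical `min_size` as an outlier cluster.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def partition(points: Sequence[Point], radius: float) -> List[List[Point]]:
    """Phase 1 (stand-in): leader clustering.

    A point joins the first cluster whose leader (its first member) lies
    within `radius`; otherwise it starts a new cluster.
    """
    clusters: List[List[Point]] = []
    for p in points:
        for cluster in clusters:
            if dist(cluster[0], p) <= radius:
                cluster.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def outlier_clusters(clusters: List[List[Point]], min_size: int) -> List[List[Point]]:
    """Phase 2 (stand-in): treat clusters with fewer than `min_size` points as outliers."""
    return [c for c in clusters if len(c) < min_size]


# Made-up data: two dense groups and one isolated point.
points = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
    (5.0, 5.0), (5.1, 5.2), (4.9, 5.1),
    (20.0, 20.0),
]

clusters = partition(points, radius=1.0)
outliers = outlier_clusters(clusters, min_size=2)
print(len(clusters), outliers)
```

Because the second phase examines whole clusters instead of individual points, the outlier check runs over a much smaller set of items, which is the source of the possible reduction in time complexity mentioned above.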
|Appears in Collections:||Thesis|