Title: Finding Protein Secondary Structure Regularity and Related Applications (蛋白質二級結構的規則性及其應用)
Authors: Yen-Wei Chu (朱彥煒)
Chuen-Tsai Sun (孫春在)
Jinn-Moon Yang (楊進木)
Institute of Computer Science and Engineering
Keywords: Protein Secondary Structure; Genetic Algorithms; Clustering; Data Mining; Protein Secondary Structure Prediction
Issue Date: 2005
Abstract: This thesis examines the regularity of protein secondary structure from the perspective of sequences. We define a schema that expresses this regularity and propose a cluster-based steady-state genetic algorithm to search for schemata that fit it. The algorithm is validated in two steps: a) an ablation study showing that adding the clustering concept is effective, and b) a comparison with association rules and decision trees, two methods commonly used in data mining, in which the proposed method performs relatively well on protein data sets. As an application, we analyze PSIPRED and PROF and find regions where neither method predicts the secondary structure correctly. The results indicate that the proposed schemata can improve prediction accuracy for the residues in these regions by approximately 40% and 60% for the CB513 and RS126 data sets, respectively. Furthermore, combining the discovered schemata with the prediction results of PSIPRED and PROF can improve current secondary structure prediction. Finally, a bioinformatics teaching plan based on this experiment and suited to problem-based learning is proposed.
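The abstract does not give the schema definition, so the following is only a minimal sketch of how a sequence schema might be scored against labeled data. It assumes, hypothetically, that a schema is a fixed-length residue pattern with a `*` wildcard, paired with a secondary structure label for one position in the window. The names `matches` and `schema_stats`, the wildcard symbol, and the toy sequences are illustrative assumptions, not the thesis's actual representation or data.

```python
from typing import List, Tuple

WILDCARD = "*"  # hypothetical "don't care" symbol; not taken from the thesis


def matches(schema: str, window: str) -> bool:
    """Return True if every non-wildcard schema position equals the residue."""
    return len(schema) == len(window) and all(
        s == WILDCARD or s == r for s, r in zip(schema, window)
    )


def schema_stats(
    schema: str,
    center: int,
    label: str,
    proteins: List[Tuple[str, str]],
) -> Tuple[int, float]:
    """Score a schema on (sequence, structure) pairs.

    Returns (hits, confidence): how many windows match the schema, and the
    fraction of those whose residue at offset `center` carries `label`.
    """
    hits = 0
    correct = 0
    width = len(schema)
    for seq, ss in proteins:
        for start in range(len(seq) - width + 1):
            if matches(schema, seq[start:start + width]):
                hits += 1
                if ss[start + center] == label:
                    correct += 1
    confidence = correct / hits if hits else 0.0
    return hits, confidence


# Toy data (made up for illustration): H = helix, E = strand, C = coil.
toy_proteins = [("AKLAEL", "HHHHCC"), ("GALAPL", "CEHHCC")]
hits, confidence = schema_stats("A*L", center=1, label="H", proteins=toy_proteins)
print(hits, round(confidence, 3))
```

A score of this kind could serve as the fitness that a genetic algorithm maximizes while searching over schemata, although the fitness function actually used in the thesis is not stated in this record.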
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT008723812
http://hdl.handle.net/11536/47778
Appears in Collections: Thesis