Title: 利用蛋白質結構預測蛋白質內的重要功能位置
Prediction of functional sites of proteins from protein structures
Authors: 于松桓
Sung-Huan Yu
黃鎮剛
Jenn-Kang Hwang
Institute of Bioinformatics and Systems Biology
Keywords: Catalytic residues; protein contact number; protein centroid; thermal fluctuations; metal binding residues; chelate
Issue Date: 2007
Abstract: Chapter 1: Active site. Owing to rapid advances in structural genomics, a very large number of protein structures have been solved and deposited in the Protein Data Bank (PDB). As a result, the number of structures with unknown function is also growing, and it is increasingly important to be able to predict functional sites directly from protein structures. Several properties are associated with active-site residues: they tend to lie in regions of higher packing density, closer to the geometric center of the protein (structural centrality), and to show smaller thermal fluctuations. Based on these properties, we developed a simple method that computes per-residue profiles and uses them to detect the active sites of enzymes. With suitable threshold values for the profiles, the method detects up to 76% of catalytic residues with 27% false positives on a data set of 760 nonhomologous enzymes. When sequence information is added to weight the profiles, the sequence-weighted profile method detects 80% of catalytic residues with 20% false positives. The method requires neither sequence or structural alignment nor a structural template library, and it avoids solvent-accessible surface and molecular mechanics calculations. We believe it will be a useful tool for detecting possible active sites from protein structures and will complement existing methods.
Chapter 2: Metal binding site. Metal ions play crucial roles in organisms: they participate in enzyme catalysis, play regulatory roles, and help maintain protein structure. As the number of solved protein structures grows rapidly, predicting metal binding sites becomes increasingly important. For a metal ion to exist stably in a protein, it must form a chelate, and one important requirement for chelate formation is that enough atoms surround the metal ion to coordinate with it. This requirement closely resembles the distance-dependent protein contact-number (CN) model introduced in Chapter 1: if a residue is surrounded by many atoms that are likely to interact with a metal ion, that residue is likely to be a metal binding residue. In general, the atoms that chelate metal ions are nitrogen (N), sulfur (S), and oxygen (O). Following this idea, we apply the CN model but replace the Cα atoms with N, S, and O atoms to predict metal binding residues. On Sodhi's dataset, this method correctly predicts 72.4% of Ca, 94.7% of Cu, 86.5% of Fe, 77.6% of Mg, 88.5% of Mn, and 91.5% of Zn binding sites.
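The contact-number idea described in the abstract can be illustrated with a short sketch. This is not the thesis's implementation: the inverse-square distance weighting, the z-score normalization, the threshold value, the function names, and the toy coordinates are all assumptions chosen for illustration. The same function serves both chapters: for Chapter 1 the counted atoms are the Cα atoms themselves, and for Chapter 2 they are candidate donor atoms (N, S, O).

```python
from statistics import mean, pstdev


def weighted_contact_number(sites, atoms):
    """Distance-dependent contact number for each site.

    Assumed weighting: each atom contributes 1 / r^2, where r is its
    distance to the site. Atoms at zero distance (the site itself) are skipped.
    """
    profile = []
    for site in sites:
        total = 0.0
        for atom in atoms:
            r2 = sum((p - q) ** 2 for p, q in zip(site, atom))
            if r2 > 0.0:
                total += 1.0 / r2
        profile.append(total)
    return profile


def z_scores(values):
    """Normalize a profile to zero mean and unit (population) variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0.0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def predict_sites(profile, threshold):
    """Return indices of residues whose z-score is at least the threshold."""
    return [i for i, z in enumerate(z_scores(profile)) if z >= threshold]


# Chapter 1 style: toy C-alpha coordinates, one buried residue and one far away.
ca_coords = [(0, 0, 0), (1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (10, 0, 0)]
cn = weighted_contact_number(ca_coords, ca_coords)
candidates = predict_sites(cn, threshold=1.0)  # threshold is an assumed value

# Chapter 2 style: count hypothetical N/O/S donor atoms around each residue.
residue_coords = [(0, 0, 0), (20, 0, 0)]
donor_coords = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
donor_cn = weighted_contact_number(residue_coords, donor_coords)
```

In the toy data the residue at the origin is the most densely surrounded, so it is the only one whose normalized contact number exceeds the assumed threshold; with real structures the coordinates would come from a PDB file and the threshold would have to be tuned on a benchmark set.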
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009551501
http://hdl.handle.net/11536/39429
Appears in Collections:Thesis