Title: Inferring Protein-protein Interactions from Structural Domain-domain Interactions
Authors: 陳宏助
楊進木
Jinn-Moon Yang
Institute of Bioinformatics and Systems Biology
Keywords: protein-protein interaction; domain-domain interaction
Issue Date: 2004
Abstract: Domain-domain interactions are useful for validating, annotating, and predicting protein-protein interactions. However, no large-scale, experimentally determined set of domain-domain interactions is currently available. In this thesis, we develop an approach that computationally derives domain-domain interactions from 3D protein complexes of known structure and uses them to predict protein-protein interactions. We obtained 1008 interacting domain pairs from the Protein Data Bank (PDB) and assume that many other protein pairs interact through the same domain pairs. Using these interacting domain pairs and different thresholds of our new scoring function, our method predicted over 101483 protein-protein interactions. We built a domain-annotated protein-protein interaction database, termed DAPID, from these inferred interactions together with an experimental database, the Database of Interacting Proteins (DIP). DAPID contains 101483 protein-protein interactions (72%) inferred from the 1008 structural domain pairs in PDB, 1241 interactions (0.8%) taken directly from 3D complexes in PDB, and 38131 interactions (27%) recorded in DIP. DAPID mainly covers eight species: Homo sapiens, Mus musculus, Rattus norvegicus, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Helicobacter pylori, and Escherichia coli. Our results show that the values of our scoring function are highly positively correlated with both the ratio of true to false positives (TP/FP) and the gene-expression correlation coefficient. In addition, with the score threshold set at 0.5, our predicted protein pairs are more likely to be co-expressed than pairs in the non-interacting set (defined in Jansen et al., Science 2003) or in the DIP set for S. cerevisiae.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009251513
http://hdl.handle.net/11536/77495
Appears in Collections: Thesis


Files in This Item:

  1. 151301.pdf