Title: SCMPSP: Prediction and characterization of photosynthetic proteins based on a scoring card method
Authors: Vasylenko, Tamara
Liou, Yi-Fan
Chen, Hong-An
Charoenkwan, Phasit
Huang, Hui-Ling
Ho, Shinn-Ying
Department of Biological Science and Technology
Institute of Bioinformatics and Systems Biology
Issue Date: 21-Jan-2015
Abstract: Background: Photosynthetic proteins (PSPs) differ greatly in structure and function because they are involved in numerous subprocesses that take place inside an organelle called the chloroplast. Few studies predict PSPs from sequence, owing to the high variety of their sequences and structures. This work aims to predict and characterize PSPs by establishing datasets of PSP and non-PSP sequences and developing prediction methods. Results: A novel bioinformatics method for predicting and characterizing PSPs based on a scoring card method (SCMPSP) was used. First, a dataset of 649 PSPs, selected using the Gene Ontology term GO:0015979, and 649 non-PSPs was established from the SwissProt database with sequence identity <= 25%. Several prediction methods are presented, based on support vector machine (SVM), decision tree J48, Bayes, BLAST, and SCM. The SVM method using dipeptide features performed well and yielded a test accuracy of 72.31%. The SCMPSP method uses the estimated PSP propensity scores of the 400 dipeptides and has a test accuracy of 71.54%, which is comparable to that of the SVM method. The derived propensity scores of the 20 amino acids were further used to identify informative physicochemical properties for characterizing PSPs. The analytical results reveal the following four characteristics of PSPs: 1) PSPs favour amino acids with hydrophobic side chains; 2) PSPs are composed of amino acids prone to form helices in membrane environments; 3) PSPs have low interaction with water; and 4) PSPs prefer amino acids with electron-reactive side chains. Conclusions: The SCMPSP method not only estimates the propensity of a sequence to be a PSP but also discovers characteristics that further improve the understanding of PSPs. The SCMPSP source code and the datasets used in this study are available at http://iclab.life.nctu.edu.tw/SCMPSP/.
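Illustrative sketch (not part of the original record): the abstract states that SCMPSP scores a sequence from the propensity scores of the 400 dipeptides and derives propensity scores for the 20 amino acids, but it does not say how those scores are estimated. The Python below is a minimal sketch under stated assumptions, not the authors' implementation: the scorecard is initialised from the difference in mean dipeptide composition between the two classes and rescaled to 0..1000, a sequence score is the composition-weighted sum of dipeptide scores, and an amino acid score is the mean over the dipeptides that contain it. All function names and the threshold parameter are hypothetical, and any later tuning of the scores is omitted.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 20 x 20 = 400 ordered amino acid pairs.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(seq):
    """Relative frequency of each standard dipeptide in seq."""
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    # Dipeptides with non-standard residues are ignored.
    total = sum(counts[d] for d in DIPEPTIDES)
    return {d: (counts[d] / total if total else 0.0) for d in DIPEPTIDES}


def mean_composition(seqs):
    """Average dipeptide composition over a non-empty list of sequences."""
    comps = [dipeptide_composition(s) for s in seqs]
    return {d: sum(c[d] for c in comps) / len(comps) for d in DIPEPTIDES}


def initial_scorecard(positives, negatives):
    """ASSUMPTION: class composition difference rescaled to 0..1000."""
    pos = mean_composition(positives)
    neg = mean_composition(negatives)
    diff = {d: pos[d] - neg[d] for d in DIPEPTIDES}
    lo, hi = min(diff.values()), max(diff.values())
    span = (hi - lo) or 1.0  # avoid division by zero when all differences are equal
    return {d: 1000.0 * (diff[d] - lo) / span for d in DIPEPTIDES}


def sequence_score(seq, card):
    """Composition-weighted sum of dipeptide propensity scores."""
    comp = dipeptide_composition(seq)
    return sum(comp[d] * card[d] for d in DIPEPTIDES)


def classify(seq, card, threshold):
    """Predict PSP (True) when the score exceeds a chosen threshold."""
    return sequence_score(seq, card) > threshold


def amino_acid_propensity(card):
    """ASSUMPTION: mean score of the 40 dipeptide slots containing each residue."""
    return {
        a: sum(card[a + b] + card[b + a] for b in AMINO_ACIDS) / 40.0
        for a in AMINO_ACIDS
    }
```

With real data, the positive and negative lists would be the PSP and non-PSP training sequences, and the threshold would be chosen on the training set, for example to maximise training accuracy.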
URI: http://dx.doi.org/10.1186/1471-2105-16-S1-S8
http://hdl.handle.net/11536/124720
ISSN: 1471-2105
DOI: 10.1186/1471-2105-16-S1-S8
Journal: BMC BIOINFORMATICS
Volume: 16
Appears in Collections: Journal Articles