Prediction of Immune Diseases by Physicochemical Properties of Antigen Proteins
|Keywords:||Immune-related diseases; Immunogen; IEDB; Data mining; Physicochemical features; Feature selection|
|Abstract:||The immune system protects the human body against viruses, bacteria, parasites, and other foreign substances. Any foreign substance or organism that provokes an immune response is called an immunogen. Common immune-related diseases include allergies, autoimmune diseases, and infectious diseases, which are caused by allergens, autoantigens, and infectious pathogens, respectively. In recent years, the number of people suffering from immune-related diseases has been increasing because of factors such as geographical location and environmental pollution. Patients often require several months to recover fully, and severe cases can be fatal. It is therefore important both to reduce exposure to the immunogens that cause these diseases and to distinguish the diseases provoked by different immunogens so that appropriate treatment can be provided. While some studies have addressed diseases caused by allergens, little research has been conducted on diseases caused by autoantigens and pathogens. The objective of this thesis is two-fold: (1) to study the physicochemical properties of immunogens, and (2) to compare computational methods that classify immune-related diseases based on those properties. For this study, we collected 172 immunogens with known B-cell epitopes: 35 allergens, 34 autoantigens, and 103 pathogens. Experimental results show that 1-NN achieves higher classification accuracy than both C4.5 and SVM, and that the flexibility attribute plays the key role in distinguishing the three disease classes.|
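As a rough illustration of the 1-NN classification mentioned in the abstract, the sketch below assigns an antigen to the class of its nearest neighbour in a physicochemical feature space. It is a minimal sketch, not the thesis's implementation: the Euclidean distance, the leave-one-out evaluation, the two feature names, and every numeric value are assumptions made up for illustration, and the function names `predict_1nn` and `leave_one_out_accuracy` are hypothetical.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def predict_1nn(train, query):
    """Return the label of the training example nearest to `query`."""
    _, nearest_label = min(train, key=lambda example: dist(example[0], query))
    return nearest_label


def leave_one_out_accuracy(data):
    """Classify each example using all the others; return the fraction correct."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]  # hold out example i
        if predict_1nn(rest, features) == label:
            correct += 1
    return correct / len(data)


# Toy, made-up feature vectors (assumed layout: [flexibility, hydrophilicity]).
# These are NOT values from the thesis data set.
toy = [
    ([0.10, 0.20], "allergen"),
    ([0.15, 0.25], "allergen"),
    ([0.80, 0.90], "autoantigen"),
    ([0.85, 0.80], "autoantigen"),
    ([0.50, 0.10], "pathogen"),
    ([0.55, 0.05], "pathogen"),
]

print(predict_1nn(toy, [0.12, 0.22]))
print(leave_one_out_accuracy(toy))
```

With real data, each feature vector would hold the physicochemical attributes computed for one antigen sequence, and features on different scales would normally be normalised before distances are compared.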