Title: An Evaluation of the Robustness of MTS for Imbalanced Data - A Case Study of the Mobile Phone Test Process
Authors: Yu-Hsiang Hsiao (蕭宇翔)
Chao-Ton Su (蘇朝墩)
David Yung-Jye Sha (沙永傑)
Department of Industrial Engineering and Management
Keywords: data mining; classification; feature selection; imbalanced data; Mahalanobis-Taguchi System; MTS; probabilistic threshold; mobile phone
Issue Date: 2004
Abstract: Classification is one of the main tasks of data mining, and feature selection is often integrated into model building to improve classification efficiency. In binary classification problems, the ratio between the numbers of examples in the two classes of the training data is one of the factors that determine whether a classifier can learn the model correctly. A data set in which one class has many examples and the other has few is called imbalanced data. A model learned from imbalanced training data may be biased, which lowers its sensitivity in detecting the minority class, and this is not acceptable in real applications. The Mahalanobis-Taguchi System (MTS) is a new diagnosis and forecasting technique for multivariate data proposed by Dr. Genichi Taguchi. Unlike other classification methods, MTS establishes a classification model by constructing a continuous measurement scale rather than by learning from the training data, so it is less affected by the data distribution. This study used MTS and several other classification techniques to build reduced classification models and predict classes on imbalanced data, and found that MTS gave more robust and better results. In addition, this study proposed a probabilistic threshold based on Chebyshev's theorem as the classification criterion for MTS, and the threshold performed well. Finally, MTS was applied to the RF test process for mobile phones at a high-tech company in Taiwan, whose data are imbalanced. The results showed that the number of test attributes was significantly reduced while high inspection accuracy was maintained.
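The two ideas named in the abstract, a measurement scale built from one group and a threshold from Chebyshev's theorem, can be illustrated with a minimal sketch. This is not the thesis's procedure: the function names, the synthetic data, the choice of alpha, and the threshold form mean + std / sqrt(alpha) are all assumptions made for illustration. The sketch fits a scaled Mahalanobis distance on the normal group only, then sets a cutoff that, by Chebyshev's inequality, at most a fraction alpha of the normal group can reach.

```python
import math
import random
import statistics

def fit_scale(normal):
    """Fit the scale on the normal group only: means, stds, correlation matrix."""
    n, k = len(normal), len(normal[0])
    cols = list(zip(*normal))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) for c in cols]
    z = [[(row[j] - means[j]) / stds[j] for j in range(k)] for row in normal]
    corr = [[sum(r[a] * r[b] for r in z) / n for b in range(k)] for a in range(k)]
    return means, stds, corr

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    k = len(b)
    m = [list(a[i]) + [b[i]] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(k):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][k] / m[i][i] for i in range(k)]

def scaled_md(x, scale):
    """Mahalanobis distance of x on the fitted scale, divided by feature count."""
    means, stds, corr = scale
    k = len(x)
    z = [(x[j] - means[j]) / stds[j] for j in range(k)]
    w = solve(corr, z)  # w = corr^{-1} z
    return sum(zi * wi for zi, wi in zip(z, w)) / k

def chebyshev_threshold(mds, alpha):
    """Assumed threshold form: Chebyshev gives P(MD >= mu + sigma/sqrt(alpha)) <= alpha."""
    mu = statistics.fmean(mds)
    sigma = statistics.pstdev(mds)
    return mu + sigma / math.sqrt(alpha)

# Synthetic normal group (hypothetical data, three features, two correlated).
rng = random.Random(0)
normal = []
for _ in range(200):
    a, b, c = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    normal.append([a, 0.5 * a + b, c])

scale = fit_scale(normal)
mds = [scaled_md(x, scale) for x in normal]
threshold = chebyshev_threshold(mds, alpha=0.05)

# A hypothetical observation far from the normal group.
suspect = [10.0, -10.0, 10.0]
is_abnormal = scaled_md(suspect, scale) > threshold
```

Because the scale is fitted on the normal group alone, the number of abnormal examples does not enter the fit, which is one way to read the abstract's claim about imbalanced data. Chebyshev's inequality makes no distributional assumption, so the bound holds for any distribution of distances, at the cost of a conservative cutoff.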
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT009233544
http://hdl.handle.net/11536/77116
Appears in Collections: Thesis


Files in This Item:

  1. 354401.pdf