Title: A Goal-Programming Model for the Matching Problem in Observational Studies
Authors: 童宜亮 (Tung, Yi-Liang)
許錫美 (Hsu, Hsi-Mei)
洪暉智 (Hung, Hui-Chih)
Department of Industrial Engineering and Management
Keywords: Matching Problem; Copula; Mixed Integer Program; Meta-heuristic; Clustering Algorithm
Publication Date: 2013
Abstract: This thesis considers the matching problem in observational studies. Building on the copula decomposition of the joint attribute distribution, we propose a novel goal-programming model that jointly achieves group-level balance in the distribution of every attribute and individual-level balance in a distance measure, so that the treated group and the controls selected by the model are more comparable. The proposed model is nevertheless a mixed integer program (MIP) and inherits the computational difficulty of general MIPs. To alleviate this difficulty, we develop two heuristic algorithms, and numerical experiments demonstrate that their solutions are good enough for practical use. The first is a genetic algorithm, which in the experiments handles small-scale data sets effectively. The second is a clustering algorithm based on a modified Gaussian mixture model, intended mainly for large-scale data sets: pre-processing the data with the clustering heuristic shrinks the feasible region to be searched and thereby significantly reduces the computation time. We also compare our optimization model with two popular matching methods from the literature. In the numerical results, matching based on our optimization model performs best, in that the distributional discrepancy of the attributes between the treated group and the selected controls is the smallest among the methods compared. Future research directions for optimization-based matching are also discussed, with suggestions.
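To make the two balance goals in the abstract concrete, the following is a minimal sketch, not the thesis's model. It assumes a toy setting in which group-level balance is measured by differences in attribute means (rather than the copula-based distributional terms of the thesis) and individual-level balance by total L1 distance under the best one-to-one pairing. The function names `mean_gap`, `pair_distance`, and `best_match`, and the weights `w_group` and `w_pair`, are hypothetical. The search is exhaustive, so it only suits tiny inputs; this is the kind of blow-up that motivates the heuristics described above.

```python
from itertools import combinations, permutations


def mean_gap(treated, controls):
    """Group-level imbalance: sum over attributes of |difference in means|."""
    dims = range(len(treated[0]))
    return sum(
        abs(
            sum(t[d] for t in treated) / len(treated)
            - sum(c[d] for c in controls) / len(controls)
        )
        for d in dims
    )


def pair_distance(treated, controls):
    """Individual-level cost: total L1 distance under the best 1-to-1 pairing."""
    return min(
        sum(sum(abs(a - b) for a, b in zip(t, c)) for t, c in zip(treated, perm))
        for perm in permutations(controls)
    )


def best_match(treated, pool, w_group=1.0, w_pair=1.0):
    """Try every subset of len(treated) controls; keep the lowest weighted score.

    Both goals have a target of zero, so the weighted sum of the two terms
    acts as a simple weighted goal-programming objective (an assumption of
    this sketch, not the formulation used in the thesis).
    """
    best = None
    for idx in combinations(range(len(pool)), len(treated)):
        chosen = [pool[i] for i in idx]
        score = w_group * mean_gap(treated, chosen) + w_pair * pair_distance(
            treated, chosen
        )
        if best is None or score < best[0]:
            best = (score, idx)
    return best


# Toy data: two treated units and a pool of four candidate controls.
treated = [(1.0, 2.0), (3.0, 4.0)]
pool = [(1.0, 2.0), (10.0, 10.0), (3.0, 4.0), (0.0, 0.0)]
score, idx = best_match(treated, pool)
print(score, idx)
```

In this toy data the pool contains exact copies of both treated units, so the search selects pool indices 0 and 2 with a score of zero. Raising `w_group` relative to `w_pair` would favour subsets whose attribute means match the treated group even when individual pairs are farther apart.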
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070153337
http://hdl.handle.net/11536/73985
Appears in Collections: Thesis