標題: 應用於線上社群網路分析之雲端運算平行分群演算法
Truss-Based Clustering Algorithms in MapReduce for Analyzing Massive Online Social Network
作者: 謝子強
Hsieh, Tzu-Chiang
高榮鴻
Gau, Rung-Hung
電信工程研究所
關鍵字: 社群網路服務;社群網路分析;雲端計算;social networking service;social network analysis;MapReduce
公開日期: 2013
摘要: 隨著社群網路服務的興盛,營運商對於社群網路的資料分析十分感興 趣。然而,在大量資料的限制之下,傳統運行在單一電腦上的演算法已經無法承受使用者所產生的海量資料。因此,如何設計雲端的平行演算法便成為一項重要的課題。本文關注於設計高凝聚性子圖 (k − truss) 之雲端平行分群演算法,我們提出了兩個演算法來降低計算的時間成本。以上的演算法分別應用於兩種不同的情況,第一個演算法適用於靜態網路,意即對於社群網路的一個時間點做分析;第二個演算法適用於動態網路,對於一個社群網路選擇兩個較近的時間點,我們的演算法可以利用前一個時間點的分析結果來加速當前的運算。在驗證的部分,我們分別測試了人工合成的資料和實際的資料,包含了電子郵件網路、線上社群網路等。以真實的資料測試靜態網路分群演算法,在特定的情況可以達到加速的效果,若是處於效率較差的情形,演算法也可以自動切換成原始的法子避免計算成本的增加。以人工合成的資料測試動態網路分群演算法,在網路變化不大的情況之下可降低執行時間。
With the rise of social networking service, operators are very interested in the data analysis of social network. However, due to the large data size, traditional algorithms running on single computer cannot deal with it. Thus, how to design an algorithm in MapReduce is an important issue. The thesis focuses on designing algorithms for finding cohesive subgraph (k − truss) in MapReduce. We propose two algorithms applied to two situations. The first one is designed for static social networks. In other words, it can analyze a social network at a snapshot. The second one is designed for dynamic social networks. Given two close snapshots of a social network, we use the analytical results of the first one to speed up the analysis of the second one. We test the proposed algorithms by both synthetic data and real world data including email communication network, online social network and so on. The first algorithm is faster than the original one when applied to real world data. In case the proposed algorithm does not improve the performance, our algorithm can automatically switch to the original algorithm. We verify the second algorithm using synthetic data. It is faster than the original one if the change of networks is below a threshold.
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT070060216
http://hdl.handle.net/11536/72188
顯示於類別:畢業論文


文件中的檔案:

  1. 021601.pdf
  2. 021603.pdf