Title: Trajectory Pattern Mining in Social Media (社群媒體中的軌跡模式探勘)
Authors: 魏綾音
Wei, Ling-Yin
Peng, Wen-Chih
Keywords: Social Media; Trajectory Pattern Mining; Trajectory Search; Route Planning; Spatial Data Streams; Dual Clustering; Traffic Patterns; Travel Recommendation
Publication date: 2011
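Abstract: With the spread of Global Positioning System (GPS) applications, many mobile social applications have been developed, and people can use them to share their locations, photos, and activity information on social websites. For example, people can record a trip as a GPS trajectory or share the places they have visited through check-in applications. All of these data carry temporal and spatial information, and such user-generated spatio-temporal data form what are called trajectory data. In this dissertation, we study how to discover patterns in trajectory data and propose algorithms that apply them to trip planning and traffic analysis.

Existing research on trip planning mainly mines regions-of-interest (ROIs) and travel routes from user-shared trajectory data. However, people have different preferences when planning a trip. Therefore, given a user's query spatial range and the user's preference for breadth or depth of travel, we score the existing trajectories according to that preference and recommend the highest-scoring ones. We also propose an efficient online trajectory search algorithm. In practice, because of application characteristics or power saving, some shared trajectories are generated at a low sampling rate, which leaves the route between two consecutive sample points unclear. We therefore further propose a route inference framework for trip planning that finds popular travel routes from these uncertain trajectories. When a user queries a set of locations and gives an available time span, the framework finds popular routes that pass through these locations within that time.

We further use user-shared trajectory data to propose traffic estimation algorithms. When a user queries the traffic condition of a road segment at a certain time, we estimate the likely driving speed on that segment at that time. However, users do not always share their current trajectories in real time, so we extract spatio-temporal features from historical trajectory data and combine real-time and historical trajectories to estimate driving speed effectively. In addition, each region on the map can be regarded as a static sensor that detects user-shared data, and these sensors observe data streams that change over time. To find spatio-temporal features effectively, we also develop a clustering algorithm for spatial data streams, which can be applied to traffic analysis to find which regions have related traffic conditions at specific times.

In this dissertation, based on user-shared trajectory data, we propose a trajectory search framework and a route inference framework applicable to trip planning, as well as algorithms for traffic estimation and analysis. In the experiments, we use real and synthetic data to verify the effectiveness and efficiency of the algorithms.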
The increasing availability of location-acquisition technology, such as GPS, has led to a variety of mobile social applications through which people share their locations, photos, and activities on social websites. For example, people can record and share their trips on the Web as GPS tracks, or use check-in services to share the locations they have visited. Such user-generated spatio-temporal data can be viewed as trajectories. In this dissertation, we study how to mine patterns from trajectories and propose algorithms for pattern-aware trip planning and traffic analysis. Prior studies on trip planning have developed algorithms for mining regions-of-interest (ROIs) and travel routes from user-generated trajectories. However, people have different preferences when planning trips. Given a spatial range and a user-specified preference for depth or breadth, we first develop a pattern-aware trajectory search framework (PATS) to retrieve the top K trajectories passing through popular ROIs. In addition, we propose a bounded trajectory search algorithm (BTS) to retrieve the top K trajectories efficiently. Some user-contributed trajectories are generated at a low or irregular sampling frequency because of application characteristics or energy saving, which leaves the route between two consecutive points of a single trajectory uncertain (such a trajectory is called an uncertain trajectory). We further present a route inference framework based on collective knowledge (RICK) for trip planning. RICK constructs popular routes from uncertain trajectories. Specifically, given a location sequence and a time span, RICK constructs the top K routes that pass through the locations in order within the specified time span, by aggregating uncertain trajectories in a mutually reinforcing way. Moreover, we propose algorithms to estimate traffic from user-contributed GPS trajectories.
Given a query that specifies a road segment and a time, we aim to accurately estimate the traffic status (i.e., the driving speed) on the query road segment at the query time from user-generated driving trajectories. However, real-time traffic information may not always be available, or may be limited, so estimating traffic status from a small amount of GPS data is challenging. Note that traffic behavior at the same time usually follows similar patterns (the temporal feature), and nearby road segments have similar traffic behavior (the spatial feature). By exploiting the temporal and spatial features, more GPS data points can be retrieved. Based on the retrieved GPS data, we use a weighted moving average approach to estimate traffic status on road networks. In addition, to explore spatio-temporal features in traffic data in advance, we introduce a novel problem of clustering spatial data streams with constraints for traffic-related applications. Previous studies have formulated a dual clustering problem for spatial data that considers similarity-connected relationships in both the geographic and non-geographic domains. An area on the map can be regarded as a sensor that detects user-contributed data. For sensor data, we observe that the readings from one sensor remain similar for a period of time, that is, the readings exhibit temporal locality. We propose a hierarchical-based clustering algorithm (HBC) and an incremental clustering algorithm (IC) for clustering spatial data streams with constraints. In view of the increasing amount of user-contributed data from social media, we propose a pattern-aware trip planning framework, a framework for constructing popular routes, and algorithms for traffic estimation and traffic analysis. We conduct extensive experiments with real and synthetic datasets.
The experimental results show that our proposed frameworks and algorithms are efficient and effective.
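The abstract mentions estimating driving speed on a query road segment with a weighted moving average over GPS points retrieved via temporal and spatial features. The following Python sketch illustrates that general idea only. The `SpeedSample` fields, the `sample_weight` formula, and the `time_scale_min` parameter are all assumptions made for illustration; the record does not state the dissertation's actual weighting scheme.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SpeedSample:
    """One observed speed retrieved for a query (hypothetical structure)."""
    speed_kmh: float      # observed driving speed
    time_gap_min: float   # absolute gap between the sample time and the query time
    hop_distance: int     # 0 = the query segment itself, 1 = an adjacent segment, ...


def sample_weight(sample: SpeedSample, time_scale_min: float = 30.0) -> float:
    """Assumed weighting: samples closer in time and space count more."""
    temporal = 1.0 + sample.time_gap_min / time_scale_min
    spatial = 1.0 + sample.hop_distance
    return 1.0 / (temporal * spatial)


def estimate_speed(samples: List[SpeedSample]) -> Optional[float]:
    """Weighted average of the retrieved speeds; None if nothing was retrieved."""
    if not samples:
        return None
    weights = [sample_weight(s) for s in samples]
    weighted_sum = sum(w * s.speed_kmh for w, s in zip(weights, samples))
    return weighted_sum / sum(weights)
```

With this assumed weighting, a sample on the query segment at the query time has weight 1, while a sample 30 minutes away on an adjacent segment has weight 0.25, so the estimate leans toward the closer observation.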
Appears in Collections: Thesis