Title: 雲端數據管理上空間索引之鍵制定格式
Key Formulation Schemes for Spatial Index in Cloud Data Managements
Author: 許雅婷
Hsu, Ya-Ting
彭文志
Peng, Wen-Chih
Institute of Computer Science and Engineering
Keywords: cloud computing; cloud data managements; key-value; spatial index; multi-dimensional index
Issue Date: 2011
Abstract: Owing to its flexibility and scalability, cloud computing now plays an important role in large-scale data analysis. Several cloud data management systems (CDMs), such as HBase and Cassandra, have been developed to store data in the cloud. Such CDMs usually provide key-value storage, in which each key is used to access its corresponding value. Both HBase and Cassandra provide basic operations (e.g., Get, Set, and Scan) that retrieve values via user-specified keys. Existing CDMs fully inherit the characteristics of cloud computing, such as high scalability and availability, and are therefore widely used to store Web data, especially for search engines. However, with the proliferation of smart phones and location-based services, data with spatial information, referred to as spatial data, are increasing dramatically. Consequently, how to formulate keys for spatial data in existing CDMs is a challenging issue. In this paper, we develop several key formulation schemes. In particular, we propose a novel key formulation scheme based on the R+-tree (abbreviated as KR+-index). With this key design, existing CDMs can retrieve spatial data efficiently. Based on the KR+-index, we design algorithms for two commonly used spatial queries, the range query and the k-NN query. Moreover, we implement the proposed key formulation schemes on Cassandra and import synthetic spatial data to evaluate both queries. The experimental results show that the KR+-index outperforms other existing key formulation schemes and MD-HBase, and the improvement is especially pronounced when the spatial data distribution is highly skewed.
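To make the idea of formulating keys for spatial data concrete, the following is a minimal sketch of one generic approach: a uniform grid whose cells are numbered along a Z-order (Morton) curve. It is not the KR+-index, whose construction is not described in this record. The grid resolution, the unit-square data domain, and the names ToyStore, cell_of, z_order_key, and range_query are all assumptions made for illustration, and the in-memory dictionary only stands in for a key-value store; it is not a Cassandra or HBase client.

```python
from collections import defaultdict

GRID_BITS = 4                 # assumption: 16 x 16 grid over the unit square
GRID_SIZE = 1 << GRID_BITS


def cell_of(x, y):
    """Map a point in [0, 1] x [0, 1] to integer grid coordinates."""
    cx = min(int(x * GRID_SIZE), GRID_SIZE - 1)
    cy = min(int(y * GRID_SIZE), GRID_SIZE - 1)
    return cx, cy


def z_order_key(cx, cy):
    """Interleave the bits of (cx, cy) into one integer key (Morton code)."""
    key = 0
    for i in range(GRID_BITS):
        key |= ((cx >> i) & 1) << (2 * i)
        key |= ((cy >> i) & 1) << (2 * i + 1)
    return key


class ToyStore:
    """Hypothetical in-memory stand-in for a key-value store."""

    def __init__(self):
        self._rows = defaultdict(list)

    def put(self, x, y, payload):
        # The key is derived from the grid cell that contains the point.
        self._rows[z_order_key(*cell_of(x, y))].append((x, y, payload))

    def get(self, key):
        # Return the points stored under one cell key (empty if none).
        return self._rows.get(key, [])


def range_query(store, x_min, y_min, x_max, y_max):
    """Return the sorted payloads of points inside the closed query rectangle."""
    cx_lo, cy_lo = cell_of(x_min, y_min)
    cx_hi, cy_hi = cell_of(x_max, y_max)
    hits = []
    # Issue one Get per grid cell that overlaps the query rectangle,
    # then filter the returned points against the exact bounds.
    for cx in range(cx_lo, cx_hi + 1):
        for cy in range(cy_lo, cy_hi + 1):
            for x, y, payload in store.get(z_order_key(cx, cy)):
                if x_min <= x <= x_max and y_min <= y <= y_max:
                    hits.append(payload)
    return sorted(hits)
```

With a fixed grid such as this one, a dense region places many points under a single key, which is one reason a skewed data distribution is a natural stress case for key formulation schemes.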
URI: http://140.113.39.130/cdrfb3/record/nctu/#GT079955622
http://hdl.handle.net/11536/50529
Appears in Collections: Thesis


Files in This Item:

  1. 562201.pdf