Research on Parallel Design of DBSCAN Clustering Algorithm in Spatial Data Mining

Gong-jian ZHOU

Abstract


The DBSCAN clustering algorithm uses fixed Eps and MinPts parameters. When the density distribution of the data is uneven, the clustering quality suffers, and the time complexity of the algorithm is O(n^2). To address these problems, this paper proposes a grid-based parallel DBSCAN clustering algorithm and two cluster-merging strategies on the Spark platform. The algorithm narrows the Eps-neighborhood search for each data object to its eight adjacent grid cells, runs local clustering on the data partitions in parallel, and then quickly merges the local results into a global clustering. Experiments show that the improved parallel DBSCAN algorithm achieves a better speedup ratio and scalability.
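
The idea of restricting the Eps-neighborhood search to adjacent grid cells can be illustrated with a minimal single-machine sketch. It is not the paper's Spark implementation, which the abstract does not detail. It assumes 2-D points, Euclidean distance, and a grid whose cell side equals Eps, so that every neighbor within Eps of a point lies in the point's own cell or one of its eight adjacent cells. The names `build_grid` and `eps_neighbors` are hypothetical.

```python
from collections import defaultdict
from math import floor, hypot


def build_grid(points, eps):
    """Map each grid cell (assumed side length: eps) to the indices of the points in it."""
    grid = defaultdict(list)
    for idx, (x, y) in enumerate(points):
        grid[(floor(x / eps), floor(y / eps))].append(idx)
    return grid


def eps_neighbors(points, grid, eps, idx):
    """Return indices of points within eps of points[idx], the point itself included.

    Only the point's own cell and its eight adjacent cells are scanned,
    instead of the whole data set.
    """
    x, y = points[idx]
    cx, cy = floor(x / eps), floor(y / eps)
    result = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            # .get avoids creating empty cells in the defaultdict
            for j in grid.get((cx + dx, cy + dy), ()):
                qx, qy = points[j]
                if hypot(qx - x, qy - y) <= eps:
                    result.append(j)
    return result


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.5, 0.0), (0.9, 0.9), (3.0, 3.0)]
    g = build_grid(pts, eps=1.0)
    print(sorted(eps_neighbors(pts, g, 1.0, 0)))
```

With roughly uniform data, each query touches only the points in nine cells rather than all n points, which is what reduces the cost of the neighborhood search relative to the O(n^2) baseline. In a parallel setting, cells would also serve as a natural unit for partitioning the data before local clustering.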

Keywords


Distributed computing, DBSCAN algorithm, Spark, MapReduce, Data segmentation


DOI
10.12783/dtetr/ecar2018/26370
