Research on Parallel Design of DBSCAN Clustering Algorithm in Spatial Data Mining
Abstract
The DBSCAN clustering algorithm uses fixed Eps and MinPts parameters. When the density distribution is uneven, the clustering result is not ideal, and the time complexity of the algorithm is O(n^2). To address these problems, this paper proposes a parallel grid-based DBSCAN clustering algorithm and two cluster merging strategies on the Spark platform. The algorithm narrows the Eps-neighborhood search to the data objects within the eight adjacent grid cells, performs local clustering on the data partitions in parallel, and then merges the local results quickly into a global clustering. Experiments show that the improved parallel DBSCAN algorithm achieves a better speedup ratio and scalability.
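The idea of restricting the Eps-neighborhood search to adjacent grid cells can be illustrated with a minimal single-machine sketch. It is not the paper's Spark implementation: it assumes two-dimensional points and a grid whose cell side equals Eps (so that every point within Eps of a query point lies in the query's own cell or one of its eight neighbors), and the names `build_grid` and `eps_neighbors` are hypothetical.

```python
import math
from collections import defaultdict


def build_grid(points, eps):
    """Map each grid cell (col, row) to the indices of the points inside it.

    Assumption: the cell side length equals eps.
    """
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(math.floor(x / eps), math.floor(y / eps))].append(i)
    return grid


def eps_neighbors(points, grid, i, eps):
    """Return the sorted indices of points within eps of point i (i included).

    Only the cell containing point i and its eight adjacent cells are scanned,
    instead of comparing point i against every other point.
    """
    x, y = points[i]
    cx, cy = math.floor(x / eps), math.floor(y / eps)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            # .get avoids creating empty cells in the defaultdict
            for j in grid.get((cx + dx, cy + dy), ()):
                px, py = points[j]
                if (px - x) ** 2 + (py - y) ** 2 <= eps ** 2:
                    found.append(j)
    return sorted(found)


# Illustrative data, not taken from the paper.
points = [(0.0, 0.0), (0.5, 0.5), (1.2, 0.1), (3.0, 3.0), (0.9, -0.4)]
grid = build_grid(points, eps=1.0)
print(eps_neighbors(points, grid, 0, eps=1.0))  # → [0, 1, 4]
```

Because each query inspects only nine cells, the cost per point depends on the local point density rather than on the full data set size, which is what makes the all-pairs O(n^2) comparison avoidable when the data are not concentrated in a few cells.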
Keywords
Distributed computing, DBSCAN algorithm, Spark, MapReduce, Data segmentation
DOI
10.12783/dtetr/ecar2018/26370