Similarity Code File Detection Model Based on Frequent Itemsets

Jian-hong JIANG; Ke WANG

doi:10.12783/dtcse/CCNT2018/24709

Abstract

To improve the efficiency and accuracy of source code similarity detection, this paper addresses several deficiencies of current detection methods and proposes a similar-code detection model based on frequent itemsets. The model constructs frequent itemset data to discover collections of repeated code and to automatically group files by similarity. Because the model does not need to consider the type of code during detection, it is widely applicable: it can detect similar code files across different programming languages and grammars, mark the similar code, and compile statistics on the results. Experimental comparison shows that the model achieves high accuracy and processing efficiency.
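
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea, under stated assumptions: each whitespace-normalized source line is treated as an item, each file as a transaction, and lines that occur in at least `min_support` files are grouped by the exact set of files containing them. The function name `shared_line_sets` and the parameters `min_support` and `min_lines` are hypothetical and do not come from the paper, and this grouping is a simplification rather than a full frequent-itemset miner such as Apriori.

```python
from collections import defaultdict


def normalize(line):
    # Collapse whitespace so formatting differences do not hide a match.
    return " ".join(line.split())


def shared_line_sets(files, min_support=2, min_lines=2):
    """Group repeated lines by the set of files that contain them.

    files: dict mapping a file name to its source text (any language).
    Returns a dict mapping a frozenset of file names to the sorted list
    of normalized lines found in exactly those files. Each returned list
    is an itemset whose support is at least the size of its file set.
    Names and thresholds here are illustrative assumptions.
    """
    # Inverted index: normalized line -> set of files containing it.
    occurrences = defaultdict(set)
    for name, text in files.items():
        for raw in text.splitlines():
            line = normalize(raw)
            if line:
                occurrences[line].add(name)

    # Keep lines that are frequent, then group them by their file set.
    groups = defaultdict(list)
    for line, names in occurrences.items():
        if len(names) >= min_support:
            groups[frozenset(names)].append(line)

    # Discard groups too small to suggest copied code.
    return {
        names: sorted(lines)
        for names, lines in groups.items()
        if len(lines) >= min_lines
    }
```

Because the sketch compares only normalized text lines, it needs no parser and applies to files in any language, which mirrors the language independence claimed in the abstract. It would not catch renamed identifiers or reordered statements.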

Keywords

Source code similarity, Frequent itemset, Association rule
