A Weighted Method to Improve the Centroid-based Classifier

Chuan LIU, Wen-yong WANG, Guang-hui TU, Nan-nan LIU, Yu XIANG

Abstract


The Centroid-Based Classifier (CBC) is one of the most widely used text classification methods because of its theoretical simplicity and computational efficiency. However, its accuracy is unsatisfactory on data with a skewed distribution. In this paper, we propose a new classification model, the Gravitation Model (GM), to address this model misfit of CBC. In the proposed model, each category is assigned a mass factor that indicates its distribution in the vector space; this factor is learned from the training data. We compare GM with CBC and its improved variants in experiments on twelve real datasets. The results show that the proposed gravitation model consistently outperforms CBC. Furthermore, it matches the performance of the best centroid-based classifier while being more stable.
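
The idea of a per-category mass factor can be illustrated with a minimal sketch. The abstract does not state the model's scoring formula or its learning rule, so everything specific here is an assumption made for illustration and not the authors' method: the score is taken to be the mass times the cosine similarity to the class centroid, and the masses are adjusted by a simple mistake-driven update. The function names and the `epochs` and `step` parameters are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fit_centroids(X, y):
    """Plain CBC step: one centroid per class, the normalized sum of its unit vectors."""
    sums = {}
    for vec, label in zip(X, y):
        unit = normalize(vec)
        if label not in sums:
            sums[label] = [0.0] * len(unit)
        sums[label] = [s + u for s, u in zip(sums[label], unit)]
    return {label: normalize(total) for label, total in sums.items()}


def predict(vec, centroids, mass):
    """Return the class with the highest mass-weighted cosine similarity.

    ASSUMPTION: score = mass * cosine. With all masses equal to 1.0 this
    reduces to the ordinary centroid-based classifier.
    """
    unit = normalize(vec)

    def score(label):
        cosine = sum(u * c for u, c in zip(unit, centroids[label]))
        return mass[label] * cosine

    # Sorting the labels makes tie-breaking deterministic.
    return max(sorted(centroids), key=score)


def fit_masses(X, y, centroids, epochs=20, step=0.05):
    """Learn one mass factor per class from the training data.

    ASSUMPTION: a mistake-driven update. When a document is misclassified,
    the true class's mass grows and the wrongly predicted class's mass shrinks.
    """
    mass = {label: 1.0 for label in centroids}
    for _ in range(epochs):
        errors = 0
        for vec, label in zip(X, y):
            guess = predict(vec, centroids, mass)
            if guess != label:
                mass[label] *= 1 + step
                mass[guess] *= 1 - step
                errors += 1
        if errors == 0:
            break  # every training document is classified correctly
    return mass
```

Under these assumptions, a category whose documents are spread widely around its centroid tends to receive a larger mass, which moves the decision boundary away from that centroid and toward the more compact category.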

Keywords


Text categorization, Centroid-based classifier, Machine learning, Gravitation model.


DOI
10.12783/dtetr/iceea2016/6704
