College Students-Oriented Internet Public Opinion System
Abstract
To improve Internet public opinion monitoring in colleges, this paper designs an efficient Internet public opinion system for college students. The system uses a distributed web crawler to collect structured information from news websites, social networking sites, BBS, and blogs that are widely visited by college students. To speed up text clustering, the Single-Pass algorithm is parallelized with Spark, and the parallel Single-Pass algorithm is used to find hot topics in the collected texts. Distributed processing significantly improves clustering efficiency. Simulation results show that the proposed system achieves timely and accurate public opinion monitoring.
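For readers unfamiliar with Single-Pass clustering, the following is a minimal serial sketch of the general technique, not the paper's implementation. The abstract does not specify the feature extraction, the similarity measure, or the threshold, so the bag-of-words vectors, cosine similarity, the 0.3 threshold, and the function names (vectorize, cosine, single_pass) are all assumptions made for illustration. The Spark parallelization described above is not shown.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a text into a term-frequency vector (assumed bag-of-words features)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def single_pass(texts, threshold=0.3):
    """Scan texts once; join the most similar cluster or open a new one.

    The threshold value is an illustrative assumption, not taken from the paper.
    """
    centroids = []  # summed term-frequency vectors, one per cluster
    clusters = []   # lists of text indices, one list per cluster
    for index, text in enumerate(texts):
        vector = vectorize(text)
        best, best_sim = None, 0.0
        for cid, centroid in enumerate(centroids):
            sim = cosine(vector, centroid)
            if sim > best_sim:
                best, best_sim = cid, sim
        if best is not None and best_sim >= threshold:
            clusters[best].append(index)
            # Counter.update adds counts, so the centroid stays a running sum;
            # cosine similarity is unaffected by the missing division by size.
            centroids[best].update(vector)
        else:
            clusters.append([index])
            centroids.append(Counter(vector))
    return clusters
```

Called on four short texts, two about an exam schedule and two about cafeteria prices, single_pass returns [[0, 1], [2, 3]]: each text is compared once against the existing cluster centroids, which is what makes the method suitable for a continuously arriving stream of posts.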
Keywords
Distributed web crawler, Feature extraction, Text clustering, Parallelization
DOI
10.12783/dtcse/wcne2017/19827