Research and Implementation of Web Crawler Based on Learning Resources

Xiang-wei TAN, Yi ZHANG, Zheng-jun PAN

Abstract


Effective learning resources are currently difficult to acquire in college software teaching. To address this, the open-source web crawler framework Heritrix is redeveloped: the crawler's network resource identification and data screening functions are studied, and a theme-based data acquisition scheme is proposed. On this basis, a web crawler system for designated themes is implemented to meet the need for high-quality learning resources in university software teaching.

Keywords


Learning Resources, Web Crawler, Heritrix


DOI
10.12783/dtcse/cmee2017/19963
