One Method of Keyword Extraction for Tibetan News Webpage

TAO XU, XIANGZHEN HE, FUCHENG WAN, XIANGHE MENG

Abstract


Keywords of Tibetan text are important for text clustering and categorization, automatic abstracting, information retrieval (IR), and related tasks. However, Tibetan news webpages do not come with keywords. Moreover, many keyword extraction algorithms require a manually annotated corpus, so they are difficult to extend. Because keywords can be regarded as a set of words that are important and cohesively related to the subject of a document, this paper improves the chi-squared statistic and uses the idea of recommendation to extract keywords. Experiments on Tibetan news webpages demonstrate that this method performs better than TF-IDF combined with location information.
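
As a rough illustration of the kind of statistic the abstract refers to, the following Python sketch computes the standard Pearson chi-squared value for a 2x2 co-occurrence table and uses it to rank words. It is not the improved statistic or the recommendation step proposed in the paper, whose details are not given here. The function names `chi_squared` and `rank_by_cooccurrence`, the sentence-level co-occurrence counting, and the summation of scores are all assumptions made for illustration, and the input is assumed to be already segmented into words.

```python
def chi_squared(a, b, c, d):
    """Pearson chi-squared for a 2x2 contingency table [[a, b], [c, d]].

    a: units containing both words, b: only the first word,
    c: only the second word, d: neither word.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero marginal means the association is undefined; treat as 0.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def rank_by_cooccurrence(sentences):
    """Rank words by summed chi-squared association with all other words.

    `sentences` is a list of token lists (segmentation is assumed to be
    done beforehand). Summing the scores is an illustrative assumption,
    not the paper's scoring rule.
    """
    sets = [set(s) for s in sentences]
    vocab = sorted(set().union(*sets)) if sets else []
    scores = {}
    for w in vocab:
        total = 0.0
        for g in vocab:
            if g == w:
                continue
            a = sum(1 for s in sets if w in s and g in s)
            b = sum(1 for s in sets if w in s and g not in s)
            c = sum(1 for s in sets if w not in s and g in s)
            d = len(sets) - a - b - c
            total += chi_squared(a, b, c, d)
        scores[w] = total
    # Highest score first; ties keep alphabetical order (stable sort).
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Under these assumptions, words that consistently appear together with (or apart from) the other words in a document receive higher scores, which loosely mirrors the idea of keywords being "cohesively related" to the subject; no annotated corpus is needed for this computation.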

Keywords


Tibetan information processing, chi-squared statistic, keyword extraction


DOI
10.12783/dtcse/iceiti2017/18815
