Autosoft Journal

Online Manuscript Access


Chinese WeChat and Blog Hot Words Detection Method Based on Chinese Semantic Clustering


Authors



Abstract

This paper proposes a hot topic detection method based on Chinese semantic clustering. The method targets the high dimensionality and information fragmentation of Chinese WeChat data. To handle the sparse, fragmented content of Chinese WeChat and blog data, we combine three strategies (repeated string computation, context adjacency analysis, and linguistic rule filtering) to extract meaningful sentences that express independent and complete semantics. We then model the Chinese WeChat data in a relatively small and meaningful string space, generate candidate topics via feature clustering, and select the hot topics by heat ranking. Experimental results on WeChat and blog data show that the method reduces the dimensionality of the high-dimensional sparse feature space to some extent, and that it is effective and feasible for WeChat hot topic detection.
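The abstract does not give implementation details, so the following is only a minimal sketch of how the first two strategies (repeated string computation and context adjacency analysis) might be combined. The function names, the n-gram length range, the thresholds, and the choice to treat a text boundary as a single neighbour symbol are all assumptions made for illustration, not the authors' settings; linguistic rule filtering, clustering, and heat ranking are not shown.

```python
from collections import Counter

def repeated_strings(docs, min_len=2, max_len=4, min_freq=2):
    """Count every character n-gram and keep those occurring at least min_freq times.

    min_len, max_len and min_freq are illustrative defaults, not values from the paper.
    """
    counts = Counter()
    for doc in docs:
        for n in range(min_len, max_len + 1):
            for i in range(len(doc) - n + 1):
                counts[doc[i:i + n]] += 1
    return {s: c for s, c in counts.items() if c >= min_freq}

def accessor_variety(docs, candidate):
    """Return min(distinct left neighbours, distinct right neighbours) of candidate.

    Assumption: a text boundary counts as one special neighbour symbol.
    """
    left, right = set(), set()
    for doc in docs:
        start = doc.find(candidate)
        while start != -1:
            end = start + len(candidate)
            left.add(doc[start - 1] if start > 0 else "<s>")
            right.add(doc[end] if end < len(doc) else "</s>")
            start = doc.find(candidate, start + 1)
    return min(len(left), len(right))

def extract_candidates(docs, min_av=2, **kwargs):
    """Keep repeated strings whose contexts vary enough to suggest a standalone unit."""
    repeated = repeated_strings(docs, **kwargs)
    return {s: c for s, c in repeated.items()
            if accessor_variety(docs, s) >= min_av}
```

Under these assumptions, a string that repeats but always appears in the same context (for example, a fragment of a longer fixed phrase) is discarded, while a string that repeats in varied contexts is kept as a candidate feature for the later clustering step.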


Keywords


Pages

Total Pages: 6
Pages: 613-618

DOI
10.1080/10798587.2017.1316075



Published

Volume: 23
Issue: 4
Year: 2017


JOURNAL INFORMATION


ISSN PRINT: 1079-8587
ISSN ONLINE: 2326-005X
DOI PREFIX: 10.31209 (10.1080/10798587 with T&F)
IMPACT FACTOR: 0.652 (2017/2018)
Journal: 1995-Present



