Autosoft Journal


Implementation of Web Mining Algorithm Based on Cloud Computing


Authors



Abstract

The rapid growth of the Internet has exceeded all expectations. The analysis and mining of huge amounts of web data now faces bottlenecks in computing power and storage space. Cloud computing technology provides network access to powerful computing resources, storage capacity and infrastructure. By offering a highly reliable and scalable center for data processing and storage, it can relieve these bottlenecks, improving the ability to process web data and reducing the requirements placed on terminal devices. This paper studies web mining algorithms in a cloud computing environment, combining web data mining algorithms with the MapReduce programming model. We study web mining techniques, especially the K-centers clustering algorithm, explore how web mining algorithms can be combined with cloud computing technology, and improve the data mining algorithms so that they can analyze and process massive web data on cloud computing platforms. We construct a distributed cloud environment using the Hadoop framework. In this experimental environment, we analyze the impact of different block size settings on computational performance. Here, the block size determines the number of splits into which the pending data file is divided, and hence the scale and degree of parallel computation.
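
To make the relationship between block size, splits and parallel work concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes one input split per block, and because the abstract does not spell out the K-centers update rule, it substitutes a plain mean-based center update (as in k-means) as a stand-in. All function names (`num_splits`, `map_split`, `reduce_centers`, `iterate`) are hypothetical and the code runs in a single process, imitating the map and reduce phases rather than calling Hadoop.

```python
from math import ceil


def num_splits(file_size_bytes, block_size_bytes):
    """Number of input splits, assuming exactly one split per block."""
    return ceil(file_size_bytes / block_size_bytes)


def split_records(records, records_per_split):
    """Cut the record list into consecutive chunks, one per simulated map task."""
    return [records[i:i + records_per_split]
            for i in range(0, len(records), records_per_split)]


def nearest(point, centers):
    """Index of the center with the smallest squared Euclidean distance."""
    return min(range(len(centers)),
               key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centers[j])))


def map_split(split, centers):
    """Map phase for one split: per-center partial coordinate sums and counts."""
    partial = {}
    for point in split:
        j = nearest(point, centers)
        sums, count = partial.get(j, ([0.0] * len(point), 0))
        partial[j] = ([s + p for s, p in zip(sums, point)], count + 1)
    return list(partial.items())


def reduce_centers(mapped, centers):
    """Reduce phase: merge partial sums from all splits and recompute centers."""
    totals = {}
    for pairs in mapped:
        for j, (sums, count) in pairs:
            acc_sums, acc_count = totals.get(j, ([0.0] * len(sums), 0))
            totals[j] = ([a + s for a, s in zip(acc_sums, sums)], acc_count + count)
    new_centers = []
    for j, old in enumerate(centers):
        if j in totals:
            sums, count = totals[j]
            new_centers.append(tuple(s / count for s in sums))
        else:
            new_centers.append(tuple(old))  # keep a center that received no points
    return new_centers


def iterate(records, centers, records_per_split):
    """One clustering iteration; each split could run as an independent map task."""
    splits = split_records(records, records_per_split)
    mapped = [map_split(split, centers) for split in splits]
    return reduce_centers(mapped, centers)


if __name__ == "__main__":
    mb = 1024 * 1024
    for block_mb in (32, 64, 128):
        print(block_mb, "MB blocks:", num_splits(1024 * mb, block_mb * mb), "splits")
    points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
    print(iterate(points, [(0.0, 0.0), (10.0, 10.0)], records_per_split=2))
```

In this sketch the split size changes only how many map tasks exist, not the resulting centers, since the partial sums from every split are merged in the reduce step. A smaller block size therefore yields more, smaller tasks, which is the parameter the experiments vary.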


Keywords


Pages

Total Pages: 6
Pages: 599-604

DOI
10.1080/10798587.2017.1316077




Published

Volume: 23
Issue: 4
Year: 2017


References

Bezdek, J. C. "Numerical Taxonomy with Fuzzy Sets." Journal of Mathematical Biology 1.1 (1974): 57-71. Crossref. Web. https://doi.org/10.1007/BF02339490

Dunn, J. C. "Well-Separated Clusters and Optimal Fuzzy Partitions." Journal of Cybernetics 4.1 (1974): 95-104. Crossref. Web. https://doi.org/10.1080/01969727408546059

Fang X.J. Journal of Chemical and Pharmaceutical Research 5.12 (2013)

Huang Z.X. DMKD

Lam, C. (2010, July). Hadoop in action. Connecticut: Manning Publications, p. 325.

Langville, A.N. & Meyer, C.D. (2006). Google’s pagerank and beyond: The science of search engine rankings. Princeton: Princeton University Press, p. 28.

LEI Lei. "Towards a High Performance Virtual Hadoop Cluster." Journal of Convergence Information Technology 7.6 (2012): 292-303. Crossref. Web. https://doi.org/10.4156/jcit.vol7.issue6.35

Mahendiran A. Research Journal of Applied Sciences, Engineering and Technology 4.10 (2012)

Ruan, Shen. "Based on Cloud-Computing's Web Data Mining." Communications and Information Processing (2012): 241-248. Crossref. Web. https://doi.org/10.1007/978-3-642-31968-6_29

Liangfei Xue, Mingyan Jiang, and Dongfeng Yuan. "Cloud Computing Model in Web Data Mining." Journal of Convergence Information Technology 7.22 (2012): 585-592. Crossref. Web. https://doi.org/10.4156/jcit.vol7.issue22.69

Zhang, Feng, and Li Liu. "Research on Data Mining Technology in Web Based on the Cloud Computing." Advanced Materials Research 532-533 (2012): 919-923. Crossref. Web. https://doi.org/10.4028/www.scientific.net/AMR.532-533.919

Xuejie, Zhang, Wang Zhijian, and Xu Feng. "Reliability Evaluation of Cloud Computing Systems Using Hybrid Methods." Intelligent Automation & Soft Computing 19.2 (2013): 165-174. Crossref. Web. https://doi.org/10.1080/10798587.2013.786969

JOURNAL INFORMATION


ISSN PRINT: 1079-8587
ISSN ONLINE: 2326-005X
DOI PREFIX: 10.31209
10.1080/10798587 with T&F
IMPACT FACTOR: 0.652 (2017/2018)
Journal: 1995-Present



