Autosoft Journal

Online Manuscript Access

An Improved k-nearest Neighbor Algorithm using Tree Structure and Pruning Technology

by Juan Li*


K-Nearest Neighbor algorithm (KNN) is a simple and mature classification method. However there are susceptible factors influencing the classification performance, such as k value determination, the overlarge search space, unbalanced and multi-class patterns, etc. To deal with the above problems, a new classification algorithm that absorbs tree structure, tree pruning and adaptive k value method was proposed. The proposed algorithm can overcome the shortcoming of KNN, improve the performance of multi-class and unbalanced classification, reduce the scale of dataset maintaining the comparable classification accuracy. The simulations are conducted and the proposed algorithm is compared with several existing algorithms. The results indicate that the proposed algorithm can obtain higher classification efficiency and smaller search reference set on UCI datasets. Furthermore, the proposed algorithm can overcome the shortcoming of KNN and improve the performance of multi-class and unbalanced classification. This illustrates that the proposed algorithm is an expedient method in design nearest neighbour classifiers.



Total Pages: 14


Manuscript ViewPdf Subscription required to access this document

Obtain access this manuscript in one of the following ways

Already subscribed?

Need information on obtaining a subscription? Personal and institutional subscriptions are available.

Already an author? Have access via email address?


Online Article

Cite this document


Aha, D. W., Kibler, D., and Albert, M. K. (1991). Instance-based learning algorithms, Machine Learning, 6(1), 37-66.

Barandela, Ricardo, Ferri, F. J., and Sánchez, J. S.(2005). Decision boundary preserving prototype selection for nearest neighbor classification, International Journal of Pattern Recognition and Artificial Intelligence, 19(6): 787-806.

Fayed, H. A. & Atiya, A. F. (2009). A novel template reduction approach for the k-nearest neighbor method, IEEE Transactions on Neural Networks, 20(5), 890-896.

Franti, P., Virmajoki, O., and Hautamaki, V.(2006). Fast agglomerative clustering using a k-nearest neighbor graph. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(11): 1875-1881.

Friedman, J. H., Bentley, J. H., and Finkel, R. A. (1977). An algorithm for finding best matches in logarithmic expected time, ACM Transactions on Mathematical Software, 3(3): 209-226.

Gong, A. & Liu, Y. (2011). Improved KNN classification algorithm by dynamic obtaining K, ECWAC 2011, Part I, CCIS 143: 320-324.

Han, Eui-Hong, George Karypis, and Vipin Kumar. "Text Categorization Using Weight Adjusted k-Nearest Neighbor Classification." Lecture Notes in Computer Science (2001): 53-65. Crossref. Web.

Haro-García, A. D. & García-Pedrajas, N. (2009). A divide-and-conquer recursive approach for scaling up instance selection algorithms, Data Mining & Knowledge Discovery, 18(3), 392-418.

Jagadish, H. V., Ooi, B. C., Tan, K. L., Yu, C., and Zhang, R. (2005). Idistance:an adaptive b+-tree based indexing method for nearest neighbor search, ACM Transactions on Database Systems, 30(2), 364-397.

Khazaee, Ali & Ebrahimzadeh, Ataollah. (2013). Heart arrhythmia detection using support vector machines, Intelligent Automation and Soft Computing, 19(1), 1-9.

Köknar-Tezel, S. & Latecki, L. J. (2011). Improving svm classification on imbalanced time series data sets with ghost points, Knowledge & Information Systems, 28(1), 1-23.

Krishna, K., Thathachar, M. A. L. and Ramakrishnan, K. R. (2000). Voronoi networks and their probability of misclassification, IEEE Transactions on Neural Networks, 11(6): 1361-1372.

Li, B., Qin, L., and Yu, S. (2004). An adaptive k-nearest neighbor text categorization strategy, ACM Transactions on Asian Language Information Processing, 3(4), 215-226.

Li, X., Shi, D., Charastrakul, V., and Zhou, J. (2009). Advanced p-tree based k-nearest neighbors for customer preference reasoning analysis, Journal of Intelligent Manufacturing, 20(5), 569-579.

Liu, Wei & Chawla, Sanjay. (2011). Class Condence Weighted kNN Algorithms for Imbalanced Data Sets. PAKDD”11 Proceedings of the 15th PAKDD, V(II): 345-356.

Lumini, A. & Nanni, L. (2006). A clustering method for automatic biometric template selection, Pattern Recognition, 39(3), 495-497.

Mccallum, A., & Nigam, K. (1998). A comparison of event models for naive bayes text classification, IN AAAI-98 WORKSHOP ON LEARNING FOR TEXT CATEGORIZATION, 62(2), 41-48.

Milli, M. & Bulut, H. (2017). The effect of neighborhood selection on collaborative filtering and a novel hybrid algorithm, Intelligent Automation and Soft Computing, 23(2): 261-269.

Mollineda, R. A., Ferri, F. J., and Vidal, E. (2002). An efficient prototype merging strategy for the condensed 1-nn rule through class-conditional hierarchical clustering, Pattern Recognition, 35(12), 2771-2782.

Olvera-López, J. A., Carrasco-Ochoa, J. A., and Martínez-Trinidad, J. F. (2010). A new fast prototype selection method based on clustering, Pattern Analysis and Applications, 13:131–141.

Platt, John C. (2013). Using analytic QP and sparseness to speed training of support vector machines, Advances in neural information processing systems, 557-563.

Raicharoen, T., Lursinsap, C., and Lin, F. (2005). A divide-and-conquer approach to the pairwise opposite class-nearest neighbor (POC-NN) algorithm for regression problem, Pattern Recognition Letters, 26(10), 1554-1567.

Rico-Juan, J. R., & I-esta, J. M. (2012). New rank methods for reducing the size of the training set using the nearest neighbor rule, Pattern Recognition Letters, 33(10), 1434-1434.

Sample, Neal, Matthew, Haines, Mark, Arnold, and Purcell, Timothy. (2001). Optimizing Search Strategies in k-d Trees, 5th WSES/IEEE World Multiconference on (CSCC 2001).

Wang, J., Neskovic, P., and Cooper, L. N. (2006). Neighborhood size selection in the k-nearest-neighbor rule using statistical confidence, Pattern Recognition, 39(3), 417-423.

Xu, Y., Shen, F., and Zhao, J. (2012). An incremental learning vector quantization algorithm for pattern classification, Neural Computing & Applications, 21(6), 1205-1215.

Zeng, Yong, Yang, Yupu, and Zhao, Liang. (2009). Nonparametric classification based on local mean and class statistics, Expert Systems with Applications, 36(4): 8443-8448.

Zeng, Yong, Yang, Yupu, and Zhao, Liang. (2009). Pseudo nearest neighbor rule for pattern classification, Expert Systems with Applications, 36(2): 3587-3595.

Zhang, B. & Srihari, S. N.(2004). Fast k-nearest neighbor classification using cluster-based trees, IEEE Transactions on Pattern Analysis and Machine Intelligence, 26(4): 525-528.

Zheng, B., Lee, W. C., and Lee, D. L. (2003). Search K Nearest Neighbors on Air, Mobile Data Management, Springer Berlin Heidelberg, 2003:181-195.

Zou, T., Wang, Y., Wei, X., Li, Z., and Yang, G. (2014). An effective collaborative filtering via enhanced similarity and probability interval prediction, Intelligent Automation and Soft Computing, 20(4), 555-566.


ISSN PRINT: 1079-8587
ISSN ONLINE: 2326-005X
DOI PREFIX: 10.31209
10.1080/10798587 with T&F
IMPACT FACTOR: 0.652 (2017/2018)
Journal: 1995-Present

SCImago Journal & Country Rank


TSI Press
18015 Bullis Hill
San Antonio, TX 78258 USA
PH: 210 479 1022
FAX: 210 479 1048