Autosoft Journal



An Intelligent Incremental Filtering Feature Selection and Clustering Algorithm for Effective Classification


Authors



Abstract

We are witnessing the era of big data computing, in which computing resources are becoming the main bottleneck in dealing with large datasets. For high-dimensional data, where each view of the data is of high dimensionality, feature selection is necessary to further improve clustering and classification results. In this paper, we propose a new feature selection method, the Incremental Filtering Feature Selection (IF²S) algorithm, and a new clustering algorithm, the Temporal Interval based Fuzzy Minimal Clustering (TIFMC) algorithm. These employ the Fuzzy Rough Set to select an optimal subset of features and to group large volumes of data effectively, respectively. An extensive experimental comparison of the proposed method with other methods is carried out using four different classifiers. The proposed algorithms yield promising results in feature selection, clustering, and classification accuracy in the field of biomedical data mining.


Keywords


Pages

Total Pages: 9

DOI
10.1080/10798587.2017.1307626



Published

Online Article


JOURNAL INFORMATION


ISSN PRINT: 1079-8587
ISSN ONLINE: 2326-005X
DOI PREFIX: 10.31209 (10.1080/10798587 with T&F)
IMPACT FACTOR: 0.652 (2017/2018)
Journal: 1995-Present



