Classification is an important problem in which the performance of a classifier degrades as the sample size decreases and the dimensionality increases. This study describes a feature subset selection framework for supervised classification problems that works efficiently with very few training samples. In the proposed algorithm, the most relevant features are selected using a filter method, and the redundancy among the features is eliminated using a correlation-based spanning tree. The proposed framework is designed to perform data analytics to extract the most influential predictors. The complexity of the algorithm is reduced drastically by processing feature subsets in parallel. The performance of the algorithm is tested against various prominent feature subset selection algorithms on 4 different datasets from the UCI repository and 2 real-world microarray datasets, where the classification accuracy of the proposed framework is better than that of the other feature selection algorithms.
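The two-stage idea in the abstract (a filter step for relevance, then a correlation-based spanning tree for redundancy) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which filter score, tree construction, or thresholds are used. The sketch assumes absolute Pearson correlation with the class label as the filter score, a minimum spanning tree with edge weight 1 - |correlation| between features, and two hypothetical thresholds, `relevance_min` and `redundancy_min`. The function name `select_features` is likewise illustrative.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_features(columns, labels, relevance_min=0.1, redundancy_min=0.8):
    """Return indices of selected features.

    columns: list of feature columns (one list of numbers per feature).
    labels:  numeric class labels, one per sample.
    Both thresholds are assumptions made for this sketch.
    """
    # Filter step (assumed score): |corr(feature, label)| must reach relevance_min.
    relevance = {j: abs(pearson(col, labels)) for j, col in enumerate(columns)}
    kept = [j for j in relevance if relevance[j] >= relevance_min]
    if not kept:
        return []

    # Edge weight between two features: small when they are strongly correlated.
    def dist(a, b):
        return 1.0 - abs(pearson(columns[a], columns[b]))

    # Prim's algorithm: minimum spanning tree over the kept features.
    in_tree = {kept[0]}
    tree_edges = []
    while len(in_tree) < len(kept):
        w, u, v = min(
            (dist(a, b), a, b) for a in in_tree for b in kept if b not in in_tree
        )
        tree_edges.append((w, u, v))
        in_tree.add(v)

    # Keep only tree edges that join strongly correlated features; the
    # resulting connected components are groups of mutually redundant features.
    parent = {j: j for j in kept}

    def find(j):
        while parent[j] != j:
            parent[j] = parent[parent[j]]
            j = parent[j]
        return j

    for w, u, v in tree_edges:
        if w <= 1.0 - redundancy_min:
            parent[find(u)] = find(v)

    # From each group, keep the single feature most relevant to the label.
    best = {}
    for j in kept:
        root = find(j)
        if root not in best or relevance[j] > relevance[best[root]]:
            best[root] = j
    return sorted(best.values())


if __name__ == "__main__":
    y = [0, 0, 0, 1, 1, 1]
    f0 = [1, 2, 3, 7, 8, 9]    # relevant
    f1 = [1, 2, 3, 7, 8, 12]   # relevant, but nearly duplicates f0
    f2 = [5, 1, 4, 4, 1, 5]    # unrelated to the label
    print(select_features([f0, f1, f2], y))
```

On this toy data the unrelated feature is dropped by the filter step and the near-duplicate is dropped by the tree step. Because each group of redundant features is handled independently once the tree is cut, the groups could in principle be processed in parallel, which is one plausible reading of the parallelism mentioned in the abstract.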
L. Kamatchi Priya, M.K. Kavitha Devi and S. Nagarajan. Improvising Classification Performance for High Dimensional and Small Sample Data Sets.
DOI: https://doi.org/10.36478/ajit.2018.261.270
URL: https://www.makhillpublications.co/view-article/1682-3915/ajit.2018.261.270