TY  - JOUR
T1  - Improved Performance of Support Vector Machine for Imbalanced Data Sets Using Oversampling and Optimization
AU  - Saeed, Sana
AU  - Choon Ong, Hong
JO  - Journal of Engineering and Applied Sciences
VL  - 13
IS  - 21
SP  - 9065
EP  - 9077
PY  - 2018
DA  - 2001/08/19
SN  - 1816-949x
DO  - jeasci.2018.9065.9077
UR  - https://makhillpublications.co/view-article.php?doi=jeasci.2018.9065.9077
KW  - Support vector machines
KW  - oversampling
KW  - optimization algorithm
KW  - noisy borderline imbalanced data sets
KW  - real imbalanced data sets
KW  - proposed methodology
AB  - Classification of imbalanced data sets, particularly in the presence of noise, is a significant problem in machine learning and data mining. Support Vector Machine (SVM) is one of the most renowned supervised classification algorithms. However, its performance becomes limited for imbalanced data sets. To improve the performance of SVM for imbalanced data sets, including noisy borderline and real data sets, a methodology based on oversampling and an optimization algorithm is proposed for two-class classification problems. The performance of SVM is improved by generating synthetic samples in the minority class and searching for the best choices of the SVM parameters by minimizing the objective function. To confirm the validity of the proposed methodology, an experimental study including noisy borderline and real imbalanced data sets was conducted. SVM was applied to all the data sets using the proposed methodology, two optimization algorithms, and one oversampling algorithm. The performance of SVM with all methods was evaluated using sensitivity, G-mean, and F-measure. A significantly improved performance of SVM was observed with the proposed methodology.
ER  - 