A modified adaptive synthetic SMOTE approach in graduation success rate classification
Abstract
In applied research, oversampling methods are used during data preprocessing to address the problem of imbalanced data. Class imbalance can weaken the ability of classification algorithms to identify instances of interest, leading to misclassification such as false positives. Imbalanced datasets arise in finance, health, education, and other fields; academic data such as graduation success rates in higher education are at times imbalanced. One of the established oversampling methods is the Synthetic Minority Oversampling Technique (SMOTE), with Adaptive Synthetic (Adasyn) SMOTE as one of its many variations. Adasyn embeds a K-Nearest Neighbors (KNN) calculation that uses Euclidean distance; in this study, Manhattan distance is used in the KNN calculation instead. The researchers gathered actual data from the open admission programs of Davao del Norte State College for training and testing, consisting of 14 features and 897 records. The modified Adasyn was tested on this imbalanced primary dataset on graduation success rate using logistic regression and random forest as the classification algorithms, and was evaluated in terms of overall accuracy, precision, recall, and F1 score. Results showed that the modified Adasyn outperformed both SMOTE and Adasyn on every performance metric, indicating that the modified Adasyn is reliable in reducing misclassification on the graduation success rate dataset.
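The core modification, substituting Manhattan distance for Euclidean distance in Adasyn's nearest-neighbor search, can be sketched with the imbalanced-learn library, which accepts a custom neighbors estimator in place of the default KNN. This is a minimal illustration, not the authors' implementation: the synthetic dataset below only mimics the paper's reported shape (897 records, 14 features), and the neighbor count and random seeds are assumed values.

```python
# A minimal sketch, assuming imbalanced-learn and scikit-learn are installed.
# The data here is synthetic; the paper's actual college dataset is not reproduced.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from imblearn.over_sampling import ADASYN

# Illustrative imbalanced dataset shaped like the paper's (897 rows, 14 features);
# the 85/15 class split is an assumption, not the study's actual imbalance ratio.
X, y = make_classification(n_samples=897, n_features=14,
                           weights=[0.85, 0.15], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42)

# Modified Adasyn: swap the default Euclidean KNN for a Manhattan-distance
# neighbors estimator. ADASYN's n_neighbors parameter accepts any
# KNeighborsMixin object, so the distance metric can be changed this way.
manhattan_knn = NearestNeighbors(n_neighbors=5, metric="manhattan")
adasyn_manhattan = ADASYN(n_neighbors=manhattan_knn, random_state=42)
X_res, y_res = adasyn_manhattan.fit_resample(X_train, y_train)

# Evaluate with the same two classifiers the study used; the report covers
# the paper's metrics (accuracy, precision, recall, F1 score).
for clf in (LogisticRegression(max_iter=1000),
            RandomForestClassifier(random_state=42)):
    clf.fit(X_res, y_res)
    print(type(clf).__name__)
    print(classification_report(y_test, clf.predict(X_test)))
```

Running the baseline comparison from the paper would amount to repeating the resampling step with plain `SMOTE()` and default `ADASYN()` from the same library and comparing the resulting reports.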
Recommended Citation
Gameng, H., Gerardo, B., & Medina, R. (2019). A modified adaptive synthetic SMOTE approach in graduation success rate classification.
Type
Article
ISSN
2278-3091
Keywords
Adaptive synthetic SMOTE; Classification; Graduate success rate; Manhattan distance; SMOTE; Adaptive synthetic; K-nearest neighbor; Classification algorithms; Random forest; F1 scores; Data preprocessing; Imbalanced datasets; Precision; Synthetic minority oversampling technique; Euclidean distance; Adasyn; Logistic regression; Recall