Comparison of Error Rate Prediction Methods of C4.5 Algorithm for Imbalanced Data
DOI: https://doi.org/10.24036/ujsds/vol1-iss4/89
Keywords: C4.5 Algorithm, Error Rate Prediction Methods, Imbalanced Data
Abstract
A classification model can be built with the C4.5 algorithm. The prediction accuracy of such a model must then be assessed with an error rate prediction method. Imbalanced data increases the classification error of the C4.5 algorithm because the predictions do not represent the entire data set, and it also degrades the performance of the error rate prediction method. In addition, data with different correlations are examined to determine whether correlation affects the performance of the error rate prediction method. The purpose of this research is to identify the most suitable error rate prediction method for the C4.5 algorithm on imbalanced data and to assess the influence of different correlations. The results show that K-fold cross-validation (K-Fold CV) is more suitable for the C4.5 algorithm on imbalanced data than the hold-out (HO) and leave-one-out cross-validation (LOOCV) methods. In addition, high correlation can worsen the performance of error rate prediction methods.
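The three error rate prediction methods named in the abstract (HO, K-Fold CV, LOOCV) differ only in how the data are split into training and test sets. The following is a minimal sketch of that difference, not the authors' procedure. Everything in it is an assumption made for illustration: the toy one-feature data with a 90:10 class ratio, the function names, the 30% hold-out fraction, K = 5, and above all the classifier, which is a single-split threshold rule used as a simplified stand-in for C4.5.

```python
import random


def make_imbalanced(n_major=90, n_minor=10, seed=0):
    """Toy one-feature data (assumed): class 0 is the majority, class 1 the minority."""
    rng = random.Random(seed)
    X = [rng.gauss(0.0, 1.0) for _ in range(n_major)]
    X += [rng.gauss(2.0, 1.0) for _ in range(n_minor)]
    y = [0] * n_major + [1] * n_minor
    return X, y


def fit_stump(X, y):
    """Fit a one-split threshold rule. A simplified stand-in, NOT C4.5."""
    best = None
    for t in sorted(set(X)):
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != label
                         for x, label in zip(X, y))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right


def count_errors(train_idx, test_idx, X, y):
    """Train on train_idx and count misclassified observations in test_idx."""
    model = fit_stump([X[i] for i in train_idx], [y[i] for i in train_idx])
    return sum(model(X[i]) != y[i] for i in test_idx)


def holdout_error(X, y, test_fraction=0.3, seed=0):
    """HO: a single random split into training and test sets."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(test_fraction * len(idx)))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return count_errors(train_idx, test_idx, X, y) / n_test


def kfold_error(X, y, k=5, seed=0):
    """K-Fold CV: each observation is tested exactly once across k folds."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    wrong = 0
    for fold in folds:
        held_out = set(fold)
        train_idx = [i for i in idx if i not in held_out]
        wrong += count_errors(train_idx, fold, X, y)
    return wrong / len(X)


def loocv_error(X, y):
    """LOOCV: K-Fold CV with as many folds as observations."""
    return kfold_error(X, y, k=len(X))


X, y = make_imbalanced()
print(f"HO error rate:        {holdout_error(X, y):.3f}")
print(f"5-Fold CV error rate: {kfold_error(X, y, k=5):.3f}")
print(f"LOOCV error rate:     {loocv_error(X, y):.3f}")
```

The splits here are not stratified, so a hold-out test set or a fold may contain very few minority-class observations. That is one way class imbalance can make an error rate estimate unstable. Repeating the run with different seeds gives a rough sense of how much each estimate varies.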
License
Copyright (c) 2023 Yunistika Ilanda, Dodi Vionanda, Yenni Kurniawati, Dina Fitria
This work is licensed under a Creative Commons Attribution 4.0 International License.