Comparison of Error Rate Prediction Methods in Binary Logistic Regression Modeling for Imbalanced Data
DOI:
https://doi.org/10.24036/ujsds/vol1-iss4/86
Keywords:
Binary Logistic Regression, Hold Out, Imbalanced Data, K-fold Cross Validation, Leave One Out
Abstract
Binary logistic regression is a regression method used for classification modeling. Its performance can be judged by the accuracy of the fitted model, and accuracy can be measured by estimating the prediction error rate. A commonly used approach to error rate prediction is cross-validation, which has three algorithms: leave one out, hold out, and k-fold. Leave one out divides the data according to the number of observations, so every observation serves once as testing data, but the analysis takes a long time when the number of observations is large. Hold out is the simplest algorithm: it randomly divides the data into just two parts, so important observations may not end up in the training data. K-fold divides the data into several groups, but it is not suitable for data with a small number of observations. Real data are often imbalanced. In logistic regression, as the data become more imbalanced, the prediction results approach the number of minority classes. This research compares error rate prediction methods in binary logistic regression modeling with imbalanced data. The study uses three types of data, namely univariate, bivariate, and multivariate, generated with differences in population mean and in the correlation between independent variables. The results show that k-fold is the most suitable error rate prediction algorithm for binary logistic regression.
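The three cross-validation algorithms described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the sample sizes (72 majority and 8 minority observations), the population means (0 and 2), the 30% hold-out fraction, k = 5, and the plain gradient-descent fit are all assumptions chosen for the example, and every function name is hypothetical.

```python
import math
import random


def fit_logistic(xs, ys, lr=0.1, epochs=200):
    """Approximately fit P(y=1) = sigmoid(b0 + b1*x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += p - y
            g1 += (p - y) * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def predict(model, x):
    """Classify as 1 when the fitted probability exceeds 0.5."""
    b0, b1 = model
    return 1 if b0 + b1 * x > 0 else 0


def holdout_split(n, test_frac, rng):
    """Hold out: one random split into a training part and a testing part."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[:round(n * test_frac)]]


def kfold_split(n, k, rng):
    """K-fold: shuffle, then deal the indices into k disjoint groups."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def loo_split(n):
    """Leave one out: each observation is the testing data exactly once."""
    return [[i] for i in range(n)]


def cv_error(xs, ys, test_sets):
    """Pooled error rate: misclassified test points / total test points."""
    wrong = total = 0
    for test in test_sets:
        held = set(test)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit_logistic([xs[i] for i in train], [ys[i] for i in train])
        wrong += sum(predict(model, xs[i]) != ys[i] for i in test)
        total += len(test)
    return wrong / total


# Assumed univariate imbalanced data: the classes differ in population mean.
rng = random.Random(0)
xs = [rng.gauss(0, 1) for _ in range(72)] + [rng.gauss(2, 1) for _ in range(8)]
ys = [0] * 72 + [1] * 8
n = len(xs)

estimates = {
    "hold out": cv_error(xs, ys, holdout_split(n, 0.3, rng)),
    "5-fold": cv_error(xs, ys, kfold_split(n, 5, rng)),
    "leave one out": cv_error(xs, ys, loo_split(n)),
}
for name, err in estimates.items():
    print(f"{name}: {err:.3f}")
```

Leave one out refits the model once per observation, which is why its cost grows quickly with sample size, while hold out fits only once and depends on a single random split. The printed values depend on the seed and the assumed settings, so they should not be read as reproducing the study's results.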
License
Copyright (c) 2023 Bahri Annur Sinaga, Dodi Vionanda, Dony Permana, Admi Salma
This work is licensed under a Creative Commons Attribution 4.0 International License.