Comparison of Error Rate Prediction Methods in Classification Modeling with Classification and Regression Tree (CART) Methods for Balanced Data

Authors

  • Fitria Panca Ramadhani, Universitas Negeri Padang
  • Dodi Vionanda
  • Syafriandi Syafriandi
  • Admi Salma

DOI:

https://doi.org/10.24036/ujsds/vol1-iss4/73

Abstract

Classification and Regression Tree (CART) is a decision tree algorithm for classification. A CART model is a tree made up of a root node, internal nodes, and terminal nodes. Once the model is built, its performance is assessed by estimating its prediction error rate. Error rate prediction methods work by dividing the data into training data and testing data. Three such methods are considered here: Leave One Out Cross Validation (LOOCV), Hold Out (HO), and K-Fold Cross Validation. They differ in how they divide the data, so each has its own advantages and disadvantages. This study therefore compares the three methods to determine which is most appropriate for the CART algorithm. The comparison uses normally distributed random data and varies several factors, including the mean, the number of variables, and the correlation between variables. The results are examined with boxplots, looking for the method with the lowest median error rate and the lowest variance. K-Fold Cross Validation has both the lowest median error rate and the lowest variance, which makes it the most suitable error rate prediction method for CART.
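
To make the three data-splitting schemes concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It stands in a small hand-written Gini-based tree for CART and applies Hold Out, K-Fold, and LOOCV to one simulated balanced data set. The sample size (30 per class), the mean shift of 2.0, the 30% hold-out fraction, k = 5, and the depth limit are all assumptions chosen for illustration, and the simulated variables are uncorrelated, so the correlation factor studied in the paper is not reproduced here.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def build_tree(X, y, depth=0, max_depth=3, min_size=5):
    """Grow a small CART-style tree; a leaf is the majority class label."""
    majority = Counter(y).most_common(1)[0][0]
    if depth >= max_depth or len(y) < min_size or len(set(y)) == 1:
        return majority
    best = None  # (weighted impurity, variable index, threshold)
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:
        return majority
    _, j, t = best
    li = [i for i, row in enumerate(X) if row[j] <= t]
    ri = [i for i, row in enumerate(X) if row[j] > t]
    return (j, t,
            build_tree([X[i] for i in li], [y[i] for i in li], depth + 1, max_depth, min_size),
            build_tree([X[i] for i in ri], [y[i] for i in ri], depth + 1, max_depth, min_size))

def predict(tree, row):
    """Walk from the root to a terminal node and return its class."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def error_rate(X, y, train_idx, test_idx):
    """Fit on the training indices, return the misclassification rate on the test indices."""
    tree = build_tree([X[i] for i in train_idx], [y[i] for i in train_idx])
    return sum(predict(tree, X[i]) != y[i] for i in test_idx) / len(test_idx)

def hold_out(X, y, rng, test_frac=0.3):
    """Hold Out: one random split into training and testing data."""
    idx = list(range(len(y)))
    rng.shuffle(idx)
    cut = int(len(idx) * test_frac)
    return error_rate(X, y, idx[cut:], idx[:cut])

def k_fold(X, y, rng, k=5):
    """K-Fold: each fold is the testing data once; the k error rates are averaged."""
    idx = list(range(len(y)))
    rng.shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    errs = []
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        errs.append(error_rate(X, y, train, folds[f]))
    return sum(errs) / k

def loocv(X, y):
    """LOOCV: each observation is the testing data once."""
    n = len(y)
    return sum(error_rate(X, y, [i for i in range(n) if i != j], [j]) for j in range(n)) / n

def make_balanced(n_per_class, n_vars, mean_shift, rng):
    """Two equally sized classes of independent normal variables (illustrative assumption)."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * mean_shift, 1.0) for _ in range(n_vars)])
            y.append(label)
    return X, y

rng = random.Random(0)
X, y = make_balanced(30, 2, 2.0, rng)
estimates = {"Hold Out": hold_out(X, y, rng),
             "5-Fold": k_fold(X, y, rng),
             "LOOCV": loocv(X, y)}
for name, err in estimates.items():
    print(f"{name}: {err:.3f}")
```

A single run like this yields one error estimate per method. A comparison of medians and variances of the kind described in the abstract would repeat the simulation many times and summarize the resulting distributions, for example with boxplots.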

Published

2023-08-28

How to Cite

Fitria Panca Ramadhani, Dodi Vionanda, Syafriandi Syafriandi, & Admi Salma. (2023). Comparison of Error Rate Prediction Methods in Classification Modeling with Classification and Regression Tree (CART) Methods for Balanced Data. UNP Journal of Statistics and Data Science, 1(4), 271–279. https://doi.org/10.24036/ujsds/vol1-iss4/73
