Comparison of Error Rate Prediction Methods in Classification Modeling with Classification and Regression Tree (CART) Methods for Balanced Data

Authors

  • Fitria Panca Ramadhani, Universitas Negeri Padang
  • Dodi Vionanda
  • Syafriandi Syafriandi
  • Admi Salma

DOI:

https://doi.org/10.24036/ujsds/vol1-iss4/73

Abstract

CART (Classification and Regression Tree) is a classification algorithm in the decision tree family. The model it builds is a tree consisting of a root node, internal nodes, and terminal nodes. Once the model has been built, its accuracy must be assessed in order to evaluate its performance. This can be done by estimating the model's prediction error rate. Error rate prediction methods work by dividing the data into training data and testing data. Three such methods are considered here: Leave One Out Cross Validation (LOOCV), Hold Out (HO), and K-Fold Cross Validation. These methods divide the data into training and testing sets in different ways, so each has its own advantages and disadvantages. The three methods were therefore compared to determine which is most appropriate for the CART algorithm. The comparison considered several factors, including variations in the mean, the number of variables, and the correlations in normally distributed random data. The results were assessed using boxplots, looking for the lowest median error rate and the lowest variance. The results indicate that K-Fold Cross Validation yields the lowest median error rate and the lowest variance, so it is the most suitable error rate prediction method for CART.
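The three data-splitting schemes described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the small Gini-based tree (a simplified stand-in for a full CART implementation, without pruning), the 30% hold-out fraction, k = 5, and the simulation settings are all assumptions chosen for illustration. The simulated variables are independent, so the correlation factor studied in the paper is not modelled here.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_tree(X, y, depth=0, max_depth=3, min_size=5):
    """Grow a small CART-style tree: a leaf is a label, a split is a tuple."""
    majority = Counter(y).most_common(1)[0][0]
    if depth >= max_depth or len(y) < min_size or len(set(y)) == 1:
        return majority  # terminal node
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            left = [i for i, row in enumerate(X) if row[j] <= t]
            right = [i for i, row in enumerate(X) if row[j] > t]
            if not left or not right:
                continue
            # Weighted Gini impurity of the two child nodes
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right])) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t, left, right)
    if best is None:
        return majority
    _, j, t, left, right = best
    return (j, t,
            fit_tree([X[i] for i in left], [y[i] for i in left],
                     depth + 1, max_depth, min_size),
            fit_tree([X[i] for i in right], [y[i] for i in right],
                     depth + 1, max_depth, min_size))

def predict(tree, row):
    """Follow splits from the root until a terminal node is reached."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def error_rate(X, y, train_idx, test_idx):
    """Fit on the training indices, return the misclassification rate on the test indices."""
    tree = fit_tree([X[i] for i in train_idx], [y[i] for i in train_idx])
    wrong = sum(predict(tree, X[i]) != y[i] for i in test_idx)
    return wrong / len(test_idx)

def hold_out(X, y, rng, test_frac=0.3):
    """Hold Out: a single random split into training and testing data."""
    idx = list(range(len(y)))
    rng.shuffle(idx)
    cut = int(len(idx) * test_frac)
    return error_rate(X, y, idx[cut:], idx[:cut])

def k_fold(X, y, rng, k=5):
    """K-Fold CV: each fold serves once as testing data; fold errors are averaged."""
    idx = list(range(len(y)))
    rng.shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    errs = []
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        errs.append(error_rate(X, y, train, folds[f]))
    return sum(errs) / k

def loocv(X, y):
    """LOOCV: each observation is the testing set exactly once."""
    n = len(y)
    return sum(error_rate(X, y, [i for i in range(n) if i != j], [j])
               for j in range(n)) / n

def simulate(n_per_class, n_vars, mean_shift, rng):
    """Balanced two-class data; independent normal variables whose means differ by mean_shift."""
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(label * mean_shift, 1.0) for _ in range(n_vars)])
            y.append(label)
    return X, y

rng = random.Random(0)
X, y = simulate(n_per_class=30, n_vars=2, mean_shift=1.5, rng=rng)
estimates = {
    "HO": hold_out(X, y, rng),
    "5-fold CV": k_fold(X, y, rng),
    "LOOCV": loocv(X, y),
}
for name, err in estimates.items():
    print(f"{name}: {err:.3f}")
```

A single run like this gives only one estimate per method. To compare medians and variances with boxplots, as the abstract describes, the simulation would have to be repeated many times for each combination of settings.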

Published

2023-08-28

How to Cite

Fitria Panca Ramadhani, Dodi Vionanda, Syafriandi Syafriandi, & Admi Salma. (2023). Comparison of Error Rate Prediction Methods in Classification Modeling with Classification and Regression Tree (CART) Methods for Balanced Data. UNP Journal of Statistics and Data Science, 1(4), 271–279. https://doi.org/10.24036/ujsds/vol1-iss4/73
