Comparing Classification and Regression Tree and Logistic Regression Algorithms Using 5×2cv Combined F-Test on Diabetes Mellitus Dataset


  • Fashihullisan, Universitas Negeri Padang
  • Dodi Vionanda
  • Yenni Kurniawati
  • Fadhilah Fitri



Classification is the process of finding a model that describes and distinguishes data classes, with the aim of predicting the class of objects whose class labels are unknown. Classification algorithms include classification and regression trees (CART) and logistic regression. The k-fold cross-validation method has a weakness when used to compare algorithms: different folds can produce different prediction errors, so the outcome of a performance comparison can also differ. Therefore, to compare the algorithms, this study applies the 5×2cv t-test and the 5×2cv combined F-test. Out of 100 iterations, the 10-fold cross-validation method was consistent only three times, which shows that k-fold cross-validation has poor consistency in comparing CART and logistic regression on the diabetes mellitus data. In addition, the results show that the 5×2cv combined F-test is better suited than the 5×2cv t-test for drawing a conclusion from the comparison of the two algorithms, because it produces only one decision. In contrast, the 5×2cv t-test can yield different decisions across its 10 possible test statistics, which makes it difficult for researchers to draw a conclusion when comparing CART and logistic regression.
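The contrast between one F decision and ten possible t decisions can be illustrated with a minimal sketch. It assumes the usual definitions of the two tests: each of the 5 replications of 2-fold cross-validation yields two error-rate differences between the algorithms, the t statistic uses a single difference as its numerator, and the combined F statistic pools all ten. The function names and the sample differences are hypothetical and are not taken from the article; the critical values are the standard 5% table values for t(5) and F(10, 5).

```python
from math import sqrt

# Standard 5% critical values from statistical tables (rounded).
T_CRIT_5 = 2.571      # two-sided, t distribution with 5 degrees of freedom
F_CRIT_10_5 = 4.74    # upper tail, F distribution with (10, 5) degrees of freedom


def replication_variances(diffs):
    """Variance estimate s_i^2 for each replication.

    diffs is a list of (p1, p2) pairs: the difference in error rates
    between the two algorithms on fold 1 and fold 2 of one replication.
    """
    variances = []
    for p1, p2 in diffs:
        mean = (p1 + p2) / 2
        variances.append((p1 - mean) ** 2 + (p2 - mean) ** 2)
    return variances


def all_t_statistics(diffs):
    """One t statistic per difference: 10 values for a 5x2cv design."""
    s2 = replication_variances(diffs)
    denom = sqrt(sum(s2) / len(diffs))
    return [p / denom for pair in diffs for p in pair]


def combined_f_statistic(diffs):
    """Single combined F statistic that pools all differences."""
    s2 = replication_variances(diffs)
    numerator = sum(p1 ** 2 + p2 ** 2 for p1, p2 in diffs)
    return numerator / (2 * sum(s2))


if __name__ == "__main__":
    # Hypothetical error-rate differences (algorithm A minus algorithm B).
    diffs = [(0.02, 0.05), (0.04, 0.01), (0.03, 0.06), (0.00, 0.04), (0.05, 0.02)]

    t_decisions = [abs(t) > T_CRIT_5 for t in all_t_statistics(diffs)]
    f_decision = combined_f_statistic(diffs) > F_CRIT_10_5

    print("t-test rejections out of 10:", sum(t_decisions))
    print("combined F-test rejects:", f_decision)
```

Both functions divide by the summed variance, so they fail with a division by zero if every replication has identical differences on its two folds. In practice the differences would come from fitting both classifiers on each half of the data and testing on the other half, which this sketch leaves out.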




How to Cite

Fashihullisan, Dodi Vionanda, Yenni Kurniawati, & Fadhilah Fitri. (2023). Comparing Classification and Regression Tree and Logistic Regression Algorithms Using 5×2cv Combined F-Test on Diabetes Mellitus Dataset. UNP Journal of Statistics and Data Science, 1(4), 344–352.
