Comparison of the C5.0 Algorithm and the CART Algorithm in Stroke Classification

Authors

  • Indah Lestari, Universitas Negeri Padang
  • Dina Fitria
  • Syafriandi Syafriandi
  • Admi Salma

DOI:

https://doi.org/10.24036/ujsds/vol2-iss1/144

Keywords:

C5.0 algorithm, CART algorithm, confusion matrix, stroke

Abstract

The C5.0 and CART algorithms are similar in speed and in their handling of categorical and numeric data. They differ, however, in tree structure and in the response variables they support: the CART algorithm is binary and classifies categorical, numerical, and continuous response variables, producing classification and regression trees, whereas the C5.0 algorithm is non-binary and classifies categorical response variables, producing a classification tree. This research classifies Kaggle's Stroke Prediction Dataset to identify the variables that most influence the risk of stroke and to compare the classification accuracy of the two algorithms. The results show that the CART algorithm has higher accuracy and precision than C5.0, but lower recall. For CART and C5.0 respectively, accuracy is 77.9% and 77.5%, precision is 89.5% and 83.2%, and recall is 67% and 71.4%. Overall, it can be concluded that there is no difference in classification between the two algorithms. In addition, in CART three variables most influence stroke risk: age, BMI, and average blood glucose level. In C5.0, only two variables are most influential: age and average blood glucose level.
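
The accuracy, precision, and recall values reported above are all derived from a confusion matrix. As a minimal sketch of that relationship, the snippet below computes the three metrics from the four cells of a binary confusion matrix. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, and recall from binary confusion-matrix counts.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of actual positives that were predicted positive.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts for illustration only (not the study's data).
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
print(metrics)
```

A model can score higher on precision while scoring lower on recall, as the abstract reports for CART relative to C5.0, because precision penalizes false positives while recall penalizes false negatives.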

Published

2024-02-25

How to Cite

Indah Lestari, Dina Fitria, Syafriandi Syafriandi, & Admi Salma. (2024). Comparison of the C5.0 Algorithm and the CART Algorithm in Stroke Classification. UNP Journal of Statistics and Data Science, 2(1), 90–98. https://doi.org/10.24036/ujsds/vol2-iss1/144
