Document
A Comparative Study of Some Machine Learning Algorithms for Breast Tumours Classification
Linked Agent
Zeki, Ahmed M., Thesis advisor
Date Issued
2022
Language
English
Extent
[1], 13, 159, 3,[1] pages
Place of institution
Sakhir, Bahrain
Thesis Type
Thesis (Master)
Institution
University of Bahrain, College of Science, Department of Postgraduate Programs
English Abstract
Abstract:
Breast cancer is the most common cancer among women in the US and the second
leading cause of cancer death among them. Breast tumour diagnosis distinguishes
benign from malignant breast tumours. Machine learning techniques have
revolutionized the process of breast cancer diagnosis. Accurate diagnosis of
breast tumours can potentially reduce the mortality rate and increase the
chances of successful treatment. Hence, breast cancer diagnosis falls within
the scope of the widely discussed classification problems.
This comparative study aims to compare and evaluate the performance of four
machine learning algorithms, namely Support Vector Classifier (SVC),
K-Nearest Neighbour, Decision Tree Classifier, and Gaussian Naïve Bayes, for
breast tumour classification, and to identify the most accurate algorithm and
recommend its use in medical disease classification cases. The Wisconsin Diagnosis
Cancer dataset was used to train and test these models. Furthermore,
hyperparameter tuning is discussed in this work due to its strong influence on
the effectiveness of the learning process. The 10-Fold Cross-Validation method is
implemented to estimate the test error of each model.
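The comparison described above could be set up roughly as in the following sketch. It is an illustration only, not the thesis's actual code: it assumes scikit-learn is used, that the library's bundled `load_breast_cancer` data stands in for the Wisconsin dataset, and that each model keeps its default hyperparameters (no tuning is shown). The feature scaling step and the `random_state` values are likewise assumptions made for this example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's bundled copy of the Wisconsin diagnostic data.
X, y = load_breast_cancer(return_X_y=True)

# The four classifiers named in the abstract, with default hyperparameters.
# Scaling is added for the distance- and margin-based models (an assumption).
models = {
    "SVC": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "GaussianNB": GaussianNB(),
}

# 10-fold cross-validation; stratification keeps the class ratio in each fold.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

mean_accuracy = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    mean_accuracy[name] = scores.mean()
    print(f"{name}: mean accuracy {scores.mean():.4f} (std {scores.std():.4f})")
```

Wrapping the scaler and the classifier in one pipeline means the scaler is refit on the training folds only, so no information from the held-out fold leaks into the estimate of the test error.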
The results of this analysis demonstrate a comprehensive trade-off between
these models and provide a detailed evaluation of the models in terms of
accuracy, precision, sensitivity, specificity, area under the receiver operating
characteristic curve (ROC-AUC), area under the precision-recall curve (PR-AUC),
error rate, mean squared error, and mean absolute error. The experimental results
showed that the SVC outperformed the others, achieving 98.21% for accuracy,
F-measure, and sensitivity, 97.29% for specificity, 98.65% for ROC-AUC, and
99.9% for PR-AUC. This study can help in building more effective and reliable
disease classification and diagnostic systems, which will contribute towards
developing a better healthcare system by reducing overall cost, time, and mortality rate.
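Several of the evaluation measures listed above follow directly from the four confusion-matrix counts. The sketch below shows one way to compute them in plain Python. The function names and the toy labels are hypothetical, chosen only for illustration, and the numbers are not results from the thesis. Label 1 is assumed to denote the positive (malignant) class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Derive common metrics from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # also called recall
    specificity = tn / (tn + fp) if tn + fp else 0.0
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "error_rate": 1 - accuracy,
    }


# Hypothetical toy labels (1 = malignant, 0 = benign), for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

When predictions are hard 0/1 labels, the mean squared error and the mean absolute error both reduce to the error rate, since every per-sample error is either 0 or 1.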
Identifier
https://digitalrepository.uob.edu.bh/id/9668f593-c7bf-4f33-b27c-6691c110c582