Liver Disease Detection using Machine Learning Algorithms

Linked Agent
Hammad, Mustafa, Thesis advisor
Alqaddoumi, Abdulla, Thesis advisor
Language
English
Extent
11, 70, [7] pages.
Place of institution
Sakhir, Bahrain
Thesis Type
Thesis (Master)
Institution
University of Bahrain, College of Science, Department of Postgraduate Programs
Description
Abstract
Liver disease is a global health crisis that causes high mortality every year. It affects patients’ lives and is costly to treat in most countries. Other scientific fields can help the health sector reduce the effects of some diseases. In particular, Machine Learning (ML) can be used in health care systems to diagnose diseases and support decision making, which can save patients’ lives, improve services, and reduce costs. This research was conducted to help the health care field by exploring and diagnosing liver disease through the potential of ML. The main goal of this study is to detect liver disease using ML algorithms. The dataset used in this research is the Indian Liver Patient Dataset (ILPD), which was obtained from a public repository. Both supervised and unsupervised ML algorithms were implemented in this work.
The supervised models are IBK, Random Forest, and Adaptive Boosting, together with ensembles of them (voting and stacking). The unsupervised models are K-Means, DBSCAN, Gaussian Mixture Model, and Expectation Maximization. Classification models were compared using accuracy, recall, precision, ROC, and RMSE, while clustering models were compared using the Silhouette Coefficient and the Adjusted Rand Index. For feature selection, this work applied the InfoGainAttributeEval and CorrelationAttributeEval techniques with the Ranker search method. This research also used five balancing techniques, including ADASYN, SMOTETomek, Random Oversampling, and SMOTEENN.
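As an illustration of the clustering measurements named above, the following is a minimal sketch of the Adjusted Rand Index in plain Python. It is not taken from the thesis; the function name `adjusted_rand_index` and the example labels are assumptions made for illustration only. The index compares a clustering against known class labels by counting pairs of samples that are grouped together in both, corrected for chance agreement.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same samples.

    Hypothetical helper for illustration; not the thesis implementation.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same samples")
    n = len(labels_true)
    if n < 2:
        # No pairs to compare; treat the labelings as trivially identical.
        return 1.0

    # Contingency counts: samples sharing a (true class, cluster) pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of sample pairs grouped together in each view.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / comb(n, 2)  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        # Degenerate case, e.g. every sample in one group on both sides.
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)


# Made-up labels: cluster ids differ from class ids, but the grouping is the same.
score = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
```

A score of 1 means the clustering reproduces the class grouping exactly, while values near 0 indicate agreement no better than chance. The index ignores the cluster ids themselves, which is why it suits comparing unsupervised output with patient/non-patient labels.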
For supervised learning, the results show that the stacking technique outperforms the other models in detecting positive cases of liver disease, scoring 100% with the random oversampling technique. In addition, the IBK model obtained 99.6% in detecting negative cases with the SMOTEENN technique. For unsupervised learning, the Gaussian Mixture Model and Expectation Maximization perform well in detecting negative cases. This research also found that balancing techniques improve model performance.
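To make the balancing step concrete, here is a minimal sketch of random oversampling, the simplest of the balancing techniques listed in the abstract. It is an assumption-laden illustration rather than the thesis code: the function name `random_oversample`, the `seed` parameter, and the toy rows are hypothetical. The idea is to duplicate randomly chosen rows of each smaller class until every class has as many rows as the largest one.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate random rows of smaller classes until all classes match the largest.

    Hypothetical helper for illustration; not the thesis implementation.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = Counter(labels)
    target = max(counts.values())

    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        # Candidate rows to duplicate for this class (original rows only).
        pool = [row for row, label in zip(rows, labels) if label == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Made-up toy data: four rows of class 1 and one row of class 0.
toy_rows = [[1.0], [2.0], [3.0], [4.0], [5.0]]
toy_labels = [1, 1, 1, 1, 0]
balanced_rows, balanced_labels = random_oversample(toy_rows, toy_labels)
```

In general practice, oversampling is applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.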
Feature selection also helps reduce the number of medical tests required. However, the results showed that the voting technique has a limited impact on improving accuracy. Previous works used a range of ML algorithms: LR, NB, SVM, IBK, RF, ANN, DT, Boosting, Bagging, Stacking, Voting, K-Means, and DBSCAN. This study proposes stacking models with a random oversampling technique for detecting liver disease, which could help physicians and health care systems by achieving a high detection rate for positive cases. That would help save patients’ lives, reduce costs, and improve services in health care systems. These techniques can also increase the performance of weak classifiers (IBK, RF, and AdaBoost). Moreover, the proposed models outperform previous works in the literature in detecting positive cases of liver disease.
Note
Thesis (Master)-University of Bahrain, College of Science, Department of Postgraduate Programs, 2022
Identifier
https://digitalrepository.uob.edu.bh/id/a570193f-f780-4750-8a9f-6b40acf527a3