A Fully Bayesian Logistic Regression Model for Classification of ZADA Diabetes Dataset

Authors

  • Masoud M. Hassan, Dept. of Computer Science, Faculty of Science, University of Zakho, Kurdistan Region, Iraq.

DOI:

https://doi.org/10.25271/sjuoz.2020.8.3.707

Keywords:

Diabetes, Bayesian Logistic Regression, Markov Chain Monte Carlo, Classification, Informative Priors

Abstract

Classifying diabetes data with existing data mining and machine learning algorithms is challenging, and the predictions are not always accurate. We aim to build a model that reduces misclassification and can accurately diagnose and classify diabetes. In this study, we investigate the use of Bayesian Logistic Regression (BLR) for mining such data to diagnose and classify various diabetes conditions. The approach is fully Bayesian and well suited to automated Markov Chain Monte Carlo (MCMC) simulation. Bayesian methods are useful for analysing medical data because they support rich hierarchical models, quantify uncertainty, and incorporate prior information. The analysis used a real medical dataset of 909 patients in Zakho city, with a binary class label and seven independent variables. Three prior distributions (Gaussian, Laplace and Cauchy) were investigated for the proposed model, which was fitted by MCMC. The performance and behaviour of the Bayesian approach were compared with those of traditional classification algorithms on this dataset using 10-fold cross-validation. Overall, classification under BLR with informative Gaussian priors performed better in terms of the accuracy metrics considered, giving an accuracy of 92.53%, a recall of 94.85%, a precision of 91.42% and an F1 score of 93.11%. These results suggest that it is worthwhile to explore the application of BLR with informative prior distributions to predictive modelling tasks in medical studies.
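To make the idea in the abstract concrete, the following is a minimal sketch of Bayesian logistic regression fitted by MCMC, with a choice of Gaussian, Laplace or Cauchy prior on the coefficients. It is not the paper's implementation: the sampler (random-walk Metropolis), the prior scale, the step size and the synthetic data standing in for the 909-patient dataset are all assumptions made for illustration, and the metrics are computed in-sample rather than by 10-fold cross-validation.

```python
import numpy as np

def log_posterior(beta, X, y, prior="gaussian", scale=2.5):
    """Unnormalised log posterior of the coefficients (scale is an assumed value)."""
    z = X @ beta
    # Bernoulli log-likelihood, using logaddexp for numerical stability
    log_lik = np.sum(y * z - np.logaddexp(0.0, z))
    if prior == "gaussian":
        log_prior = -0.5 * np.sum((beta / scale) ** 2)
    elif prior == "laplace":
        log_prior = -np.sum(np.abs(beta)) / scale
    elif prior == "cauchy":
        log_prior = -np.sum(np.log1p((beta / scale) ** 2))
    else:
        raise ValueError(f"unknown prior: {prior}")
    return log_lik + log_prior

def metropolis(X, y, prior="gaussian", n_iter=4000, step=0.1, seed=0):
    """Random-walk Metropolis sampler; returns one row of coefficients per iteration."""
    rng = np.random.default_rng(seed)
    beta = np.zeros(X.shape[1])
    current = log_posterior(beta, X, y, prior)
    draws = np.empty((n_iter, X.shape[1]))
    for i in range(n_iter):
        proposal = beta + step * rng.standard_normal(beta.shape)
        candidate = log_posterior(proposal, X, y, prior)
        # Accept with probability min(1, posterior ratio)
        if np.log(rng.uniform()) < candidate - current:
            beta, current = proposal, candidate
        draws[i] = beta
    return draws

def classification_metrics(y_true, y_pred):
    """Accuracy, recall, precision and F1 for binary labels coded 0/1."""
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    accuracy = float(np.mean(y_pred == y_true))
    recall = tp / max(tp + fn, 1)
    precision = tp / max(tp + fp, 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-12)
    return {"accuracy": accuracy, "recall": float(recall),
            "precision": float(precision), "f1": float(f1)}

# Synthetic stand-in data: an intercept plus seven predictors (hypothetical values)
rng = np.random.default_rng(42)
n = 300
X = np.column_stack([np.ones(n), rng.standard_normal((n, 7))])
true_beta = np.array([0.5, 1.5, -1.0, 0.8, 0.0, 0.0, 0.5, -0.7])
y = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-(X @ true_beta)))).astype(int)

draws = metropolis(X, y, prior="gaussian")
kept = draws[1000:]  # discard the first 1000 iterations as burn-in

# Posterior predictive probability: average the sigmoid over retained draws
probs = (1.0 / (1.0 + np.exp(-(X @ kept.T)))).mean(axis=1)
y_pred = (probs >= 0.5).astype(int)
metrics = classification_metrics(y, y_pred)
print(metrics)
```

Passing `prior="laplace"` or `prior="cauchy"` to `metropolis` swaps the prior while leaving the likelihood and sampler unchanged, which is the kind of comparison the abstract describes.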


Author Biography

  • Masoud M. Hassan, Dept. of Computer Science, Faculty of Science, University of Zakho, Kurdistan Region, Iraq (Masoud.hassan@uoz.edu.krd)

Published

2020-09-30

Section

Science Journal of University of Zakho

How to Cite

Hassan, M. M. (2020). A Fully Bayesian Logistic Regression Model for Classification of ZADA Diabetes Dataset. Science Journal of University of Zakho, 8(3), 105-111. https://doi.org/10.25271/sjuoz.2020.8.3.707
