Machine Learning for Early Diabetes Detection and Diagnosis
Sofiene Mansouri, Associate Professor, Department of Biomedical Technology, College of Applied Medical Sciences in Al-Kharj, Prince Sattam Bin Abdulaziz University and University of Tunis El Manar, Higher Institute of Medical Technologies of Tunis, Laboratory of Biophysics and Medical Technologies. Email: s.mansouri@psau.edu.sa. ORCID: 0000-0002-6191-3095
Souhaila Boulares, University of Tunis El Manar, Higher Institute of Medical Technologies of Tunis, Laboratory of Biophysics and Medical Technologies. Email: Souhaila.boulares@istmt.utm.tn. ORCID: 0009-0008-3870-1642
Souhir Chabchoub, Assistant Professor, University of Tunis El Manar, Higher Institute of Medical Technologies of Tunis, Laboratory of Biophysics and Medical Technologies. Email: chabchoub_souhir@yahoo.fr. ORCID: 0000-0002-0683-7931
In this work, a machine learning (ML)-based e-diagnostic system is proposed specifically for the detection of gestational diabetes mellitus (GDM). Our goals were to review recent GDM data and to outline the close connection between GDM and prediabetic conditions, including the potential for a future worsening of insulin resistance and the emergence of overt Type 2 diabetes. The study explores the application of the K-nearest neighbors (KNN) algorithm to predict diabetes diagnosis on the widely used Pima Indians Diabetes Database. KNN, a non-parametric, instance-based learning method, was employed to classify individuals as either diabetic or non-diabetic. Our objectives were to evaluate the algorithm's ability to make accurate predictions and to explore the factors influencing its performance. The study began with data preprocessing, which included handling missing values, scaling features, and splitting the data into training and testing sets. The KNN classifier was then trained and tested using the best-fit parameters obtained from hyperparameter tuning. The resulting model predicted diabetes diagnosis with an accuracy of approximately 0.76. Classification performance was assessed using accuracy, precision, recall, and F1-score. The study also discusses the significance of hyperparameter tuning, data preprocessing, and imbalanced data handling in achieving optimal KNN model performance. Overall, this study shows how the KNN algorithm may be used to predict diabetes using the Pima Indians Diabetes Database. The findings suggest that KNN can serve as a viable tool in the early detection of diabetes, paving the way for more extensive applications in healthcare and predictive modelling.
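The pipeline summarized above (feature scaling, a train/test split, KNN classification, and evaluation by accuracy, precision, recall, and F1-score) can be illustrated with a minimal from-scratch sketch. This is not the authors' implementation. The two features (glucose, BMI), the toy records, the fixed value k = 3, and all function names are hypothetical and chosen only for illustration; in practice the full Pima Indians Diabetes Database and a library implementation with tuned k would be used.

```python
import math
from collections import Counter


def standardize(train, test):
    """Scale features to zero mean and unit variance using training statistics only."""
    n_features = len(train[0])
    means = [sum(row[j] for row in train) / len(train) for j in range(n_features)]
    stds = []
    for j in range(n_features):
        var = sum((row[j] - means[j]) ** 2 for row in train) / len(train)
        stds.append(math.sqrt(var) or 1.0)  # fall back to 1.0 for constant features

    def scale(rows):
        return [[(r[j] - means[j]) / stds[j] for j in range(n_features)] for r in rows]

    return scale(train), scale(test)


def knn_predict(X_train, y_train, x, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(X_train, y_train), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1-score for binary labels (1 = diabetic)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy records: [glucose, BMI]; label 1 = diabetic, 0 = non-diabetic.
X_train = [[85, 22.0], [90, 24.5], [95, 23.1], [100, 26.0],
           [150, 33.0], [160, 35.5], [170, 31.2], [180, 38.0]]
y_train = [0, 0, 0, 0, 1, 1, 1, 1]
X_test = [[92, 25.0], [165, 34.0], [110, 27.0], [155, 30.0]]
y_test = [0, 1, 0, 1]

K = 3  # assumed value; an odd k avoids tied votes in binary classification

X_train_s, X_test_s = standardize(X_train, X_test)
predictions = [knn_predict(X_train_s, y_train, x, K) for x in X_test_s]
metrics = classification_metrics(y_test, predictions)
print(predictions, metrics)
```

Scaling matters here because KNN relies on distances: without it, a feature with a large numeric range such as glucose would dominate the vote. The scaling statistics are computed from the training set alone so that no information from the test set leaks into the model.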