Secure Machine Learning Frameworks for Predicting Human Infections from Internet-Driven Health Data
Rawan Nassri Abulail, Associate Professor, Computer Science Department, Philadelphia University, Amman, Jordan. Email: rabulail@philadelphia.edu.jo. ORCID: 0009-0002-8168-2455
Mohammed Yaarub Al-Hadithi, College of Information Technology, Philadelphia University, Amman, Jordan. Email: 201910814@philadelphia3.onmicrosoft.com. ORCID: 0009-0003-9005-5143
Healthcare systems increasingly rely on intelligent computational tools to manage the growing complexity of diagnosing human infectious diseases, especially in environments where early clinical intervention is limited or delayed. Because many infections spread through close physical contact and present overlapping physiological symptoms, rapid and reliable identification of infected individuals is essential to prevent transmission, isolate high-risk cases, and protect healthy populations. With the expansion of internet-enabled health monitoring, smart devices now serve as continuous data sources, providing real-time physiological indicators that support large-scale diagnostic analytics. This study proposes a secure machine learning framework for detecting human infections using two curated datasets: the first contains physiological parameters, such as body temperature, blood oxygen saturation, and respiratory rate, from 941 infected individuals, while the second contains the same kinds of records from 940 non-infected individuals who present overlapping symptoms. Three classification models, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and Decision Tree, were trained and evaluated on the KNIME analytics platform, and Logistic Regression and Random Forest models were also tested to benchmark performance. Logistic Regression and Random Forest achieved the highest accuracy; of these, Random Forest was selected as the optimal model because of its superior sensitivity of 0.8275, which exceeded that of both SVM and Logistic Regression. The findings highlight the potential of secure, internet-driven machine learning systems to enhance early detection of human infections and support real-time clinical decision-making in distributed healthcare environments.
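The model-selection criterion mentioned in the abstract can be illustrated with a minimal Python sketch. Sensitivity is the fraction of truly infected cases that a classifier flags, TP / (TP + FN). The sketch ranks candidate models by accuracy and breaks ties by sensitivity, which is one plausible reading of the selection described above, not a confirmed account of the study's procedure. The function names, the model labels, and the eight toy labels are all hypothetical and are not drawn from the study's datasets or its KNIME workflow.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fn, fp, tn), treating label 1 as 'infected' and 0 as 'non-infected'."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Sensitivity (recall on the infected class): TP / (TP + FN)."""
    tp, fn, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Accuracy: (TP + TN) / total number of cases."""
    tp, fn, fp, tn = confusion_counts(y_true, y_pred)
    total = tp + fn + fp + tn
    return (tp + tn) / total if total else 0.0


def select_model(y_true: Sequence[int], predictions: Dict[str, Sequence[int]]) -> str:
    """Pick the model with the highest accuracy, breaking ties by sensitivity (assumed rule)."""
    return max(
        predictions,
        key=lambda name: (accuracy(y_true, predictions[name]),
                          sensitivity(y_true, predictions[name])),
    )


# Toy labels for illustration only; these are NOT the study's data.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
predictions = {
    "model_a": [1, 1, 1, 0, 0, 0, 0, 1],  # ties with model_b on accuracy, misses fewer infected cases
    "model_b": [1, 1, 0, 0, 0, 0, 0, 0],  # same accuracy, but misses half of the infected cases
    "model_c": [1, 0, 0, 0, 0, 0, 1, 1],  # lower accuracy
}

if __name__ == "__main__":
    for name, y_pred in predictions.items():
        print(name, accuracy(y_true, y_pred), sensitivity(y_true, y_pred))
    print("selected:", select_model(y_true, predictions))
```

Breaking ties on sensitivity rather than specificity reflects the screening setting described in the abstract, where a missed infection (a false negative) is costlier than a false alarm because the undetected case can continue to transmit.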