APPLIED MATHEMATICAL TECHNIQUES AND MACHINE LEARNING FOR IDENTIFYING PREDICTORS OF TYPE 2 DIABETES
Valbona MAZLAMI, Vesna ANTONSKA KNIGHTS
Abstract
Aim: To identify key predictors of type 2 diabetes mellitus (T2DM) by applying mathematical and machine learning techniques to clinical health data collected in North Macedonia.
Method: Data were drawn from 723 clinical records collected over 9 months at the Clinical Hospital “Mother Teresa” in Skopje and included patient demographics, BMI, lipid profile, HbA1c, and hypertension status. A hybrid analytical framework was developed that integrates exploratory data analysis, statistical correlation testing, and three supervised machine learning classifiers: logistic regression, k-nearest neighbors (KNN), and decision trees. Model performance was evaluated using accuracy, precision, recall, F1-score, and ROC curves.
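As an illustration of the evaluation metrics named in the Method, the following minimal sketch computes accuracy, precision, recall, F1-score, and the area under the ROC curve for a binary classifier. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are hypothetical and chosen only to make the arithmetic easy to follow. The AUC is computed through its rank interpretation, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = diabetic, 0 = non-diabetic (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]  # predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

The pairwise AUC loop is quadratic in the number of records, which is acceptable for a dataset of a few hundred cases; a sort-based rank formulation would be the usual choice for larger samples.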
Results: Elevated HbA1c and high BMI were each significantly associated with a diabetes diagnosis. Logistic regression performed best, with 80% accuracy and an AUC of 0.85. Across models, HbA1c, age, and BMI consistently emerged as the most influential predictors of T2DM. Hypertension and abnormal lipid risk were also positively associated with diabetes outcomes.
Conclusions: This research confirms that machine learning (ML) models, especially logistic regression, can effectively predict diabetes risk using a small set of metabolic and anthropometric indicators. Integrating statistical analysis with ML techniques improves the detection of early warning signs of T2DM and provides a foundation for future clinical decision-support tools.
Pages: 89 - 99