Comparative Analysis of Machine Learning Algorithms to Predict Type II Diabetes

Authors

  • Arun Singh Yadav, Department of Computer Science, University of Lucknow, Lucknow, India
  • Puneet Misra, Department of Computer Science, University of Lucknow, Lucknow, India

DOI:

https://doi.org/10.70454/IJMRE.2024.40106

Keywords:

Machine learning, Disease prediction, Classification, Logistic regression

Abstract

Machine learning (ML) models are becoming more robust and accurate as the amount and quality of training data rapidly increase. Researchers are proposing complex models for real-life problems to achieve higher accuracy, and these models require substantial computing and other resources. In healthcare, disease diagnosis, detection, and prediction remain a challenge. Early diagnosis of a disease or ailment helps in timely recovery. Moreover, because health is central to every individual, a great deal of work in this field aims to improve outcomes by using all available information.

This paper experiments on the Pima Indian Diabetes Dataset (PIDDS) in two stages, A and B. The main objective of the study is to review the accuracy of the applied machine learning algorithms and analyze their efficiency in prediction. Another essential objective is to show the efficacy of simpler models. Fields such as computer vision and NLP have given rise to deep learning, whose complex, computationally expensive models have set a trend of applying them in almost every field. While such models help where data is abundant and relationships are complex, simpler models can still perform very well and can at times rival these much larger models. We also applied preprocessing methods (imputation, feature selection, scaling, and discretization) to improve classification accuracy. The algorithms selected for this problem are Logistic Regression (LR), Artificial Neural Networks (ANN), Support Vector Machine (SVM), Naïve Bayes (NB), and Decision Tree (DT). LR provided the best accuracy, and the remaining models performed very close to one another.
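As an illustration of three of the preprocessing steps named in the abstract (imputation, scaling, and discretization), the following is a minimal sketch using only the Python standard library. It is not the authors' pipeline. The toy rows, the treatment of 0 as a missing reading, the choice of median imputation, min-max scaling, and three equal-width bins, and all function names are assumptions made for illustration only.

```python
from statistics import median

# Hypothetical toy rows (glucose, BMI); these are NOT taken from the dataset.
# Assumption: a value of 0 stands in for a missing reading.
rows = [
    (148.0, 33.6),
    (85.0, 26.6),
    (0.0, 23.3),
    (89.0, 0.0),
    (137.0, 43.1),
]


def impute_median(column, missing=0.0):
    """Replace entries equal to `missing` with the median of the observed values."""
    observed = [v for v in column if v != missing]
    fill = median(observed)
    return [fill if v == missing else v for v in column]


def min_max_scale(column):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to all zeros.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def discretize(column, bins=3):
    """Map values in [0, 1] to equal-width bin indices 0..bins-1."""
    return [min(int(v * bins), bins - 1) for v in column]


# Work column by column: transpose the rows, then apply each step in order.
columns = [list(col) for col in zip(*rows)]
imputed = [impute_median(col) for col in columns]
scaled = [min_max_scale(col) for col in imputed]
binned = [discretize(col) for col in scaled]

for name, col in zip(("glucose", "bmi"), binned):
    print(name, col)
```

In a real experiment the imputation and scaling statistics would normally be computed on the training split only and then reused on the test split, so that no information from the test data leaks into the preprocessing.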

Published

2024-03-30

Section

Articles

How to Cite

Comparative Analysis of Machine Learning Algorithms to Predict Type II Diabetes. (2024). International Journal of Multidisciplinary Research and Explorer, 4(1), 53-67. https://doi.org/10.70454/IJMRE.2024.40106