Analysis and Classification of Autism Data Using Machine Learning Algorithms

Authors

  • Masoud Muhammed Hassan Dept. of Computer Science, University of Zakho, Duhok, 42001, Kurdistan Region, Iraq
  • Sulav Adil Taher Dept. of Statistics, University of Duhok, Duhok, Kurdistan Region, Iraq

DOI:

https://doi.org/10.25271/sjuoz.2022.10.4.1036

Keywords:

Autism, Supervised Learning, Classification, Artificial Neural Networks, PCA, SMOTE

Abstract

Autism is a neurodevelopmental disorder that affects children worldwide between the ages of 2 and 8 years. Children with autism have communication and social difficulties, and the current standardized clinical diagnosis of autism still relies on behaviour-based tests. The rapidly growing number of autistic patients in the Kurdistan Region of Iraq necessitates data on autism screening. However, such data are scarce, making extensive evaluations of autism screening procedures more difficult. For this purpose, we investigated the use of machine learning algorithms to help health practitioners decide whether a formal clinical diagnosis should be pursued. Data related to autism screening for young children were collected from 515 patients in Duhok city. Three classification algorithms, namely decision tree (DT), k-nearest neighbours (KNN), and artificial neural network (ANN), were applied to diagnose and predict autism using various rating scales. Before applying these classifiers, the newly obtained data set underwent several data preprocessing steps. Since our data are imbalanced and high-dimensional, we suggest combining SMOTE (Synthetic Minority Over-sampling Technique) and PCA (Principal Component Analysis) to improve the performance of the classification models. Experimental results showed that the combination of PCA and SMOTE improved classification performance. Moreover, ANN outperformed the other models in terms of accuracy and F1 score, suggesting that these classification methods could be used to diagnose autism in the future.
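
To make the oversampling step named in the abstract concrete, the following is a minimal sketch of SMOTE-style interpolation, written with the Python standard library only. It is not the authors' implementation: the function name `smote`, its parameters (`n_new`, `k`, `seed`), and the toy two-feature data are all assumptions made for illustration, and the real study used 515 screening records that are not reproduced here. Each synthetic point is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic minority samples (hypothetical helper).

    minority: list of feature vectors (lists of floats) from the minority class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy, made-up data: six majority samples and three minority samples.
majority = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [0.3, 0.3], [0.1, 0.0], [0.0, 0.3]]
minority = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.3]]

# Create enough synthetic points to match the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=2)
balanced_minority = minority + new_points
```

In practice a maintained implementation from an established library would normally be preferred over a hand-written one. Oversampling is usually applied to the training split only, so that synthetic points do not leak into the evaluation data.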

Author Biographies

Masoud Muhammed Hassan, Dept. of Computer Science, University of Zakho, Duhok, 42001, Kurdistan Region, Iraq (masoud.hassan@uoz.edu.krd)

Sulav Adil Taher, Dept. of Statistics, University of Duhok, Duhok, Kurdistan Region, Iraq (sulav.adeel@gmail.com)

Published

2022-11-07

How to Cite

Hassan, M. M., & Taher, S. A. (2022). Analysis and Classification of Autism Data Using Machine Learning Algorithms. Science Journal of University of Zakho, 10(4), 206–212. https://doi.org/10.25271/sjuoz.2022.10.4.1036

Section

Science Journal of University of Zakho