Analysis and Classification of Autism Data Using Machine Learning Algorithms
DOI:
https://doi.org/10.25271/sjuoz.2022.10.4.1036
Keywords:
Autism, Supervised Learning, Classification, Artificial Neural Networks, PCA, SMOTE
Abstract
Autism is a neurodevelopmental disorder that affects children worldwide between the ages of 2 and 8 years. Children with autism have communication and social difficulties, and the current standardized clinical diagnosis of autism still relies on behaviour-based tests. The rapidly growing number of autistic patients in the Kurdistan Region of Iraq necessitates the collection of relevant data. However, such data are scarce, making extensive evaluations of autism screening procedures more difficult. For this purpose, this study investigated the use of machine learning algorithms to help health practitioners decide whether a formal clinical diagnosis should be pursued. Data related to autism screening of young children were collected from 515 patients in Dohuk city. Three classification algorithms, namely decision tree (DT), k-nearest neighbours (KNN), and artificial neural network (ANN), were applied to diagnose and predict autism using various rating scales. Before applying these classifiers, the newly obtained data set underwent several preprocessing steps. Since our data are imbalanced and high-dimensional, we suggest combining SMOTE (Synthetic Minority Over-sampling Technique) and PCA (Principal Component Analysis) to improve the performance of the classification models. Experimental results showed that the combination of PCA and SMOTE improved classification performance. Moreover, ANN outperformed the other models in terms of accuracy and F1 score, suggesting that these classification methods could be used to diagnose autism in the future.
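The oversampling idea behind SMOTE can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not describe. The function name `smote`, its parameters, and the toy data are hypothetical and serve only to show how synthetic minority samples are interpolated between a minority point and one of its nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours.

    Hypothetical helper for illustration; uses only the standard library.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than exist
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Made-up toy data: 3 minority samples against an assumed 6 majority samples.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1]]
majority_count = 6
new_points = smote(minority, n_new=majority_count - len(minority), k=2)
balanced_minority = minority + new_points
print(len(balanced_minority))  # prints 6
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class. In practice, oversampling of this kind is usually applied to the training split only, so that synthetic samples do not leak into the evaluation data.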
License
Copyright (c) 2022 Masoud Muhammed Hassan, Sulav Adil Taher
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License [CC BY-NC-SA 4.0] that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work, with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online.