CASE STUDY

Feature Signature Discovery for Autism Detection: An Automated Machine Learning Based Feature Ranking Framework

Shomona Gracia Jacob, Bensujin Bennet, University of Technology and Applied Sciences, Nizwa; Majdi Mohammed Bait Ali Sulaiman, University of Technology and Applied Sciences, Salalah

Digital Library: https://doi.org/10.1155/2023/6330002

Abstract

Autism spectrum disorder is the most widely used umbrella term for a myriad of neuro-degenerative/developmental conditions typified by inappropriate social behavior, lack of communication/comprehension skills, and restricted mental and emotional maturity. What makes this disorder intriguing is that it can be detected only by close monitoring of developmental milestones after childbirth. Moreover, the exact causes of this neurodevelopmental condition are still unknown, and autism is prevalent across individuals irrespective of ethnicity, genetic/familial history, and economic/educational background. Although research suggests that autism is genetic in nature and that early detection can greatly enhance the independence and societal adaptability of affected individuals, there is still a dearth of evidence to support these statements with proven facts and figures. This research work emphasizes the application of automated machine learning combined with feature ranking techniques to generate significant feature signatures for the early detection of autism. Publicly available datasets based on the Q-CHAT scores of individuals across diverse age groups (toddlers, children, adolescents, and adults) have been employed in this study. A machine learning framework based on automated hyperparameter optimization is proposed to rank the potential nonclinical markers for autism. The study also ranks the AutoML models by Matthews correlation coefficient (MCC) and balanced accuracy, through which nonclinical markers were identified from these datasets. The feature signatures and their significance in distinguishing between classes are reported for the first time in autism detection.

Methods

A machine learning-based computational framework that reveals the potential nonclinical markers for Autism Spectrum Disorders (ASD) would enable even a person without medical training to recognize the possibility of autism in a child in their care and seek early medical advice. This research work focuses on the following main objectives: (i) propose an AutoML-based computational framework that combines the best feature ranking and classification approach to achieve high classification accuracy; (ii) identify the role of potential nonclinical markers, ordered by importance (feature signatures), that can detect autism with minimal yet significant information; and (iii) compare the use of traditional, deep learning, and AutoML techniques in classifying autism from nonclinical data.

This study concentrates on four autism datasets: (i) the child autism dataset (UCI), (ii) the adolescent autism dataset (UCI), (iii) the adult dataset (UCI), and (iv) the toddler dataset (Kaggle).

One of the objectives of this research is to find the most crucial questions/observations that could lead to early and accurate detection of autism in a noninvasive and less stigmatizing manner.

Results

The proposed framework yielded ∼90% MCC and ∼95% balanced accuracy across all four age groups of autism datasets. Deep learning approaches have reached a maximum of 92.7% accuracy on the same datasets, but they are limited in their ability to extract significant markers, have not reported MCC for unbalanced data, and cannot adapt automatically to new data entries. By contrast, AutoML approaches are more flexible, easier to implement, and provide automated optimization, thereby yielding the highest accuracy with minimal user intervention.
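
To make the two reported metrics concrete, the following is a minimal sketch of how MCC and balanced accuracy are computed from a binary confusion matrix. The labels and predictions are invented toy values, not data from the study, and the function names are illustrative only. The example shows why both metrics are preferred over plain accuracy when the classes are unbalanced.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def matthews_corrcoef(tp, fp, tn, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is taken as 0 when any marginal count is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity (recall on positives) and specificity."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return (sensitivity + specificity) / 2


# Toy, unbalanced example (hypothetical values): 10 positives, 90 negatives.
# The classifier finds only half of the positives but no false alarms.
y_true = [1] * 10 + [0] * 90
y_pred = [1] * 5 + [0] * 5 + [0] * 90

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)        # 0.95, looks excellent
bal_acc = balanced_accuracy(tp, fp, tn, fn)  # 0.75, exposes missed positives
mcc = matthews_corrcoef(tp, fp, tn, fn)      # about 0.69

print(f"accuracy={accuracy:.2f} balanced_accuracy={bal_acc:.2f} mcc={mcc:.2f}")
```

In this toy case plain accuracy is 0.95 even though half of the positive cases are missed, while balanced accuracy and MCC are noticeably lower, which is the behavior that motivates reporting them for unbalanced screening data.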

Conclusions

This research work fuses the competence of automated machine learning and computational intelligence to discover highly predictive features for autism that would enable possible early detection of the disorder.

How was JADBio used?

The analysis began with feature selection, for which the AutoML in JADBio applied the SES (statistically equivalent signature) and LASSO (least absolute shrinkage and selection operator) methods.
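
As a rough illustration of how LASSO performs feature selection, the sketch below fits a least-squares LASSO by coordinate descent with soft-thresholding on a tiny invented dataset. This is not JADBio's implementation, and it uses a regression objective for simplicity (a classification setting would typically use an L1-penalized logistic model); the function names, the penalty value, and the data are all assumptions. SES is not sketched here.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n) * ||y - Xw||^2 + lam * ||w||_1 by cyclic updates."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(w[k] * X[i][k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return w


# Toy data (hypothetical): the target depends only on the first feature;
# the second feature is uncorrelated with both the target and feature 0.
X = [[1, 1], [-1, 1], [1, -1], [-1, -1]]
y = [2, -2, 2, -2]

weights = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]

print("weights:", weights)
print("selected feature indices:", selected)
```

The L1 penalty drives the weight of the uninformative feature to exactly zero, so the surviving nonzero weights form the selected subset; the informative weight is also shrunk below its unpenalized value, which is the "shrinkage" half of the method's name.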

Feature selection was followed by classification. The best-performing model was a support vector machine (SVM) of type C-SVC with a linear, polynomial, or radial basis function kernel, with the cost and gamma hyperparameters both set to 1.
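
For readers unfamiliar with the three kernel choices, here is a small sketch of the kernel functions in their common LIBSVM-style form, evaluated with gamma set to 1 as stated above. The polynomial degree (3) and the constant term coef0 (0) are assumptions, since the case study does not report them, and the two binary vectors are invented stand-ins for questionnaire answers. The cost hyperparameter does not appear here because it weights misclassification in the training objective and is not part of the kernel itself.

```python
import math


def dot(x, z):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def linear_kernel(x, z):
    """k(x, z) = x . z"""
    return dot(x, z)


def polynomial_kernel(x, z, gamma=1.0, coef0=0.0, degree=3):
    """k(x, z) = (gamma * x . z + coef0) ** degree (degree, coef0 assumed)."""
    return (gamma * dot(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=1.0):
    """k(x, z) = exp(-gamma * ||x - z||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


# Two hypothetical respondents with four binary screening answers each.
x = [1, 0, 1, 1]
z = [1, 1, 0, 1]

k_linear = linear_kernel(x, z)           # 2 shared positive answers
k_poly = polynomial_kernel(x, z)         # (1 * 2 + 0) ** 3
k_rbf = rbf_kernel(x, z)                 # exp(-1 * 2), answers differ twice

print(k_linear, k_poly, k_rbf)
```

Each kernel is a different similarity measure between two respondents: the linear kernel counts shared positive answers, the polynomial kernel raises that count to a power so that combinations of answers matter, and the radial basis function kernel decays with the number of differing answers.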

The comparison of results in Table 5 shows that the AutoML methods are the most suitable for improving predictor performance on both balanced and unbalanced data.

Comparative study on the performance of AutoML models with previous work (Table 5)
