AutoML for Healthcare & Clinical Notes Analysis

JADBio recommended tool to use to identify diseases

JADBio was recently mentioned in the research paper Automated Machine Learning for Healthcare and Clinical Notes Analysis, which recommends it for identifying diseases. The paper was published by MDPI, a publisher of peer-reviewed, open-access scientific journals whose mission is to foster open scientific exchange in all forms across all disciplines, and which has supported academic communities since 1996.

The researchers covered several AutoML platforms, including Google’s AutoML, Auto-Sklearn, Auto-Keras, TPOT, and JADBio. Among tools built specifically for the healthcare industry, apart from JADBio, they also looked at AutoPrognosis, which is free but requires coding knowledge.

Regarding the JADBio platform, the paper notes: “…[JADBio] has built an AutoML system that is specialized in bioinformatic applications and translational medicine. Their platform, named just add data bio (JADBIO), also has built-in predictive and diagnostic clinical models. JADBIO works by using biosignatures of dataset features and can interpret and visualize results. It can work with a dataset including only small number of records, as few as 25, while being also capable of processing high-dimensional datasets of hundreds to thousands of features. Feature engineering, algorithm selection, and hyperparameter options and scopes are identified by algorithm and the hyperparameter space (AHPS) method. AHPS uses parameters such as dataset size, feature dimensionality, and targeted value type to identify a list of relevant algorithms, and to identify a list of methodologies for feature selection and data preprocessing, as well as to define hyperparameter scope.

AHPS’s output is fed to a configuration generator (CG) to generate a list of pipelines with available hyperparameters. Then, configuration evaluation protocol (CEP) uses k-fold cross validation to determine the best data preprocessing methods, feature engineering algorithms, and hyperparameters, and to assess the model’s performance. CEP selection is then applied to the original dataset and predictive models are built.

Figure 3 shows how JADBIO AutoML model works: Dataset meta features are fed into the algorithm and hyperparameters space (AHPS), which feeds the configuration generator (CG) with a list of feature selection and data preprocessing methodologies, relevant algorithms list, and hyperparameters scope, then configuration evaluation protocol (CEP) finds the best machine learning model with the best performance.

Tsamardinos et al. have used multiple ML algorithms for building JADBIO. These include linear ridge regression, SVM, decision tree (DT), random forests (RF), and Gaussian kernel SVMs. JADBIO has been also compared to Auto-Sklearn system using 748 datasets. Auto-Sklearn failed to process around 39.44% of the datasets due to timeout and internal errors, yet JADBIO’s performance was close to Auto-Sklearn’s performance for the remaining datasets.”
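
To make the AHPS, CG, and CEP steps quoted above more concrete, here is a minimal Python sketch of that three-stage flow. It is an illustration under loud assumptions, not JADBio's implementation: the function names (`ahps`, `generate_configurations`, `cep`), the rules that map meta-features to a search space, the toy nearest-neighbour model, the mean-difference feature filter, and the synthetic data are all hypothetical and were chosen only so the example runs with the standard library.

```python
import itertools
import random
import statistics


def ahps(n_samples, n_features, target_type):
    """AHPS stand-in: map dataset meta-features to a search space (made-up rules)."""
    if target_type != "binary":
        raise ValueError("this sketch only covers binary classification")
    # Assumption: with few samples, keep few features and few neighbours.
    max_keep = max(1, min(n_features, n_samples // 5))
    return {
        "n_keep": sorted({1, max_keep}),
        "k_neighbors": [k for k in (1, 3, 5) if k < n_samples // 2],
    }


def generate_configurations(space):
    """CG stand-in: expand the search space into concrete pipeline configurations."""
    keys = sorted(space)
    combos = itertools.product(*(space[key] for key in keys))
    return [dict(zip(keys, values)) for values in combos]


def select_features(X, y, n_keep):
    """Keep the n_keep features with the largest gap between class means."""
    def gap(j):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        if not pos or not neg:
            return 0.0
        return abs(statistics.mean(pos) - statistics.mean(neg))
    return sorted(range(len(X[0])), key=gap, reverse=True)[:n_keep]


def fit(X, y, config):
    """Run feature selection, then return a k-nearest-neighbour predict function."""
    features = select_features(X, y, config["n_keep"])
    train = [([row[j] for j in features], label) for row, label in zip(X, y)]

    def predict(row):
        point = [row[j] for j in features]
        nearest = sorted(
            train, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], point))
        )
        votes = [label for _, label in nearest[: config["k_neighbors"]]]
        return int(sum(votes) * 2 > len(votes))

    return predict


def cross_val_accuracy(X, y, config, k, seed=0):
    """Mean held-out accuracy of one configuration over k folds."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx], config)
        correct = sum(model(X[i]) == y[i] for i in held_out)
        scores.append(correct / len(held_out))
    return statistics.mean(scores)


def cep(X, y, configs, k=5):
    """CEP stand-in: pick the best configuration by k-fold CV, then refit on all data."""
    best = max(configs, key=lambda c: cross_val_accuracy(X, y, c, k))
    return best, fit(X, y, best)


# Synthetic data: 30 samples, 10 features, only feature 0 carries signal.
rng = random.Random(42)
y = [i % 2 for i in range(30)]
X = [[label * 3.0 + rng.gauss(0, 0.5)] + [rng.gauss(0, 1) for _ in range(9)]
     for label in y]

space = ahps(len(X), len(X[0]), "binary")
configs = generate_configurations(space)
best_config, model = cep(X, y, configs)
print(best_config, model(X[0]))
```

Feature selection happens inside `fit`, so it is redone on each training fold. That keeps the held-out fold from influencing which features are chosen, which matters most when samples are few and features are many.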

They go on to mention Karaglani et al.’s research on predicting Alzheimer’s disease from blood samples: “Instead of using general AutoML platforms, some researchers have used the AutoML systems specifically built for medical data. For instance, Karaglani et al. [14] used JADBIO to diagnose Alzheimer disease using blood-based diagnostic biosignatures. Their datasets consisted of low-sample omics data with high-dimensional features. They tested seven datasets with different biosignatures: two metabolomic datasets, one proteomic dataset, and four transcriptomic datasets. Sample numbers were between 30 and 589, while the number of features was between 25 to 38,327 features. They used area under the curve (AUC) to evaluate predicted results and got AUC accuracy between 0.489 and 0.975, with an average AUC of 0.759”.
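
For readers less familiar with the AUC metric in the quote above, the short Python sketch below computes it from its pairwise definition: the fraction of (positive, negative) sample pairs in which the positive sample receives the higher score, with ties counted as half. An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The function name `roc_auc` and the example labels and scores are illustrative assumptions, not data from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # A correctly ordered pair counts as a win; a tie counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: three of the four positive/negative pairs are ordered correctly.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(example_auc)  # 0.75
```

The pairwise form is quadratic in the number of samples, which is fine for sample counts like those in the quoted study; larger datasets would normally use a rank-based computation instead.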

So how did JADBio score?

While the authors discuss purpose-built medical AutoML solutions like JADBio, they conducted their own research and comparison on out-of-the-box solutions such as Google AutoML and Auto-Sklearn, comparing the efficiency of the platforms on structured data and on unstructured data such as images. About JADBio specifically, they had this to say:

For a small number of records in high-dimensional datasets, JADBIO would be the recommended tool to use to identify diseases. For instance, it was used to identify Alzheimer and Parkinson diseases.

Automated Machine Learning for Healthcare and Clinical Notes Analysis

The paper is a validation for JADBio and all the work we’re doing to bring the medical community a tool that makes their work faster and more efficient. Diseases and treatments can be discovered faster when clinicians and bioinformaticians have automated tools like JADBio in their hands. The paper lists us under the structured data category, but in the coming months we will also be announcing the analysis of medical images. Stay tuned by signing up for our newsletter or visiting us here again.


Abstract
Machine learning (ML) has been slowly entering every aspect of our lives, and its positive impact has been astonishing. To accelerate embedding ML in more applications and incorporate it in real-world scenarios, automated machine learning (AutoML) is emerging. The primary purpose of AutoML is to provide seamless integration of ML in various industries, which will facilitate better outcomes in everyday tasks. AutoML has already been applied to more accessible settings with structured data such as tabular lab data in healthcare. However, there is still a need for applying AutoML for interpreting the medical text, which is being generated at a tremendous rate. For this to happen, a promising method is AutoML for clinical notes analysis, an unexplored research area representing an ML research gap. This paper aims to fill this gap and provide a comprehensive survey and analytical study towards AutoML for clinical notes. To that end, we first introduce AutoML technology and review its various tools and techniques. We then survey the literature of AutoML in the healthcare industry and discuss the developments specific to clinical settings and those using general AutoML tools for healthcare applications. With this background, we then discuss the challenges of working with clinical notes and highlight the benefits of developing AutoML for medical notes processing. Next, we survey relevant ML research for clinical notes and analyze the literature and the field of AutoML in the healthcare industry. Furthermore, we propose future research directions and shed light on the challenges and opportunities this emerging field holds. With this, we aim to assist the community with implementing an AutoML platform for medical notes, which, if realized, can revolutionize patient outcomes.

Find the full text here: www.mdpi.com/2073-431X/10/2/24/htm

(The paper was published in the Special Issue Artificial Intelligence for Health)

JADBio makes it easy and affordable for health-data analysts and life science professionals to use data science to discover knowledge while reducing time and effort by combining a robust end-to-end machine learning platform with a wealth of capabilities.
Grab a FREE Basic plan here – What are you waiting for?