CASE STUDY

Outcome prediction for venous thromboembolism & cancer

Outcome Prediction in Critically-ill Patients with Venous Thromboembolism and/or Cancer Using Machine Learning Algorithms: External Validation and Comparison with Scoring Systems

Vasiliki Danilatou, Sphynx Technology Solutions, Zug, School of Medicine, European University of Cyprus, Nicosia; Stylianos Nikolakakis, School of Electrical and Computer Engineering, Technical University of Crete; Despoina Antonakaki, Christos Tzagkarakis, Dimitrios Mavroidis, Institute of Computer Science (ICS)-Foundation for Research and Technology-Hellas (FORTH); Theodoros Kostoulas, Department of Information and Communication Systems Engineering, School of Engineering, University of the Aegean; Sotirios Ioannidis, School of Electrical and Computer Engineering, Technical University of Crete, Institute of Computer Science (ICS)-Foundation for Research and Technology-Hellas (FORTH)

Digital Library: https://www.mdpi.com/1422-0067/23/13/7132

Abstract

Intensive care unit (ICU) patients with venous thromboembolism (VTE) and/or cancer suffer from high mortality rates. Mortality prediction in the ICU has been a major medical challenge, for which several scoring systems exist but lack specificity. This study focuses on two target groups, namely patients with thrombosis or cancer. The main goal is to develop and validate interpretable machine learning (ML) models to predict early and late mortality, while exploiting all available data stored in the medical record.

Methods: To this end, retrospective data from two freely accessible databases, MIMIC-III and eICU, were used. Well-established ML algorithms were implemented utilizing automated and purposely built ML frameworks for addressing class imbalance.

Results: Prediction of early mortality showed excellent performance in both disease categories, in terms of the area under the receiver operating characteristic curve (AUC–ROC): VTE-MIMIC-III 0.93, eICU 0.87, cancer-MIMIC-III 0.94. On the other hand, late mortality prediction showed lower performance, i.e., AUC–ROC: VTE 0.82, cancer 0.74–0.88. The predictive model of early mortality developed from 1651 VTE patients (MIMIC-III) ended up with a signature of 35 features and was externally validated in 2659 patients from the eICU dataset. The implemented model outperformed traditional scoring systems in predicting early as well as late mortality. Novel biomarkers, such as red cell distribution width, were identified.
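For readers unfamiliar with the metric, AUC–ROC can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor. The following is a minimal Python sketch of that pairwise definition. It is an illustration only, not the study's evaluation code; the function name `auc_roc` and the toy labels and scores are assumptions made for this example.

```python
def auc_roc(labels, scores):
    """AUC-ROC from its pairwise definition.

    labels: 1 for the positive class (e.g. died), 0 for the negative class.
    scores: predicted risk, higher meaning more likely positive.
    Tied positive/negative pairs count as half a correct ordering.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    correctly_ordered = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correctly_ordered += 1.0
            elif p == n:
                correctly_ordered += 0.5
    return correctly_ordered / (len(pos) * len(neg))


# Toy data (hypothetical): 3 positives, 3 negatives.
toy_labels = [1, 1, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.4]
toy_auc = auc_roc(toy_labels, toy_scores)  # 8 of 9 pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the figures above should be read.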

Conclusions: Prediction of in-hospital mortality in patients with thrombosis or cancer is highly feasible, whereas prediction of late mortality is a more difficult and complex task. The results of this study are promising and, most importantly, interpretable, since the predictive features included in the model were clinically meaningful. The discovery of novel biomarkers, such as RDW and eosinophils, and their incorporation into the traditional clinical scores could possibly refine their performance.

How was JADBio used?

In this study, machine learning was used to produce predictive models of early and late mortality for patients with thrombosis or cancer. The JADBio AutoML tool was used alongside a custom ML framework based on XGBoost to address the class imbalance problem. Performance metrics, feature discriminative analysis, comparison with conventional scores, and validation of the model are reported in detail.

JADBio AutoML – Prediction of Early and Late Mortality in ICU Patients with Thrombosis

Detailed metrics of performance for prediction of early and late mortality of ICU patients with thrombosis using all features. (Table 4)

Custom Machine Learning Framework: XGBoost
As with the JADBio AutoML framework, the best-performing model among the various feature groups was the one trained on the dataset containing all features.

Detailed metrics of performance for prediction of early mortality of ICU patients with thrombosis using all features. (Table 6)

To construct a robust model, two different ML strategies were trained: an AutoML approach and a custom approach based on the XGBoost algorithm [50]. The primary aim of using the two approaches was to address the class imbalance through an oversampling method. However, the accuracy appeared to remain constant despite the adoption of the SMOTE oversampling technique. Moreover, experimenting with and extracting the best-performing ML model in the custom approach is time-consuming, since it requires substantial human and computational effort, artificial intelligence expertise, and extensive tuning of hyper-parameters; for this reason, automated ML tools are becoming popular among non-specialists in this area.
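The idea behind SMOTE oversampling can be illustrated with a short Python sketch: each synthetic minority-class sample is placed at a random point on the line segment between an existing minority sample and one of its nearest minority-class neighbours. This is a simplified illustration under stated assumptions, not the authors' pipeline; the function name `smote_sketch`, its parameters, and the toy points are hypothetical. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, would normally be used.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic samples by SMOTE-style interpolation.

    minority: list of feature vectors (lists of floats) from the minority class.
    k: number of nearest minority neighbours considered for each sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    synthetic = []
    for _ in range(n_new):
        # Pick a minority sample at random.
        i = rng.randrange(len(minority))
        x = minority[i]
        # Rank the other minority samples by squared Euclidean distance to x.
        others = [m for j, m in enumerate(minority) if j != i]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)))
        # Choose one of the k nearest neighbours and interpolate towards it.
        neighbour = rng.choice(others[:k])
        u = rng.random()
        synthetic.append([a + u * (b - a) for a, b in zip(x, neighbour)])
    return synthetic


# Toy minority class (hypothetical): four points on the unit square.
toy_minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_synthetic = smote_sketch(toy_minority, n_new=10, k=2, seed=1)
```

Because every synthetic point is an interpolation between two real minority samples, the technique adds no information from outside the observed minority region, and oversampling should be applied to the training split only so that evaluation data remain untouched.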
