Hong Tao (洪涛)

Department of Cardiovascular Surgery, Fuwai Hospital Chinese Academy of Medical Sciences, Shenzhen

Development and validation of a machine learning-based prognostic risk stratification model for acute ischemic stroke.

Acute ischemic stroke (AIS) is one of the most prevalent causes of serious long-term disability worldwide. Accurate prediction of stroke prognosis is highly valuable for effective intervention and treatment. The present retrospective study therefore aims to provide a reliable machine learning-based model for prognosis prediction in AIS patients. Data from AIS patients were collected retrospectively from the Second Affiliated Hospital of Xuzhou Medical University between August 2017 and July 2019. Independent prognostic factors were identified by univariate and multivariate logistic regression analysis and used to develop machine learning (ML) models. ML model performance was assessed by the area under the receiver operating characteristic curve (AUC) and radar plots. Shapley Additive Explanations (SHAP) values were used to interpret the importance of all features included in the predictive model. A total of 677 AIS patients were included in the present study. Poor prognosis was observed in 209 patients (30.9%). Six variables, namely neuron-specific enolase (NSE), homocysteine (HCY), S-100β, dysphagia, C-reactive protein (CRP), and anticoagulation, were included to establish the ML models. Six different ML algorithms were tested, and the Random Forest (RF) model was selected as the final predictive model, with the highest AUC of 0.908. Moreover, according to the SHAP results, NSE influenced the predictive model the most, followed by HCY, S-100β, dysphagia, CRP, and anticoagulation. Based on the RF model, an online tool was constructed to predict the prognosis of AIS patients and assist clinicians in optimizing patient treatment. The present study revealed that NSE, HCY, CRP, S-100β, anticoagulation, and dysphagia were important factors for poor prognosis in AIS patients. ML algorithms were used to develop models for predicting the prognosis of AIS patients, with the RF model presenting the optimal performance.

4.6
Zone 2

Scientific reports 2023
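
The AUC used to rank the six models in the abstract above can be read as the probability that a randomly chosen poor-prognosis patient receives a higher predicted risk than a randomly chosen good-prognosis patient. The sketch below computes it that way in plain Python. The function name `auc` and the toy labels and scores are hypothetical illustrations, not data or code from the study.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = poor prognosis, 0 = good prognosis.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
print(round(auc(toy_labels, toy_scores), 3))
```

The double loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a rank-based formulation would be the usual choice for larger data.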

Incorporation of a machine learning pathological diagnosis algorithm into the thyroid ultrasound imaging data improves the diagnosis risk of malignant thyroid nodules.

Objective: This study aimed to establish a new model to predict malignant thyroid nodules using machine learning algorithms. Methods: A retrospective study was performed on 274 patients with thyroid nodules who underwent fine-needle aspiration (FNA) cytology or surgery from October 2018 to 2020 in Xianyang Central Hospital. Least absolute shrinkage and selection operator (LASSO) regression analysis and logistic regression analysis were applied to screen and identify variables. Six machine learning algorithms, namely Decision Tree (DT), Extreme Gradient Boosting (XGBoost), Gradient Boosting Machine (GBM), Naive Bayes Classifier (NBC), Random Forest (RF), and Logistic Regression (LR), were employed and compared in constructing the predictive model from preoperative clinical characteristics and ultrasound features. Internal validation was performed using 10-fold cross-validation. The performance of the model was measured by the area under the receiver operating characteristic curve (AUC), accuracy, precision, recall, F1 score, Shapley Additive Explanations (SHAP) plot, feature importance, and correlation of features. The best cutoff value for risk stratification was identified by the probability density function (PDF) and clinical utility curve (CUC). Results: The malignancy rate of thyroid nodules in the study cohort was 53.2%. The predictive models were constructed using age, margin, shape, echogenic foci, echogenicity, and lymph nodes. The XGBoost model was significantly superior to the other machine learning models, with an AUC value of 0.829. According to the PDF and CUC, we recommended that a 51% probability be used as the threshold for risk stratification of malignant nodules, at which about 85.6% of patients with malignant nodules could be detected. Meanwhile, approximately 89.8% of unnecessary biopsy procedures would be avoided. Finally, an online web risk calculator was built to estimate the individual likelihood of malignant thyroid nodules based on the best-performing ML-based model, XGBoost. Conclusions: By combining clinical characteristics and ultrasound image features, ML algorithms can achieve reliable prediction of malignant thyroid nodules. The online web risk calculator based on the XGBoost model can easily identify the probability of malignant thyroid nodules in real time, which can assist clinicians in formulating individualized management strategies for patients.

4.7
Zone 3

Frontiers in oncology 2022
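
The abstract above recommends a 51% predicted probability as the cutoff for risk stratification and reports the share of malignant nodules detected at that cutoff. A minimal sketch of how such figures can be derived from predicted probabilities follows. It assumes that "detected" corresponds to sensitivity and that "avoided biopsies" corresponds to specificity, which is an interpretation rather than something the abstract states; the function name `stratify` and the toy data are hypothetical.

```python
def stratify(labels, probs, cutoff=0.51):
    """Return sensitivity and specificity when probs >= cutoff is called high risk."""
    tp = sum(1 for y, p in zip(labels, probs) if y == 1 and p >= cutoff)
    fn = sum(1 for y, p in zip(labels, probs) if y == 1 and p < cutoff)
    tn = sum(1 for y, p in zip(labels, probs) if y == 0 and p < cutoff)
    fp = sum(1 for y, p in zip(labels, probs) if y == 0 and p >= cutoff)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malignant and one benign case")
    return {
        "sensitivity": tp / (tp + fn),  # malignant nodules flagged as high risk
        "specificity": tn / (tn + fp),  # benign nodules kept below the cutoff
    }


# Hypothetical toy data: 1 = malignant, 0 = benign.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_probs = [0.95, 0.70, 0.55, 0.40, 0.60, 0.30, 0.20, 0.10]
print(stratify(toy_labels, toy_probs))
```

Sweeping `cutoff` over a grid and inspecting both values is one simple way to see the trade-off that a clinical utility curve summarizes.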

Development of a Machine Learning-Based Predictive Model for Lung Metastasis in Patients With Ewing Sarcoma.

Background: This study aimed to develop and validate machine learning (ML)-based prediction models for lung metastasis (LM) in patients with Ewing sarcoma (ES), and to deploy the best model as an open-access web tool. Methods: We retrospectively analyzed data from the Surveillance, Epidemiology, and End Results (SEER) database from 2010 to 2016 and from four medical institutions to develop and validate predictive models for LM in patients with ES. Patient data from the SEER database were used as the training group (n = 929). Using demographic and clinicopathologic variables, six ML-based models for predicting LM were developed and internally validated using 10-fold cross-validation. All ML-based models were subsequently externally validated using multicenter data from four medical institutions (the validation group, n = 51). The predictive power of the models was evaluated by the area under the receiver operating characteristic curve (AUC). The best-performing model was used to produce an online tool for clinicians to identify ES patients at risk of lung metastasis, to improve decision making and optimize individual treatment. Results: The study cohort consisted of 929 patients from the SEER database and 51 patients from multiple medical centers, a total of 980 ES patients. Of these, 175 (18.8%) had lung metastasis. Multivariate logistic regression analysis identified survival time, T-stage, N-stage, surgery, and bone metastasis as independent predictive factors of LM. The AUC values of the six predictive models ranged from 0.585 to 0.705. The Random Forest (RF) model (AUC = 0.705) using 4 variables was identified as the best predictive model of LM in ES patients and was employed to construct an online tool to assist clinicians in optimizing patient treatment (https://share.streamlit.io/liuwencai123/es_lm/main/es_lm.py). Conclusions: Machine learning models were found to have utility for predicting LM in patients with Ewing sarcoma, and the RF model gave the best performance. The accessibility of the predictive model as a web-based tool offers clear opportunities for improving the personalized treatment of patients with ES.

3.9
Zone 3

Frontiers in medicine 2022
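
The internal validation described in the abstract above relies on 10-fold cross-validation. As a rough illustration of the splitting step only, the sketch below partitions sample indices into k folds so that each sample is held out exactly once. The helper `k_fold_indices` and the fixed seed are hypothetical; the study's actual splitting procedure (for example, whether it was stratified by outcome) is not given in the abstract.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # near-equal fold sizes
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 929 matches the training-group size quoted in the abstract.
fold_sizes = [len(test) for _, test in k_fold_indices(929, k=10)]
print(min(fold_sizes), max(fold_sizes), sum(fold_sizes))
```

In each round a model would be fitted on `train` and scored on `test`, and the k scores averaged; with an outcome present in under a fifth of patients, a stratified split is often preferred so that every fold contains positive cases.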

Development and Validation of a Novel Clinical Prediction Model to Predict the Risk of Lung Metastasis from Ewing Sarcoma for Medical Human-Computer Interface.

Background: This study aimed to establish and validate a quantitative and visual prognosis model of Ewing sarcoma (E.S.) via a nomogram. The model was developed to predict the risk of lung metastasis (L.M.) in patients with E.S., providing a practical tool to support clinical diagnosis and treatment. Methods: Data of all patients diagnosed with Ewing sarcoma between 2010 and 2016 were retrospectively retrieved from the Surveillance, Epidemiology, and End Results (SEER) database. A training dataset was built from the enrolled cohorts (n = 929). Predictive factors for L.M. were identified based on the results of multivariable logistic regression analyses. A nomogram model and a web calculator were constructed based on those key predictors. A multicenter dataset from four medical institutions was established for model validation (n = 51). The predictive ability of the nomogram model was evaluated by the receiver operating characteristic (ROC) curve and calibration plot. Decision curve analysis (DCA) was applied to assess the usefulness of the nomogram model in clinical practice. Results: Five independent factors, namely survival time, surgery, tumor (T) stage, node (N) stage, and bone metastasis, were identified to develop the nomogram model. Internal and external validation indicated significant predictive discrimination: the area under the ROC curve (AUC) was 0.769 (95% CI: 0.740 to 0.795) in the training cohort and 0.841 (95% CI: 0.712 to 0.929) in the validation cohort. Calibration plots and DCA showed excellent performance of the nomogram model with great clinical utility. Conclusions: In this study, a nomogram model was constructed and validated to predict L.M. in patients with E.S. and deployed as a medical human-computer interface: a web calculator (https://drliwenle.shinyapps.io/LMESapp/). This practical tool could help clinicians make better decisions and provide precise prognosis and treatment for patients with E.S.

Zone 3

Computational intelligence and neuroscience 2022
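
The decision curve analysis mentioned in the abstract above compares a model against default strategies using net benefit, commonly defined at a threshold probability p_t as TP/n - (FP/n) * p_t / (1 - p_t). The sketch below evaluates that standard formula for a model and for the "treat all" strategy. The function names and the toy data are hypothetical and are not taken from the study.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * (threshold / (1 - threshold))


def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating everyone, regardless of predicted risk."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * (threshold / (1 - threshold))


# Hypothetical toy data: 1 = lung metastasis, 0 = no lung metastasis.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_probs = [0.95, 0.70, 0.55, 0.40, 0.60, 0.30, 0.20, 0.10]
print(net_benefit(toy_labels, toy_probs, 0.5))
print(net_benefit_treat_all(toy_labels, 0.5))
```

A decision curve plots these quantities across a range of thresholds; a model is considered useful where its curve lies above both "treat all" and "treat none" (net benefit of zero).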