Abstract:
Objective To explore the clinical value of a combined model based on clinical and imaging signs and a CT radiomics score for predicting spread through air spaces (STAS) in lung adenocarcinoma, and to visualize and interpret the model using the Shapley additive explanations (SHAP) method.
Methods Clinical data, non-contrast CT images, and surgical pathology data of 176 patients with lung adenocarcinoma treated at Huainan Yangguang Xinkang Hospital from November 2020 to March 2024 were retrospectively analyzed. Of these patients, 96 were male and 80 were female, with a mean age of (62.1±10.8) years. The patients were divided into a STAS-positive group and a STAS-negative group, and were randomly assigned to training and validation groups at a ratio of 7∶3 using the random number table method. CT radiomics features were extracted from the tumor body and from the regions extending 3 mm and 5 mm around the tumor, and the radiomics score was calculated using elastic-net Logistic regression analysis. Normally distributed measurement data were compared between groups using the independent-samples t-test, non-normally distributed measurement data were compared using the Mann-Whitney U test, and count data were compared using the chi-square test. Univariate and multivariate Logistic regression analyses were performed to identify the clinical and radiological features related to STAS. The clinical-CT radiomics model was constructed using Logistic regression and extreme gradient boosting (XGBoost) algorithms, and its predictive efficacy was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC). The SHAP method was used to visualize the weights of the relevant features in the model's predictions.
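The AUC evaluation described in the Methods can be illustrated with a minimal sketch. It computes the AUC as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, and attaches a percentile bootstrap 95% confidence interval. The bootstrap is an assumption made here for illustration; the abstract does not state how the reported intervals were obtained. The function names `auc` and `bootstrap_auc_ci` and the toy labels and scores are hypothetical and are not study data.

```python
import random


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contains a single class, so the AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data for illustration only: 1 = STAS positive, 0 = STAS negative.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.55, 0.4, 0.6, 0.3, 0.35, 0.2, 0.1, 0.45]

point = auc(labels, scores)
low, high = bootstrap_auc_ci(labels, scores)
print(f"AUC = {point:.3f} (95% CI: {low:.3f}-{high:.3f})")
```

The pairwise definition is equivalent to the area under the ROC curve and is adequate for small cohorts; for larger data sets a rank-based implementation would be faster.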
Results Compared with the single-region radiomics models, the radiomics model of the tumor body plus the 5 mm region around the tumor had better diagnostic efficacy for STAS in lung adenocarcinoma, with AUCs of 0.831 (95%CI: 0.751–0.897) in the training group and 0.842 (95%CI: 0.719–0.941) in the validation group. Univariate and multivariate Logistic regression analyses indicated that the lobulation sign and the air cyst sign were independent risk factors for STAS in lung adenocarcinoma. The AUCs of the clinical model in the training and validation groups were 0.851 (95%CI: 0.784–0.911) and 0.821 (95%CI: 0.703–0.922), respectively. The clinical-CT radiomics model (Combined_XGboost.model), which combined the radiomics score, lobulation sign, and air cyst sign, had good evaluation efficacy for STAS in lung adenocarcinoma, with AUCs of 0.902 (95%CI: 0.842–0.949) in the training group and 0.896 (95%CI: 0.802–0.968) in the validation group. The SHAP method visually displayed the interactions between the features of the Combined_XGboost.model, and the case analysis showed that the model's evaluation results were consistent with the histopathological results.
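The idea behind the SHAP case analysis can be sketched for a model with the same three inputs (radiomics score, lobulation sign, air cyst sign). The example below computes exact Shapley values by enumerating all feature coalitions and averaging over a background set. It is not the tree-based SHAP algorithm normally used with XGBoost, and the Logistic model, its coefficients, the background cases, and the example case are all hypothetical stand-ins rather than the Combined_XGboost.model or study data.

```python
from itertools import combinations
from math import exp, factorial

FEATURES = ["rad_score", "lobulation", "air_cyst"]


def model(x):
    """Hypothetical stand-in model: Logistic function with made-up coefficients."""
    z = -2.0 + 1.5 * x["rad_score"] + 1.0 * x["lobulation"] + 1.2 * x["air_cyst"]
    return 1.0 / (1.0 + exp(-z))


def coalition_value(case, background, subset):
    """Mean prediction when features in `subset` come from the case
    and all other features come from each background sample in turn."""
    total = 0.0
    for ref in background:
        mixed = {f: (case[f] if f in subset else ref[f]) for f in FEATURES}
        total += model(mixed)
    return total / len(background)


def shapley_values(case, background):
    """Exact Shapley values: weighted marginal gains over all coalitions."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        contrib = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                with_f = coalition_value(case, background, set(subset) | {f})
                without_f = coalition_value(case, background, set(subset))
                contrib += weight * (with_f - without_f)
        phi[f] = contrib
    return phi


# Toy background cases and one case to explain (illustration only).
background = [
    {"rad_score": -0.5, "lobulation": 0, "air_cyst": 0},
    {"rad_score": 0.2, "lobulation": 1, "air_cyst": 0},
    {"rad_score": 0.8, "lobulation": 0, "air_cyst": 1},
    {"rad_score": -1.0, "lobulation": 0, "air_cyst": 0},
]
case = {"rad_score": 1.2, "lobulation": 1, "air_cyst": 1}

phi = shapley_values(case, background)
base = sum(model(ref) for ref in background) / len(background)
for name, value in phi.items():
    print(f"{name}: {value:+.3f}")
print(f"base = {base:.3f}, prediction = {model(case):.3f}")
```

The contributions sum to the difference between the prediction for the case and the mean prediction over the background set, which is the property that lets a per-case SHAP plot be read as a breakdown of the predicted probability.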
Conclusion The clinical-CT radiomics model constructed with the XGBoost machine learning algorithm, together with visual analysis of its features using the SHAP method, can help clinicians make precise and intuitive preoperative evaluations of STAS in lung adenocarcinoma.