Abstract:
Objective To construct a prediction model for the optimal initial dose of levothyroxine sodium tablets in patients with differentiated thyroid cancer (DTC) after 131I treatment by machine learning.
Methods A total of 266 DTC patients (78 males (male group) and 188 females (female group), aged 18 to 70 (40.0±11.5) years) who received 131I treatment followed by thyroid stimulating hormone (TSH) suppressive therapy in the Department of Nuclear Medicine, Konggang Hospital, Tianjin Cancer Hospital between November 2019 and November 2020, and who eventually achieved the TSH suppression target, were retrospectively analyzed. A total of 16 clinical and biochemical indicators and data related to thyroid function were obtained, and each dose adjustment of levothyroxine sodium tablets was recorded at the patients' regular post-discharge follow-up visits. The indicators most strongly correlated with the optimal dose of levothyroxine sodium tablets were screened by calculating random forest feature importance. Several regression models were then constructed with the selected indicators as independent variables and the optimal dose of levothyroxine sodium tablets as the dependent variable. The most accurate model was selected by cross-validation. Categorical data were compared between the male and female groups using the chi-square test of independence.
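The model-selection step described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the study's implementation: the abstract does not define "accuracy" for a regression model, so the `tolerance` parameter (a prediction counts as correct when it falls within that many μg/d of the observed dose) is hypothetical, as are all function names. Candidate regressors are passed in as `fit` functions that return a `predict` callable.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_hit_rate(fit, X, y, tolerance, k=5, seed=0):
    """Fraction of held-out patients whose predicted dose lies within
    `tolerance` of the observed dose (hypothetical accuracy definition)."""
    hits = 0
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        # Fit on the training folds only, then score the held-out fold.
        predict = fit([X[i] for i in train], [y[i] for i in train])
        hits += sum(abs(predict(X[i]) - y[i]) <= tolerance for i in fold)
    return hits / len(y)


def select_model(candidates, X, y, tolerance, k=5, seed=0):
    """Return (name, hit_rate) of the candidate with the best CV hit rate."""
    scores = {name: cv_hit_rate(fit, X, y, tolerance, k, seed)
              for name, fit in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


def fit_mean(X_train, y_train):
    """Baseline candidate: always predict the mean training dose."""
    m = mean(y_train)
    return lambda x: m
```

In practice the candidates would be real regressors (for example a support vector regressor) wrapped to match the `fit` signature; the baseline above only shows the expected shape.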
Results Body weight, height, body mass index, body surface area, hemoglobin, mean corpuscular volume, systolic/diastolic blood pressure, postoperative parathyroid hormone, and the target-achieving dose of levothyroxine sodium tablets in the 266 patients were (68.4±12.9) kg, (165.8±12.8) cm, (24.6±3.5) kg/m², (1.9±0.2) m², (140.1±19.1) g/L, (88.6±5.5) fL, (125.7±18.9) mm Hg/(82.7±12.4) mm Hg, (4.1±2.2) pmol/L, and (117.0±30.1) μg/d, respectively. Feature selection identified six indicators that were strongly correlated with the levothyroxine sodium tablets dose. In descending order of importance, these were body surface area, body weight, hemoglobin, height, body mass index, and age, with mean random forest importance values of 0.2805, 0.1951, 0.1315, 0.1252, 0.1080, and 0.0819, respectively. The support vector regression (SVR) model with a radial basis function kernel had the highest accuracy in cross-validation (53.4%, 142/266). This accuracy was significantly higher than the first-attempt success rate of empirical dosing of levothyroxine sodium tablets (15.0%, 40/266). The SVR model's accuracy was also compared between subgroups defined by sex: it was significantly higher in the female group than in the male group (60.6% (114/188) vs. 35.9% (28/78); χ²=13.51, P<0.001).
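The subgroup comparison above is a chi-square test of independence on a 2×2 table (sex by correct/incorrect prediction). A minimal sketch follows, using only the counts reported in the abstract. It assumes the uncorrected Pearson statistic; the abstract does not say whether a continuity correction was applied, and the function name is illustrative.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]], with the p-value for 1 degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Rows: female, male. Columns: dose predicted correctly, not correctly.
stat, p = chi_square_2x2(114, 188 - 114, 28, 78 - 28)
print(f"chi-square = {stat:.2f}, P = {p:.5f}")
```

Under these assumptions the statistic should land close to the reported value, and the p-value falls below 0.001.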
Conclusions The SVR model constructed by machine learning is expected to improve the first-attempt success rate of levothyroxine sodium tablet dosing in DTC patients after 131I treatment. This improvement is more pronounced in female patients and helps to improve the quality of life and prognosis of DTC patients.