DataSheet_2_Machine Learning Algorithms Identify Clinical Subtypes and Cancer in Anti-TIF1γ+ Myositis: A Longitudinal Study of 87 Patients.docx (14.72 kB)

posted on 2022-02-14, 05:04 authored by Lijuan Zhao, Shuoshan Xie, Bin Zhou, Chuyu Shen, Liya Li, Weiwei Pi, Zhen Gong, Jing Zhao, Qi Peng, Junyu Zhou, Jiaqi Peng, Yan Zhou, Lingxiao Zou, Liang Song, Honglin Zhu, Hui Luo

Anti-TIF1γ antibodies are a class of myositis-specific antibodies (MSAs) and are closely associated with adult cancer-associated myositis (CAM). The heterogeneity of anti-TIF1γ+ myositis is poorly explored, and at first diagnosis it is unknown whether an anti-TIF1γ+ patient will develop cancer. Here, we aimed to explore the subtypes of anti-TIF1γ+ myositis and to construct machine learning classifiers that predict cancer in anti-TIF1γ+ patients from clinical features.


A cohort of 87 anti-TIF1γ+ patients was enrolled and followed up at Xiangya Hospital from June 2017 to June 2021. Sankey diagrams showing the temporal relationships between anti-TIF1γ+ myositis and cancer were plotted. Elastic net and random forest were used to select and rank the most important variables. A multidimensional scaling (MDS) plot and hierarchical cluster analysis were used to identify subtypes of anti-TIF1γ+ myositis, and clinical characteristics were compared among the subtypes. Machine learning classifiers were constructed to predict cancer in anti-TIF1γ+ myositis, and their performance was evaluated with receiver operating characteristic (ROC) curves.
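The subtype-discovery step described above can be illustrated with a minimal sketch. It is not the authors' code: the data are synthetic, the proximity is assumed to be the usual random forest proximity (the fraction of trees in which two samples fall in the same leaf), classical (Torgerson) MDS is swapped in for whatever MDS variant the study used, and the helper names `rf_proximity` and `classical_mds` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


def rf_proximity(forest, X):
    """Fraction of trees in which each pair of samples shares a leaf."""
    leaves = forest.apply(X)  # shape: (n_samples, n_trees)
    same_leaf = leaves[:, None, :] == leaves[None, :, :]
    return same_leaf.mean(axis=2)


def classical_mds(dist, k=2):
    """Classical (Torgerson) MDS: embed a distance matrix in k dimensions."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    vals, vecs = np.linalg.eigh(gram)
    top = np.argsort(vals)[::-1][:k]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))


# Synthetic stand-in for 87 patients x 14 selected variables (NOT study data).
X, y = make_classification(
    n_samples=87, n_features=14, n_informative=6, random_state=0
)

forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
prox = rf_proximity(forest, X)

# Turn similarities into distances for MDS and clustering.
dist = 1.0 - prox
np.fill_diagonal(dist, 0.0)

coords = classical_mds(dist, k=2)  # 2-D coordinates for an MDS plot
tree = linkage(squareform(dist, checks=False), method="average")
subtypes = fcluster(tree, t=3, criterion="maxclust")  # at most 3 clusters
```

The choice of average linkage and the cut at three clusters are assumptions made for illustration; the abstract reports three subtypes but does not state the linkage method.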


Forty-seven (54.0%) anti-TIF1γ+ patients had cancer, 78.7% of whom were diagnosed within 0.5 years of the myositis diagnosis. The fourteen variables that contributed most to distinguishing patients with cancer from those without were selected and used to calculate the similarities (proximities) between samples and to construct the machine learning classifiers. The top 10 were disease duration, percentage of lymphocytes (L%), percentage of neutrophils (N%), neutrophil-to-lymphocyte ratio (NLR), sex, C-reactive protein (CRP), shawl sign, arthritis/arthralgia, V-neck sign, and anti-PM-Scl75 antibodies. Anti-TIF1γ+ myositis patients could be clearly separated into three clinical subtypes, corresponding to low, intermediate, and high cancer risk. The machine learning classifiers [random forest, support vector machines (SVM), extreme gradient boosting (XGBoost), elastic net, and decision tree] predicted cancer well in anti-TIF1γ+ myositis patients. In particular, the prediction accuracy of random forest was >90%, and the decision tree highlighted disease duration, NLR, and CRP as critical clinical parameters for recognizing cancer patients.
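As a rough illustration of how a classifier of this kind might be scored, the sketch below derives NLR as N% divided by L%, fits a random forest, and computes a cross-validated area under the ROC curve. Everything here is an assumption made for demonstration: the values are randomly generated rather than patient data, only four of the fourteen variables are mimicked, and five-fold cross-validation is not stated in the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 87

# Synthetic labels and features (NOT study data): 1 = cancer, 0 = no cancer.
y = rng.integers(0, 2, size=n)
neut_pct = np.clip(rng.normal(65 + 8 * y, 8, size=n), 30, 95)   # N%
lymph_pct = np.clip(rng.normal(25 - 6 * y, 6, size=n), 3, 60)   # L%
crp = rng.gamma(2.0, 4.0 + 4.0 * y)                             # CRP

# Neutrophil-to-lymphocyte ratio, derived from the two percentages.
nlr = neut_pct / lymph_pct

X = np.column_stack([neut_pct, lymph_pct, nlr, crp])

# Out-of-fold predicted probabilities, so the AUC is not computed on
# the same samples the model was fitted to.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
proba = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]

auc = roc_auc_score(y, proba)
print(f"cross-validated ROC AUC on synthetic data: {auc:.2f}")
```

The printed AUC reflects only the synthetic data generator and says nothing about the performance reported in the study.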


Anti-TIF1γ+ myositis can be separated into three distinct subtypes with low, intermediate, and high risk of cancer. Machine learning classifiers built from clinical characteristics perform well in predicting cancer in anti-TIF1γ+ myositis and can help physicians choose appropriate cancer screening programs.