DataSheet_1_Deep Learning Model for Classifying Metastatic Epidural Spinal Cord Compression on MRI.docx (2.13 MB)
Download file

DataSheet_1_Deep Learning Model for Classifying Metastatic Epidural Spinal Cord Compression on MRI.docx

Download (2.13 MB)
posted on 04.05.2022, 04:13 by James Thomas Patrick Decourcy Hallinan, Lei Zhu, Wenqiao Zhang, Desmond Shi Wei Lim, Sangeetha Baskar, Xi Zhen Low, Kuan Yuen Yeong, Ee Chin Teo, Nesaretnam Barr Kumarakulasinghe, Qai Ven Yap, Yiong Huak Chan, Shuxun Lin, Jiong Hao Tan, Naresh Kumar, Balamurugan A. Vellayappan, Beng Chin Ooi, Swee Tian Quek, Andrew Makmur

Metastatic epidural spinal cord compression (MESCC) is a devastating complication of advanced cancer. A deep learning (DL) model for automated MESCC classification on MRI could aid earlier diagnosis and referral.


To develop a DL model for automated classification of MESCC on MRI.

Materials and Methods

Patients with known MESCC diagnosed on MRI between September 2007 and September 2017 were eligible. MRI studies with instrumentation, suboptimal image quality, and non-thoracic regions were excluded. Axial T2-weighted images were utilized. The internal dataset split was 82% and 18% for training/validation and test sets, respectively. External testing was also performed. Internal training/validation data were labeled using the Bilsky MESCC classification by a musculoskeletal radiologist (10-year experience) and a neuroradiologist (5-year experience). These labels were used to train a DL model utilizing a prototypical convolutional neural network. Internal and external test sets were labeled by the musculoskeletal radiologist as the reference standard. For assessment of DL model performance and interobserver variability, test sets were labeled independently by the neuroradiologist (5-year experience), a spine surgeon (5-year experience), and a radiation oncologist (11-year experience). Inter-rater agreement (Gwet’s kappa) and sensitivity/specificity were calculated.


Overall, 215 MRI spine studies were analyzed [164 patients, mean age = 62 ± 12(SD)] with 177 (82%) for training/validation and 38 (18%) for internal testing. For internal testing, the DL model and specialists all showed almost perfect agreement (kappas = 0.92–0.98, p < 0.001) for dichotomous Bilsky classification (low versus high grade) compared to the reference standard. Similar performance was seen for external testing on a set of 32 MRI spines with the DL model and specialists all showing almost perfect agreement (kappas = 0.94–0.95, p < 0.001) compared to the reference standard.


A DL model showed comparable agreement to a subspecialist radiologist and clinical specialists for the classification of malignant epidural spinal cord compression and could optimize earlier diagnosis and surgical referral.