Research Fellow, PhD Candidate, Mayo Clinic Rochester, Rochester, Minnesota, United States
Introduction: Chordomas are rare tumors that arise from notochordal remnants and account for 1% to 4% of all primary bone malignancies, most often in the clivus and sacrum. Despite margin-negative resection and postoperative radiotherapy, chordomas often recur. Further, immunohistochemical (IHC) markers have not been assessed as predictors of chordoma recurrence. We aimed to identify the IHC markers predictive of long-term postoperative chordoma recurrence (≥1 year) using multiple tree-based machine learning (ML) algorithms.
Methods: We reviewed the records of patients treated for clival and spinal chordomas between January 2017 and June 2021 across the Mayo Clinic enterprise. Demographics, type of treatment, histopathology, and other relevant clinical factors were abstracted from each patient record. Decision tree and random forest classifiers were trained and then tested on unseen data, using an 80/20 split, to predict long-term recurrence.
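The train/test workflow described in the Methods can be sketched as follows. This is a minimal illustration, not the study's code: it assumes scikit-learn, uses a synthetic dataset from `make_classification` in place of the abstracted patient records, and picks `max_depth=3`, the stratified split, and the random seeds arbitrarily. Only the 80/20 split, the two classifier types, and the 500-tree forest come from the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cohort (NOT study data): 151 rows, 8 features,
# binary label playing the role of "long-term recurrence yes/no".
X, y = make_classification(
    n_samples=151, n_features=8, n_informative=4, random_state=0
)

# 80/20 split; stratification is an assumption, used here to keep the
# class balance similar in the training and held-out sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    # max_depth=3 is an arbitrary choice to keep the tree readable.
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    # 500 trees, matching the forest size reported in the abstract.
    "random_forest": RandomForestClassifier(n_estimators=500, random_state=0),
}

# Fit each model on the training portion and score it on the unseen 20%.
accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = accuracy_score(y_test, model.predict(X_test))

# Impurity-based importances: one non-negative weight per feature, summing to 1.
importances = models["random_forest"].feature_importances_
```

Ranking `importances` is one common way to read off which predictors a random forest relies on. Whether the study used impurity-based or permutation-based importance is not stated in the abstract.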
Results: One hundred fifty-one patients diagnosed with and treated for chordomas were identified: 58 chordomas of the clivus, 48 of the mobile spine, and 45 of sacrococcygeal origin. Patients diagnosed with cervical chordomas were the oldest among all groups (58 ± 14 years; p = 0.009). Most patients were male (N = 91; 60.3%) and white (N = 139; 92.1%). Most patients underwent surgical resection with or without radiation therapy (N = 129; 85.4%). Subtotal resection (STR) followed by radiation therapy was the most common treatment modality (N = 51; 33.8%), followed by gross total resection (GTR) with subsequent radiation therapy (N = 43; 28.5%). In the multivariate analysis, S100 and pan-cytokeratin were associated with increased odds of postoperative recurrence (OR = 3.67; CI = [1.09, 12.42]; p = 0.03). In the decision tree analysis, clinical follow-up >1897 days was found in 37% of encounters and carried a 90% chance of being classified as recurrence (accuracy = 77%). Random forest analysis (n = 500 trees) showed that patient age, type of surgical treatment, tumor location, S100, pan-cytokeratin, and epithelial membrane antigen (EMA) were the factors predicting long-term recurrence.
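As a reading aid for the odds ratio above, the sketch below shows how a standard error, z statistic, and two-sided p-value can be recovered from an odds ratio and its confidence interval. It assumes a 95% Wald interval that is symmetric on the log-odds scale, which the abstract does not state, and the helper name `wald_from_ci` is hypothetical. The resulting p-value can be compared against the reported p = 0.03.

```python
import math


def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover SE, z, and two-sided p from an OR and its CI.

    Assumes a Wald interval, exp(log(OR) +/- z_crit * SE); z_crit=1.96
    corresponds to an assumed 95% confidence level.
    """
    log_or = math.log(odds_ratio)
    # The interval spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Values reported in the Results for S100 and pan-cytokeratin.
se, z, p = wald_from_ci(3.67, 1.09, 12.42)
```

The wide interval reflects the small number of events typical of a rare tumor cohort: the lower bound sits close to 1, so the estimate is compatible with anything from a marginal to a large increase in odds.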
Conclusion: Immunohistochemical and clinicopathological variables combined with tree-based ML tools identified patients' recurrence patterns with an accuracy of 77%. S100, pan-cytokeratin, and EMA were the immunohistochemical drivers of recurrence. These findings suggest that ML algorithms can be useful for analyzing and predicting outcomes in rare conditions with small sample sizes.