Cardio Oncology
Sumayyah Ishfaq
Medical Student
University of Leeds
Slough, United Kingdom
Hunain Shiwani, MD
Clinical Research Fellow
University College London and Barts Heart Centre, United Kingdom
Zuhayr Siddiqui, MD
Doctor
Mid Yorkshire Hospitals NHS Trust, United Kingdom
Vinton Cheng, MD, PhD
Oncologist
Leeds Institute of Medical Research, United Kingdom
Nigel Artis, MD
Consultant Cardiologist
Mid Yorkshire Hospitals NHS Trust, United Kingdom
Dilip Oswal, MD
Consultant Radiologist
Mid Yorkshire Hospitals NHS Trust, United Kingdom
Charlotte Manisty
Consultant Cardiologist
University College London and Barts Heart Centre
London, England, United Kingdom
James C. Moon, MD
Clinical Director, Imaging
Barts Heart Centre and UCL
London, England, United Kingdom
Rhodri Davies, MD, PhD
Associate Clinical Professor
University College London
London, England, United Kingdom
Peter P. Swoboda, PhD
Consultant Cardiologist & Senior Lecturer
University of Leeds
Leeds, England, United Kingdom
Background:
Patients treated with trastuzumab for breast cancer undergo serial surveillance of left ventricular ejection fraction (LVEF) to identify chemotherapy-induced cardiotoxicity. Most guidelines recommend stopping chemotherapy if LVEF drops by >10% to <50% (FDA definition) (1). Artificial intelligence (AI)-based cardiovascular magnetic resonance (CMR) assessment of LVEF may offer clinical advantages because of its high reproducibility and fast analysis times. We aimed to evaluate the accuracy of two AI models in the early identification of patients with chemotherapy-induced cardiotoxicity.
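The threshold described above can be written as a simple rule. The following is a minimal sketch, assuming the ">10% drop" is an absolute fall in LVEF percentage points from baseline; the function name and the example values are hypothetical and are not taken from the study.

```python
def meets_fda_cardiotoxicity(baseline_lvef: float, followup_lvef: float) -> bool:
    """Return True if LVEF fell by more than 10 points to below 50%.

    Assumption: the drop is measured in absolute LVEF percentage
    points relative to the baseline scan.
    """
    drop = baseline_lvef - followup_lvef
    return drop > 10.0 and followup_lvef < 50.0


# Hypothetical values, for illustration only.
print(meets_fda_cardiotoxicity(62.0, 48.0))  # 14-point drop to 48%: True
print(meets_fda_cardiotoxicity(55.0, 49.0))  # only a 6-point drop: False
print(meets_fda_cardiotoxicity(70.0, 58.0))  # 12-point drop but still >= 50%: False
```

Because both conditions depend on the measured LVEF, a systematic offset between measurement techniques can change which patients cross the threshold.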
Methods:
We studied 77 patients starting trastuzumab therapy between February 2018 and October 2020. Patients underwent cine CMR for LV volumes every three months during chemotherapy. Scans were reported clinically, including LVEF, by an SCMR Level 3 accredited reporter. Patients were followed up for clinical events, including cessation of chemotherapy due to cardiotoxicity. The scans were re-analysed using two AI methods: AutoQ AI segmentation (QMass 8.1.122.4) in the commercially available Medis Suite (4.0.24.4), with no human viewing or correction (AI 1), and a clinically validated AI model developed at University College London (UCL) (2,3) (AI 2). LVEF results were compared using correlation and Bland-Altman analyses.
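For readers unfamiliar with the comparison methods named above, the sketch below shows one common way to compute a Bland-Altman bias with 95% limits of agreement and a squared Pearson correlation for paired LVEF measurements. It is an illustration only: the function names and the paired values are hypothetical, and the study's actual statistical software and settings are not stated in the abstract.

```python
from statistics import mean, stdev


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Bias is the mean of a - b; the limits are bias +/- 1.96 times the
    sample standard deviation of the paired differences.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def r_squared(a, b):
    """Return the square of the Pearson correlation coefficient."""
    mean_a, mean_b = mean(a), mean(b)
    sxy = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    sxx = sum((x - mean_a) ** 2 for x in a)
    syy = sum((y - mean_b) ** 2 for y in b)
    return sxy ** 2 / (sxx * syy)


# Hypothetical paired LVEF values (%), not study data.
clinical = [61.0, 58.0, 66.0, 55.0, 63.0]
ai_model = [63.0, 61.0, 68.0, 59.0, 64.0]

bias, lower, upper = bland_altman(ai_model, clinical)
print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
print(f"r^2={r_squared(ai_model, clinical):.2f}")
```

A positive bias here means the first technique reads higher on average than the second; the width of the limits of agreement reflects how far individual paired readings can differ.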
Results:
A total of 367 scans were analysed, with patients having an average of 3 ± 1.1 scans each. Median [IQR] baseline LVEF was 61.0% [58.0-66.0] for the clinical expert, 61.2% [56.6-66.7] for AI 1 and 64.1% [59.2-69.4] for AI 2.
Baseline EF: There was good correlation between techniques (Clinical vs AI 2 r² = 0.64; Clinical vs AI 1 r² = 0.61; AI 1 vs AI 2 r² = 0.74; all P < 0.001), with AI 2 measuring higher than either expert analysis (2.8% higher) or AI 1 (2.0% higher).
Peak Drop in EF: Correlation was weaker for measuring interval change and was strongest between the two AI methods (Clinical vs AI 2 r² = 0.30; Clinical vs AI 1 r² = 0.35; AI 1 vs AI 2 r² = 0.49; all P < 0.001). Agreement was highest between the AI techniques (0.65% difference). Expert analysis identified a smaller maximum fall in LVEF than either AI 2 (1.5% lower) or AI 1 (2.2% lower). The FDA definition of cardiotoxicity was met in 18 (23%) patients by any method: 6 (8%), 13 (17%) and 9 (12%) patients by clinical expert, AI 1 and AI 2 analyses, respectively. However, only 3 (4%) patients fulfilled the definition by all three methods and 7 (9%) by at least two methods.
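The cardiotoxicity counts above can be cross-checked with the inclusion-exclusion principle, which links the per-method counts to the numbers flagged by any, by all three, and by at least two methods. The sketch below is only an arithmetic consistency check on the reported totals; the function and parameter names are illustrative.

```python
def flagged_by_at_least_two(n_a: int, n_b: int, n_c: int,
                            n_any: int, n_all: int) -> int:
    """Number of patients flagged by two or more of three methods.

    Inclusion-exclusion for three sets gives
        n_any = n_a + n_b + n_c - pairwise_sum + n_all,
    where pairwise_sum is the total of the three pairwise overlaps.
    """
    pairwise_sum = n_a + n_b + n_c - n_any + n_all
    # A patient flagged by all three methods appears in all three
    # pairwise overlaps, so remove the two extra counts per such patient.
    return pairwise_sum - 2 * n_all


# Reported counts: clinical expert 6, AI 1 13, AI 2 9, any method 18, all three 3.
print(flagged_by_at_least_two(6, 13, 9, 18, 3))  # 7
```

The result matches the 7 patients reported as meeting the definition by at least two methods, so the reported counts are internally consistent.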
Conclusion: Human and AI LVEF measurements disagree, and each technique carries its own bias. Assessment of interval change is imperfect, and there is a marked lack of agreement in defining cardiotoxicity. Assessing cardiotoxicity requires measurement of precision (for interval change) and technique-specific reference ranges and definitions of impairment. Given these biases, applying a single LVEF < 50% threshold to all techniques is inappropriate.