Purpose: A dedicated contrast-enhanced mammography (CEM) BI-RADS lexicon is being developed, building on the BI-RADS terms used for breast MRI. The purpose of this study is to evaluate the applicability of these terms to CEM in clinical practice by assessing inter-reader agreement.
Materials and Methods: All CEM exams performed at a single academic institution between December 2014 and December 2020 were retrospectively reviewed for cases of biopsy-proven cancer. A single cancer was analyzed per CEM exam. Cases were excluded if the cancer had been recently sampled (within 1 year of CEM), was within 2 cm of a recent biopsy site, or was within the same quadrant as recent surgery. Three radiologists reviewed the CEM exams for the new CEM BI-RADS criteria. Inter-reader variability for these criteria was evaluated using Cohen’s kappa and percentage agreement.
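As an illustration of the two agreement statistics named above, the following is a minimal sketch of percentage agreement and Cohen's kappa for one pair of readers. The function names and the ten example ratings are hypothetical and are not study data. Cohen's kappa is defined for two raters; how the pairwise values were combined across the three readers is not stated here, so the sketch covers a single pair only.

```python
from collections import Counter


def percent_agreement(ratings_a, ratings_b):
    """Fraction of cases on which two readers chose the same category."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two readers."""
    n = len(ratings_a)
    p_o = percent_agreement(ratings_a, ratings_b)  # observed agreement
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    categories = counts_a.keys() | counts_b.keys()
    # Agreement expected by chance, from each reader's marginal frequencies.
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        return float("nan")  # both readers used one identical category; kappa is undefined
    return (p_o - p_e) / (1 - p_e)


# Hypothetical recombined finding types from two readers (not study data).
reader_a = ["mass"] * 6 + ["NME"] * 4
reader_b = ["mass"] * 5 + ["NME", "mass"] + ["NME"] * 3

agreement = percent_agreement(reader_a, reader_b)
kappa = cohen_kappa(reader_a, reader_b)
print(f"percent agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

In this toy example the readers agree on 8 of 10 cases, but because both call "mass" 60% of the time, about half of that agreement is expected by chance, which is why kappa is noticeably lower than the raw percentage.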
Results: A total of 112 CEM exams in 112 patients were included (mean age, 58.1 ± 9.3 years). There was moderate inter-reader agreement for background parenchymal enhancement (BPE; k=0.44) and tissue density (k=0.60), substantial agreement for visibility on low-energy imaging (k=0.69), and near-perfect agreement for visibility on recombined imaging (k=0.81).
There was substantial agreement for low-energy finding type (k=0.70; mass, asymmetry, distortion, calcifications) and for recombined finding type (k=0.76; mass, non-mass enhancement [NME], enhancing asymmetry). Moderate agreement was noted for more granular descriptors, including extent of enhancement relative to the low-energy finding (k=0.51), mass internal enhancement pattern (IEP; k=0.46), and NME distribution pattern (k=0.49).
All other granular-level descriptors showed at most fair agreement, including conspicuity of findings relative to BPE (k=0.31), mass shape (k=0.27), and mass margin (k=0.39). Agreement was worse than chance for NME IEP (k=-0.08). Most of the overlap in description was between heterogeneous and clumped enhancement.
There were too few cases of enhancing asymmetry to determine agreement of the internal enhancement pattern.
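The qualitative labels used above (fair, moderate, substantial, near perfect) match the widely used Landis and Koch bands for kappa. The sketch below assumes that scale, which the abstract does not name explicitly, and maps the reported kappa values to those bands. The function name `agreement_label` and the dictionary keys are illustrative.

```python
def agreement_label(kappa):
    """Map a kappa value to a Landis and Koch agreement band (assumed scale)."""
    if kappa < 0:
        return "poor (less than chance)"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper_bound, label in bands:
        if kappa <= upper_bound:
            return label
    raise ValueError("kappa cannot exceed 1")


# Kappa values as reported in the Results.
reported = {
    "BPE": 0.44,
    "tissue density": 0.60,
    "visibility, low-energy": 0.69,
    "visibility, recombined": 0.81,
    "low-energy finding type": 0.70,
    "recombined finding type": 0.76,
    "extent of enhancement": 0.51,
    "mass IEP": 0.46,
    "NME distribution": 0.49,
    "conspicuity vs. BPE": 0.31,
    "mass shape": 0.27,
    "mass margin": 0.39,
    "NME IEP": -0.08,
}

for descriptor, k in reported.items():
    print(f"{descriptor}: k={k:+.2f} -> {agreement_label(k)}")
```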
Conclusion: Initial results suggest moderate-to-substantial agreement for macro-level descriptors including BPE, tissue density, low-energy finding type, and recombined-imaging finding type. More granular-level descriptors of shape, margins, and IEP showed only fair agreement, with the worst agreement shown for NME IEP. Additional assessment of, and training in, the new CEM BI-RADS lexicon is warranted to determine the value of granular descriptors for classifying lesions seen on recombined imaging.
Clinical Relevance Statement: Understanding inter-reader variability in the use of the new CEM BI-RADS lexicon helps identify terminology that may either require additional training or have limited value in clinical practice.