Washington University School of Medicine, St. Louis, MO
Benjamin Rogers, MD1, Barrett Rogers1, Marzio Frazzoni, MD2, Edoardo Savarino, MD3, Sabine Roman, MD, PhD4, Daniel Sifrim, MD, PhD5, C. Prakash Gyawali, MD1; 1Washington University School of Medicine, St. Louis, MO; 2Baggiovara Hospital, Modena, Emilia-Romagna, Italy; 3University of Padua, Padua, Veneto, Italy; 4Edouard Herriot Hospital, Lyon, Auvergne, France; 5Wingate Institute, London, England, United Kingdom
Introduction: Automated analysis overcounts reflux episodes on pH impedance testing. Accurate reviewer interpretation is critical, but there is a paucity of data on reflux episode characteristics associated with inter-reviewer agreement.

Methods: Ambulatory pH impedance studies from randomly selected GERD patients at 4 international institutions were each analyzed by 5 reviewers following a consensus meeting (Wingate consensus) held to define identification standards for reflux episodes. Episodes were defined as retrograde propagation of a ≥50% impedance drop lasting ≥4 seconds in the distal two impedance channels. Metadata from the pH impedance studies were exported to a dedicated software tool designed to compare episode times between reviewers, episode by episode, within a ±7.5 s window. Acidity of reflux events, acid clearance time (ACT), and bolus clearance time (BCT) were compared between episodes identified by all reviewers and those identified by automated analysis or by one to four reviewers.

Results: Among the 19 patients (median age 52 years, 78.9% female), median acid exposure time (AET) was 11.7% (interquartile range [IQR], 8-15%), and all had typical symptoms. A median of 92 reflux episodes (IQR 79-119) were identified per patient (acidic: 73, IQR 66-113; non-acidic: 10, IQR 7-20). A total of 979 episodes were identified by all reviewers. Most of these (89.1%) were acidic, a higher proportion than with automated analysis (1719 episodes, 78.4% acidic) or among single-reviewer-identified episodes (71.6% acidic; p≤0.009 for each comparison, Fig. 1). While 277 more episodes (a 28.3% increase) were identified when one outlier value was discarded, the proportion of acid to non-acid events was similar (87.1% acidic, p=0.3). Three-reviewer consensus added a further 202 episodes (16.1% increase, 86.1% acidic), while two-reviewer consensus added another 195 episodes (13.4% increase, 84.5% acidic). The proportion of acidic episodes among those identified by only one reviewer (342 episodes, 71.6% acidic) was significantly lower than among those identified by ≥2 reviewers (p<0.001 for each comparison, Fig. 1). Although BCT did not differ between groups (p=0.84), longer ACT was associated with higher agreement among reviewers (p<0.001 across groups, Fig. 2).

Discussion: Inter-reviewer variability in reflux episode identification on pH impedance monitoring exists even among expert reviewers. Reflux episodes containing acid are significantly more likely to be identified by multiple reviewers.
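The episode-by-episode comparison described in the Methods can be illustrated with a minimal sketch. The abstract does not describe how the dedicated software tool works internally, so everything below is an assumption made for illustration: the function names (`group_episodes`, `agreement_counts`), the use of episode onset times in seconds, and the greedy single-pass grouping rule. Only the 7.5 s tolerance is taken from the text.

```python
from collections import Counter

WINDOW_S = 7.5  # tolerance from the abstract: marks within 7.5 s are treated as one episode


def group_episodes(marks_by_reviewer, window_s=WINDOW_S):
    """Greedily cluster reviewer marks (onset times in seconds) into episodes.

    marks_by_reviewer: dict mapping a reviewer label to a list of onset times.
    Returns a list of (anchor_time, set_of_reviewers) tuples.
    Hypothetical helper, not the tool used in the study.
    """
    # Pool all marks and sort by time so clusters can be built in one pass.
    pooled = sorted(
        (t, reviewer)
        for reviewer, times in marks_by_reviewer.items()
        for t in times
    )
    episodes = []
    for t, reviewer in pooled:
        if episodes:
            anchor, reviewers = episodes[-1]
            # Same episode if the mark lies within the window of the cluster's
            # first mark and this reviewer has not already contributed to it.
            if t - anchor <= window_s and reviewer not in reviewers:
                reviewers.add(reviewer)
                continue
        # Otherwise this mark starts a new candidate episode.
        episodes.append((t, {reviewer}))
    return episodes


def agreement_counts(episodes):
    """Count episodes by the number of reviewers who identified them."""
    return Counter(len(reviewers) for _, reviewers in episodes)


# Made-up onset times for three hypothetical reviewers (not study data).
example = {
    "A": [100.0, 400.0, 900.0],
    "B": [103.0, 405.0],
    "C": [98.5, 1500.0],
}
counts = agreement_counts(group_episodes(example))
```

With these made-up marks, `counts` holds one episode seen by three reviewers, one seen by two, and two seen by a single reviewer. Anchoring each cluster to its earliest mark is only one possible design; a real tool might instead match against a reference reviewer or use episode midpoints.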
Figure 1. Comparison of proportions of acid and non-acid reflux episodes as identified by reviewer combinations
Figure 2. Bolus clearance time and acid clearance time compared between reflux episodes identified by reviewer combinations
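The percentage increases reported in the Results can be reproduced by dividing each added episode count by the running total at the previous, stricter agreement level. The short sketch below shows that arithmetic; the function name `incremental_increase` is hypothetical, and reading the three steps as successively looser consensus levels is an interpretation of the abstract rather than something it states explicitly.

```python
def incremental_increase(base, increments):
    """Return (added, percent_increase, cumulative_total) for each step.

    Each percentage is taken relative to the running total before the step,
    which is the reading that matches the figures quoted in the abstract.
    """
    rows = []
    running = base
    for added in increments:
        pct = 100.0 * added / running
        running += added
        rows.append((added, pct, running))
    return rows


# Counts quoted in the abstract: 979 unanimous episodes, then 277, 202 and 195
# additional episodes as the consensus requirement is relaxed step by step.
for added, pct, total in incremental_increase(979, [277, 202, 195]):
    print(f"+{added} episodes: {pct:.1f}% increase, cumulative {total}")
```

Under this reading, the cumulative count of episodes identified by at least two reviewers is 1653, separate from the 342 episodes identified by a single reviewer.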