Medical Student, Weill Cornell Medicine, New York, New York
Rationale: Electronic health record (EHR) data has the potential to be a cost-effective and efficient data source for pediatric epilepsy clinical research. This research requires accurately identifying well-defined cohorts, which can be achieved through the creation of computable phenotypes (CPs). A computable phenotype is a “clinical condition, characteristic, or set of clinical features that can be determined solely from the data in EHRs and ancillary data sources and does not require chart review or interpretation by a clinician.” International Classification of Diseases, Tenth Revision (ICD-10) codes are accurate for some epilepsies, but there are substantial gaps for many sub-phenotypes. For example, little is known about the accuracy of (1) diagnostic codes for important epilepsy risk factors like neonatal hypoxic ischemic encephalopathy (HIE); (2) new ICD-10 codes for juvenile myoclonic epilepsy (JME); and (3) codes for clinically important concepts like “treatment-resistant epilepsy”. We developed and evaluated the performance of computable phenotypes for these three pediatric epilepsy-related conditions using ICD codes and other clinical concepts.
Methods: This was a retrospective study from a single tertiary care institution. We identified gold standard cohorts of patients with neonatal HIE, JME, and pediatric treatment-resistant epilepsy via existing registries and review of neurology clinical notes. Then, from the EHR, we extracted diagnostic and procedure codes for all children with a diagnosis of epilepsy and seizures. We used these codes to develop and evaluate CPs for each condition. We calculated sensitivity as the proportion of the gold standard cohort that each CP identified, and positive predictive value (PPV) based on chart review. We iteratively modified CPs to maximize performance (sensitivity × PPV) and selected the “best-performing” CP for each condition.
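The selection criterion described in the Methods (sensitivity × PPV) can be illustrated with a minimal Python sketch. The class name `CPResult`, the helper `best_cp`, and every count below are hypothetical and chosen only for illustration; they are not the study's code or data, and the sketch assumes PPV is computed from a chart-reviewed sample of the patients each CP flags.

```python
from dataclasses import dataclass


@dataclass
class CPResult:
    """Counts for one candidate computable phenotype (hypothetical structure)."""
    name: str
    gold_standard_total: int    # patients in the gold standard cohort
    gold_standard_flagged: int  # gold standard patients the CP identified
    reviewed_total: int         # CP-flagged charts that were reviewed
    reviewed_confirmed: int     # reviewed charts confirmed as true cases

    @property
    def sensitivity(self) -> float:
        # Share of the gold standard cohort captured by the CP
        return self.gold_standard_flagged / self.gold_standard_total

    @property
    def ppv(self) -> float:
        # Share of reviewed CP-flagged charts confirmed on chart review
        return self.reviewed_confirmed / self.reviewed_total

    @property
    def performance(self) -> float:
        # Selection criterion named in the Methods: sensitivity x PPV
        return self.sensitivity * self.ppv


def best_cp(candidates):
    """Return the candidate with the highest sensitivity x PPV product."""
    return max(candidates, key=lambda cp: cp.performance)


# Hypothetical counts for illustration only; not data from the study.
candidates = [
    CPResult("cp_a", 100, 40, 50, 45),  # sensitivity 0.40, PPV 0.90
    CPResult("cp_b", 100, 80, 50, 10),  # sensitivity 0.80, PPV 0.20
]

best = best_cp(candidates)
print(best.name, round(best.sensitivity, 3), round(best.ppv, 3),
      round(best.performance, 3))
```

With these made-up counts, the product favors the high-PPV candidate even though the other candidate has twice the sensitivity, which shows how a single combined score can trade one metric against the other.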
Results: Gold standard cohorts included 47 patients with neonatal HIE, 247 patients with JME, and 99 patients with treatment-resistant epilepsy. Table 1 lists the CPs. For JME, the best-performing CP had poor sensitivity (32%, 95% CI [26-38]) but very high PPV (90.5% [83-95]). Of note, a second CP had higher sensitivity (71.7% [65-77]) but very low PPV (6% [1.3-16.5]). For neonatal HIE, the best-performing CP had both high sensitivity (95.7% [85-99]) and high PPV (100% [95-100]). For treatment-resistant epilepsy, the best-performing CP had a sensitivity of 86.9% [79-93] and a PPV of 69.6% [60-79]. Of note, other CPs had higher PPV (71.7% [61-80] and 87.3% [79-93]).
Conclusions: Computable phenotypes yielded low (JME), medium (treatment-resistant epilepsy), and high (HIE) accuracy, demonstrating the heterogeneity of success in using administrative data to identify cohorts important for pediatric epilepsy research. The CP for neonatal HIE has a relatively high sensitivity and PPV and may be sufficient for automated identification of patients for research cohorts. The accuracy of the treatment-resistant epilepsy algorithm was fair, but it would likely need to be supplemented by chart review in order to create a sufficiently valid cohort. The poor accuracy for JME suggests that a CP may not be feasible for this condition and that other, more sophisticated or labor-intensive techniques may be needed for cohort identification (e.g., natural language processing or manual chart review).
Funding: American Academy of Neurology Medical Student Research Scholarship