MP70-07: High-performing Scalable Extraction of Elements of Urinary Tract Dilation Classification Using Natural Language Processing Algorithms for Neonatal Ultrasound Reports
Introduction: The urinary tract dilation (UTD) classification system provides an objective assessment of hydronephrosis in young children. However, radiology reports often lack specificity regarding UTD categorization, which complicates both clinical management and research. We sought to extract UTD elements and classification from early postnatal ultrasound (US) reports using state-of-the-art natural language processing (NLP) machine learning algorithms.
Methods: Radiology records from our institution were reviewed to identify infants aged 0-90 days undergoing early ultrasound for antenatal UTD. The reports and images were reviewed by study team physicians to establish the ground truth for UTD classification and its components (primary outcome). Data were split into training and testing sets at an 85:15 ratio. Bidirectional Encoder Representations from Transformers (BERT) language representation models were used as the basis of the classification model. The model was fine-tuned with a head consisting of a dropout layer, a fully connected layer and a binary classification layer. Model performance was evaluated on the out-of-sample testing set.
Results: A total of 2500 early (0-90 days) US reports were included. Model performance on the out-of-sample testing dataset was high (AUC > 0.9 for all UTD features). Accuracy, F1 score and AUC for UTD features in the out-of-sample testing dataset are listed in Table 1.
Conclusions: By applying state-of-the-art deep NLP neural networks, we developed a high-performing and scalable solution to extract UTD components from unstructured ultrasound reports. This can potentially help reduce miscommunication and facilitate large-scale computer vision research for children with hydronephrosis.
Source of Funding: Internal
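The 85:15 split and the three reported metrics (accuracy, F1 score, AUC) can be illustrated with a minimal sketch. This is not the study's code: the function names, the random seed, the 0.5 decision threshold and the toy labels below are all assumptions made for illustration, and the abstract does not state how the split was randomized or how thresholds were chosen.

```python
import random


def split_train_test(items, test_fraction=0.15, seed=0):
    """Shuffle a copy of `items` and return (train, test).

    `test_fraction=0.15` mirrors the 85:15 ratio in the abstract;
    the seed is an arbitrary assumption for reproducibility.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Binary F1 = 2*TP / (2*TP + FP + FN), with 1 as the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a
    random negative (ties count as 0.5). Quadratic in sample size,
    which is fine for a small illustration."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical report IDs standing in for the 2500 US reports.
    train_ids, test_ids = split_train_test(range(2500))
    print(len(train_ids), len(test_ids))

    # Toy labels and model scores for one binary UTD feature (made up).
    y_true = [1, 0, 1, 1, 0, 0]
    scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
    y_pred = [int(s >= 0.5) for s in scores]  # assumed 0.5 threshold
    print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), roc_auc(y_true, scores))
```

An exact 85:15 split of 2500 reports would give 2125 training and 375 testing reports; the abstract does not report the actual counts. In a setting like the one described, each UTD feature would be scored separately in this way on the held-out testing set.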