9 - Using Natural Language Processing to Improve a Database’s Search Tool

Friday, May 6, 2022

12:30 PM – 1:30 PM CT

Presenter and Author(s)

Erin Keenan

Medical Research Librarian III
VisualDx
Irondequoit, New York

Author(s)

Thomas Baumgartner

Medical Data Librarian
VisualDx

Background: Our medical database’s front-end search tool needs improvement. A better search tool will increase the usability of the database by making our content easier for external users to discover. This is an interdepartmental project between software engineers and librarians. Our objective is to increase the number of successful user queries by updating the search tool with a form of augmented intelligence (AI) called natural language processing (NLP). NLP enables the search tool to interpret user queries much as a human would. We purchased access to an AI development program to facilitate this work.

Description: The first task was to prepare a large dataset of user search queries on which the AI development program would train. The librarians manually annotated, or ‘tagged’, each term in a query with the correct attribute. For example, for the query “diabetic with SOB 2 weeks”, the attributes are: diabetic [medical history], SOB [finding], 2 weeks [timing].
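One way to picture a single annotated query is as a list of labeled spans with character offsets, a layout many NLP training tools accept. The sketch below is illustrative only: the `Annotation` class, the `annotate` helper, and the offset-based layout are assumptions made for this example, not the format used by the AI development program in the project.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Annotation:
    """One tagged term in a query (hypothetical structure)."""
    text: str       # the term exactly as it appears in the query
    attribute: str  # the label assigned by the librarian
    start: int      # character offset where the term begins
    end: int        # character offset just past the term


def annotate(query: str, tagged_terms: List[Tuple[str, str]]) -> List[Annotation]:
    """Locate each tagged term in the query, in order, and record its span."""
    annotations = []
    cursor = 0
    for text, attribute in tagged_terms:
        start = query.find(text, cursor)
        if start == -1:
            raise ValueError(f"term {text!r} not found in query {query!r}")
        end = start + len(text)
        annotations.append(Annotation(text, attribute, start, end))
        cursor = end  # continue searching after this term
    return annotations


# The example query and attributes from the description above.
query = "diabetic with SOB 2 weeks"
example = annotate(
    query,
    [
        ("diabetic", "medical history"),
        ("SOB", "finding"),
        ("2 weeks", "timing"),
    ],
)

for a in example:
    print(f"{a.text} [{a.attribute}] chars {a.start}-{a.end}")
```

Untagged words such as “with” simply carry no annotation in this layout.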

The software engineers used this annotated dataset to create a new AI model, which forms the base of the new search tool. The second task was to test the model’s performance. The librarians developed another dataset of simple search queries, each matched with its desired result. For example, the search query “45 yo with psoriasis” is matched with the result “Psoriasis Adult Topic Page”. The software engineers used this test dataset to continue refining the AI model and improving the accuracy of the search.
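A test dataset of this kind can be scored with a simple loop that runs each query and compares the top result to the desired one. The following is a minimal sketch under stated assumptions: `evaluate` and `stub_search` are hypothetical names, the stub stands in for the real search tool, and the second test case (including its page title) is invented purely to show how a mismatch is reported.

```python
from typing import Callable, List, Tuple

TestCase = Tuple[str, str]           # (query, desired result)
Failure = Tuple[str, str, str]       # (query, desired result, actual result)


def evaluate(search: Callable[[str], str],
             test_cases: List[TestCase]) -> Tuple[float, List[Failure]]:
    """Return the fraction of queries whose result matches, plus the misses."""
    failures = []
    for query, expected in test_cases:
        actual = search(query)
        if actual != expected:
            failures.append((query, expected, actual))
    if not test_cases:
        return 0.0, failures
    passed = len(test_cases) - len(failures)
    return passed / len(test_cases), failures


def stub_search(query: str) -> str:
    """Placeholder for the real search tool (assumption for illustration)."""
    if "psoriasis" in query.lower():
        return "Psoriasis Adult Topic Page"
    return "No result"


test_cases = [
    # From the description above.
    ("45 yo with psoriasis", "Psoriasis Adult Topic Page"),
    # Hypothetical case, invented for this sketch only.
    ("12 yo with psoriasis", "Psoriasis Child Topic Page"),
]

accuracy, failures = evaluate(stub_search, test_cases)
print(f"accuracy: {accuracy:.0%}")
for query, expected, actual in failures:
    print(f"MISS {query!r}: wanted {expected!r}, got {actual!r}")
```

Keeping the list of misses alongside the accuracy figure gives the engineers concrete queries to examine when adjusting the model.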

Conclusion: This project is ongoing. The expected final outcome is that users will be able to search the database with natural language queries that match seamlessly with accurate content. To support this functionality, the new AI model will be able to read and interpret natural language.