Medical Librarian, College of American Pathologists, Buffalo Grove, Illinois
Objectives: Systematic reviews (SRs) are time-intensive, and artificial intelligence (AI) has the potential to reduce the time the SR process requires. Our objective is to determine whether the AI function of DistillerSR is comparable to conventional dual review by human subject matter experts during the title/abstract phase of the SR process.
Methods: This analytical comparative study evaluates the AI function of DistillerSR for the title/abstract review of references during SRs. We performed a retrospective review of two SRs that had been completed using our conventional method of dual review by human subject matter experts. To assess the equivalency of DistillerSR’s AI function, we created new projects using the same pool of references as the original projects and built an AI training set from the historical data of the original review. We then applied DistillerSR’s AI tool to the remaining references and compared the outcomes of the two review methods. We calculated the sensitivity and specificity of the AI function and assessed the similarities and differences in the results obtained.
Results: To determine sensitivity and specificity, we compared the articles included at the title/abstract screening stage with the final pool of evidence included in the published guidelines, because the final evidence pool is of primary importance to our subject matter experts. In this pilot project, the sensitivity calculated for DistillerSR’s AI tool does not currently meet the acceptability threshold we require before incorporating the tool into our guideline development process.
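The comparison described above can be illustrated with a minimal sketch. It assumes that each reference has a unique identifier, that the AI decisions and the final evidence pool are available as sets of those identifiers, and that the final evidence pool is treated as the reference standard. The function name `screening_metrics` and the reference IDs are hypothetical and chosen for illustration only; they are not data from this study and do not reflect DistillerSR's own interface.

```python
def screening_metrics(all_refs, ai_included, final_included):
    """Sensitivity and specificity of title/abstract screening decisions,
    using the final evidence pool as the reference standard."""
    all_refs = set(all_refs)
    # Ignore any identifiers that are not part of the screened pool.
    ai_included = set(ai_included) & all_refs
    final_included = set(final_included) & all_refs

    tp = len(ai_included & final_included)             # kept by AI, in final pool
    fn = len(final_included - ai_included)             # missed by AI, in final pool
    fp = len(ai_included - final_included)             # kept by AI, not in final pool
    tn = len(all_refs - ai_included - final_included)  # excluded by AI, not in final pool

    # Guard against empty denominators (e.g. an empty final pool).
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return {"tp": tp, "fn": fn, "fp": fp, "tn": tn,
            "sensitivity": sensitivity, "specificity": specificity}


# Illustrative values only (hypothetical reference IDs, not study data).
all_refs = range(1, 11)          # ten screened references
ai_included = {1, 2, 3, 4, 5}    # references the AI tool kept at title/abstract
final_included = {1, 2, 6}       # references in the final evidence pool

metrics = screening_metrics(all_refs, ai_included, final_included)
print(metrics)
```

In this framing, a false negative is a reference that reached the final evidence pool but was excluded by the AI tool at title/abstract screening, which is why sensitivity is the metric held to an acceptability threshold.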
Conclusions: We plan further review of the individual projects to identify possible reasons for the discrepancies. The projects vary in size and scope, which, along with the literature search itself, may significantly affect the tool’s performance. AI could potentially be combined with human review to make the most of the tool’s capabilities while retaining the expertise of human subject matter experts, but it cannot be used as a single reviewer.