Database Credentialed Access

Trial Files: Keeping Physicians Up to Date on RCTs Using Large Language Models

Katarina Zorcic, Bryant Lim, Michael Fralick

Published: May 7, 2024. Version: 1.0.0


When using this resource, please cite:
Zorcic, K., Lim, B., & Fralick, M. (2024). Trial Files: Keeping Physicians Up to Date on RCTs Using Large Language Models (version 1.0.0). Health Data Nexus. https://doi.org/10.57764/e79t-fj74.

Abstract

Physicians and medical trainees depend on randomized clinical trials (RCTs) to inform patient management, but keeping up with new studies is challenging given the high demands of clinical and academic responsibilities. Open-access large language models can potentially generate short, descriptive, and reliable summaries of clinical trials, informing users about emerging literature efficiently. In 2023, we launched Trial Files, a biweekly newsletter that summarizes the results of RCTs published in premier clinical journals: New England Journal of Medicine, Annals of Internal Medicine, Journal of the American Medical Association (JAMA), JAMA Internal Medicine, and Lancet. For each RCT, we identify the unique trial registration identifier and then use the associated registry's application programming interface (API) to extract additional data points for the study. The data from MEDLINE and official registries are then stored in the cloud. Trial summaries for the newsletter are generated using large language models accessed through OpenAI APIs. The Trial Files newsletter has since expanded to additional subspecialties in response to interest from practitioners, including thrombosis, nephrology, cardiology, and inflammatory bowel disease (IBD).

 


Background

For physicians, keeping up with emerging clinical trials is essential for practicing evidence-based medicine, but also challenging given their academic and clinical responsibilities. Here, we explored the potential of open-access large language models to generate concise and reliable summaries of clinical trials to help medical professionals digest the medical literature more efficiently.


Methods

Abstracts of published randomized clinical trials from the New England Journal of Medicine (NEJM), Annals of Internal Medicine, Journal of the American Medical Association (JAMA), JAMA Internal Medicine, and Lancet were extracted from the MEDLINE database. Using the text-davinci-003 application programming interface (API; OpenAI), input prompts were optimized to generate abstract summaries. Summaries were then published in a biweekly newsletter (Trial Files, trialfiles.substack.com) targeting internal medicine physicians. Each summary was manually evaluated for accurate capture of key trial information.
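
As an illustration of the prompting step, the sketch below assembles a summarization prompt from a trial abstract. The template wording, the function name `build_summary_prompt`, and the `max_words` parameter are hypothetical; the optimized prompts used for Trial Files are not reproduced in this description, and the call to the OpenAI API is deliberately left out.

```python
def build_summary_prompt(abstract: str, max_words: int = 60) -> str:
    """Return a prompt asking a language model to summarize a trial abstract.

    The template is an assumption for illustration only; it is not the
    prompt used to produce the Trial Files summaries.
    """
    # Collapse line breaks and repeated spaces left over from extraction.
    cleaned = " ".join(abstract.split())
    if not cleaned:
        raise ValueError("abstract is empty")
    return (
        "Summarize the following randomized clinical trial abstract in at most "
        f"{max_words} words. State the population, intervention, comparator, "
        "and primary outcome.\n\n"
        f"Abstract: {cleaned}\n\n"
        "Summary:"
    )
```

The returned string would be passed as the prompt to whichever completion endpoint is in use; keeping prompt construction separate from the API call makes it easier to compare template variants.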


Data Description

The data is contained in one file. Each row (n=568) corresponds to one randomized controlled trial. The columns correspond to the respective data extracted from that randomized controlled trial. These are:

  • Title
  • Authors
  • Year of Publication
  • Blinding (single, double, etc.)
  • Phase
  • Condition of interest
  • Intervention group
  • Comparator group
  • Total sample size
  • Patient population
  • Primary outcome (intervention, comparator, overall)
  • Safety outcome
  • Conclusion
  • LLM summary
  • Sponsor
  • Acronym
  • Inclusion criteria
  • Exclusion criteria
  • DOI
  • Registry identifier
  • URL
  • Comments
  • Date Retrieved
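
A quick way to confirm that a downloaded copy matches this description is to compare its header against the list above. The sketch below assumes the file is a CSV whose header labels match the list (ignoring case and the parenthetical qualifiers); the actual file format, header spelling, and column order are not stated here and may differ.

```python
import csv

# Labels copied from the list above, without the parenthetical qualifiers.
# Whether the real header uses exactly these labels is an assumption.
EXPECTED_COLUMNS = [
    "Title", "Authors", "Year of Publication", "Blinding", "Phase",
    "Condition of interest", "Intervention group", "Comparator group",
    "Total sample size", "Patient population", "Primary outcome",
    "Safety outcome", "Conclusion", "LLM summary", "Sponsor", "Acronym",
    "Inclusion criteria", "Exclusion criteria", "DOI", "Registry identifier",
    "URL", "Comments", "Date Retrieved",
]


def missing_columns(csv_file) -> list[str]:
    """Return expected labels absent from the CSV header (case-insensitive)."""
    reader = csv.DictReader(csv_file)
    present = {name.strip().lower() for name in (reader.fieldnames or [])}
    return [col for col in EXPECTED_COLUMNS if col.lower() not in present]
```

Pass an open text-mode file handle (opened with `newline=""`, as the `csv` module recommends); an empty result means every listed column was found.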

Usage Notes

Short, descriptive, and reliable summaries of clinical trials can be found in the "LLM Summary" column.
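
For example, the summaries can be collected by trial title once the rows are loaded. This is a minimal sketch: the function name `summaries_by_title` is hypothetical, and the exact header strings ("Title" and the summary column) are assumptions to be adjusted to the file as downloaded.

```python
def summaries_by_title(rows, summary_column="LLM summary"):
    """Map each trial title to its non-empty summary text.

    `rows` is any iterable of dict-like records, such as the output of
    csv.DictReader. Column names are assumed, not confirmed by the source.
    """
    result = {}
    for row in rows:
        title = (row.get("Title") or "").strip()
        summary = (row.get(summary_column) or "").strip()
        # Skip rows lacking either field rather than storing blanks.
        if title and summary:
            result[title] = summary
    return result
```

If two rows share a title, the later row overwrites the earlier one; keying on the DOI or registry identifier instead would avoid that.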


Release Notes

v1.0: Initial release


Ethics

The authors declare no ethics concerns.


Conflicts of Interest

The author(s) have no conflicts of interest to declare.


Access

Access Policy:
Only credentialed users who sign the DUA can access the files.

License (for files):
Health Data Nexus Contributor Review Health Data License 1.0

Data Use Agreement:
T-CAIREM Data Use Agreement

Required training:
TCPS 2: CORE 2022

Discovery

DOI (version 1.0.0):
https://doi.org/10.57764/e79t-fj74

DOI (latest version):
https://doi.org/10.57764/z3vj-sm95

