Database Credentialed Access
Trial Files: Keeping Physicians Up to Date on RCTs Using Large Language Models
Katarina Zorcic , Bryant Lim , Michael Fralick
Published: May 7, 2024. Version: 1.0.0
When using this resource, please cite:
Zorcic, K., Lim, B., & Fralick, M. (2024). Trial Files: Keeping Physicians Up to Date on RCTs Using Large Language Models (version 1.0.0). Health Data Nexus. https://doi.org/10.57764/e79t-fj74.
Abstract
Physicians and medical trainees depend on randomized clinical trials (RCTs) to inform patient management, but keeping up with new studies is difficult given the demands of clinical and academic responsibilities. Open-access large language models can potentially generate short, descriptive, and reliable summaries of clinical trials, informing users of emerging literature efficiently. In 2023, we launched Trial Files, a biweekly newsletter that summarizes the results of RCTs published in premier clinical journals: the New England Journal of Medicine, Annals of Internal Medicine, the Journal of the American Medical Association (JAMA), JAMA Internal Medicine, and the Lancet. For each RCT, we identify the unique trial registration identifier and then use the associated registry's application programming interface (API) to extract additional data points for the study. The data from MEDLINE and the official registries are then stored in the cloud. Trial summaries for the newsletter are generated using large language models accessed through OpenAI APIs. Owing to interest from practitioners, the Trial Files newsletter has expanded to additional subspecialties, including thrombosis, nephrology, cardiology, and inflammatory bowel disease (IBD).
Background
For physicians, keeping up with emerging clinical trials is essential for practicing evidence-based medicine, but also challenging given their academic and clinical responsibilities. Here, we explored the potential of open-access large language models to generate concise and reliable summaries of clinical trials to help medical professionals digest the medical literature more efficiently.
Methods
Abstracts of published randomized clinical trials from the New England Journal of Medicine (NEJM), Annals of Internal Medicine, the Journal of the American Medical Association (JAMA), JAMA Internal Medicine, and the Lancet were extracted from the MEDLINE database. Input prompts for the text-davinci-003 model, accessed through the OpenAI application programming interface (API), were optimized to generate abstract summaries. Summaries were then published in a biweekly newsletter (Trial Files, trialfiles.substack.com) targeting internal medicine physicians. We manually evaluated whether the summaries accurately captured key trial information.
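The registration identifier step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' code. It assumes that identifiers appear in the abstract text in one of two common formats, ClinicalTrials.gov ("NCT" followed by eight digits) or ISRCTN ("ISRCTN" followed by eight digits). The function name `find_registry_ids` is hypothetical, and other registries would need additional patterns.

```python
import re

# Assumption: only two registry formats are handled here.
# ClinicalTrials.gov: "NCT" + 8 digits; ISRCTN: "ISRCTN" + 8 digits.
REGISTRY_ID_PATTERN = re.compile(r"\b(NCT\d{8}|ISRCTN\d{8})\b")


def find_registry_ids(abstract_text: str) -> list[str]:
    """Return unique registry identifiers in order of first appearance."""
    found: list[str] = []
    # Upper-case the text so that "nct01234567" is also recognized.
    for identifier in REGISTRY_ID_PATTERN.findall(abstract_text.upper()):
        if identifier not in found:
            found.append(identifier)
    return found
```

An identifier found this way could then be passed to the corresponding registry API to retrieve fields such as phase, sponsor, and eligibility criteria.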
Data Description
The data are contained in one file. Each row (n=568) corresponds to one randomized controlled trial, and each column holds one data element extracted for that trial. The columns are:
- Title
- Authors
- Year of Publication
- Blinding (single, double, etc.)
- Phase
- Condition of interest
- Intervention group
- Comparator group
- Total sample size
- Patient population
- Primary outcome (intervention, comparator, overall)
- Safety outcome
- Conclusion
- LLM summary
- Sponsor
- Acronym
- Inclusion criteria
- Exclusion criteria
- DOI
- Registry identifier
- URL
- Comments
- Date Retrieved
Usage Notes
Short, descriptive, and reliable summaries of clinical trials can be found in the "LLM summary" column.
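As a starting point for working with the file, the sketch below reads the rows and returns the summaries for trials whose condition matches a keyword. It makes several assumptions that should be checked against the downloaded file: that the file is a CSV with a header row, and that the header names match the column list above exactly ("Condition of interest", "LLM summary"). The function names are hypothetical.

```python
import csv
from typing import Iterable, TextIO

# Assumed header names, taken from the column list; verify against the file.
CONDITION_COLUMN = "Condition of interest"
SUMMARY_COLUMN = "LLM summary"


def read_trials(handle: TextIO) -> list[dict[str, str]]:
    """Read all rows from an open CSV file handle, one dict per trial."""
    return list(csv.DictReader(handle))


def summaries_for_condition(rows: Iterable[dict[str, str]], keyword: str) -> list[str]:
    """Return LLM summaries for trials whose condition contains the keyword."""
    needle = keyword.lower()
    return [
        row.get(SUMMARY_COLUMN) or ""
        for row in rows
        # Missing or empty condition cells are treated as non-matching.
        if needle in (row.get(CONDITION_COLUMN) or "").lower()
    ]
```

To use it, open the downloaded file with `open(path, newline="", encoding="utf-8")`, pass the handle to `read_trials`, and confirm that `len(rows)` equals the 568 trials stated above before filtering.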
Release Notes
v1.0: Initial release
Ethics
The authors declare no ethics concerns.
Conflicts of Interest
The author(s) have no conflicts of interest to declare.
Access
Access Policy:
Only credentialed users who sign the DUA can access the files.
License (for files):
Health Data Nexus Contributor Review Health Data License 1.0
Data Use Agreement:
T-CAIREM Data Use Agreement
Required training:
TCPS 2: CORE 2022
Discovery
DOI (version 1.0.0):
https://doi.org/10.57764/e79t-fj74
DOI (latest version):
https://doi.org/10.57764/z3vj-sm95
Files
To access the files, you must:
- be a credentialed user
- complete required training:
- TCPS 2: CORE 2022
- sign the data use agreement for the project