Resources


Database Credentialed Federated

MIMIC-IV-Ext-22MCTS: A 22 Millions-Event Temporal Clinical Time-Series Dataset with Relative Timestamp

Source: PhysioNet

Clinical risk prediction based on machine learning algorithms plays a vital role in modern healthcare. A crucial component in developing a reliable prediction model is a high-quality dataset with time series clinical events. In this work, we release…

clinical event annotation, mimic, time series, temporal annotation

Published: Sept. 29, 2025. Version: 1.0.0 | DOI: 10.13026/dkj6-r828


Database Credentialed Federated

Multimodal Clinical Monitoring in the Emergency Department (MC-MED)

Source: PhysioNet

Emergency department (ED) patients often present with undiagnosed complaints, and can exhibit rapidly evolving physiology. Therefore, data from continuous physiologic monitoring, in addition to the electronic health record, is essential to understan…

Published: Sept. 25, 2025. Version: 1.0.1 | DOI: 10.13026/wvyw-g663


Database Credentialed Federated

RadVLM Instruction Dataset

Source: PhysioNet

We release the RadVLM instruction dataset, a large-scale resource used to train the RadVLM model on diverse radiology tasks. The dataset contains 1,115,021 image–instruction pairs spanning five task families: (i) report generation from frontal…

vision-language models, medical ai, chest x-rays

Published: Sept. 25, 2025. Version: 1.0.0 | DOI: 10.13026/et5g-h222


Database Credentialed Federated

MIMIC-Ext-DrugDetection

Source: PhysioNet

This project shares a large, annotated drug detection dataset created from MIMIC-III/IV discharge summaries. The dataset was developed to address the challenge of identifying substance use behaviors in Electronic Health Records (EHRs), where critica…

prescription opioid misuse, cannabis, benzodiazepine misuse, ehr, injection drug use, heroin, methamphetamine, substance use, multi-label, cocaine, drug detection, polysubstance use, mimic-iv, mimic-iii, clinical notes

Published: Sept. 25, 2025. Version: 1.0.0 | DOI: 10.13026/0kyx-r485


Database Restricted Federated

EchoNext: A Dataset for Detecting Echocardiogram-Confirmed Structural Heart Disease from ECGs

Source: PhysioNet

This dataset contains a de-identified collection of 100,000 12-lead electrocardiograms (ECGs) with paired structural heart disease (SHD) labels derived from echocardiography, collected at Columbia University Irving Medical Center. Each ECG is provid…

aortic stenosis, deep learning, health equity, cardiovascular screening, valvular heart disease, heart failure, digital health, ecg, machine learning, ai model deployment, left ventricular dysfunction, artificial intelligence, clinical decision support, ai in healthcare, population health, electrocardiogram, transthoracic echocardiogram, structural heart disease

Published: Sept. 16, 2025. Version: 1.1.0 | DOI: 10.13026/3ykd-bf14


Database Credentialed Federated

RadGraph-XL: A Large-Scale Expert-Annotated Dataset for Entity and Relation Extraction from Radiology Reports

Source: PhysioNet

Radiology reports are essential for clinical care but pose challenges for automated processing due to their unstructured nature. Existing datasets like RadGraph-1.0 focus narrowly on chest X-rays (CXR), limiting their applicability. We introduce Rad…

Published: Sept. 12, 2025. Version: 1.0.0 | DOI: 10.13026/j8e7-pr22


Database Open Federated

Myocardial perfusion scintigraphy image database

Source: PhysioNet

This database provides a collection of myocardial perfusion scintigraphy images in DICOM format with all metadata and segmentations (masks) in NIfTI format. The images were obtained from patients undergoing scintigraphy examinations to investigate c…

myocardial perfusion systems modeling myocardial perfusion scintigraphy dicom metadata artificial intelligence ventricular walls coronary artery disease convolutional neural networks automated segmentation clinical diagnosis anonymization nifti

Published: Sept. 10, 2025. Version: 1.0.0 | DOI: 10.13026/ce2z-dw74


Database Restricted Federated

mcPHASES: A Dataset of Physiological, Hormonal, and Self-reported Events and Symptoms for Menstrual Health Tracking with Wearables

Source: PhysioNet

Individuals who menstruate are frequently led to believe that there is a standard menstrual cycle, typically characterized as 28 days in length with predictable and uniform patterns. This framing often emphasizes cycle dates as the only relevant met…

hormones menstrual health multimodal health wearables health sensor data womens health

Published: Sept. 10, 2025. Version: 1.0.0 | DOI: 10.13026/zx6a-2c81


Database Credentialed Federated

MIMIC-IV-Ext-Instr: A Dataset of 450K+ EHR-Grounded Instruction-Following Examples

Source: PhysioNet

Large language models (LLMs) have shown impressive capabilities in solving a wide range of tasks based on human instructions. However, developing a conversational AI assistant for electronic health record (EHR) data remains challenging due to the la…

medical question answering, large language models, instruction tuning

Published: Sept. 9, 2025. Version: 1.0.0 | DOI: 10.13026/e5bq-pr14


Database Restricted Federated

HYAMD High-Resolution Fundus Image Dataset for age related macular degeneration (AMD) Diagnosis

Source: PhysioNet

The Hillel Yaffe Age Related Macular Degeneration (HYAMD) longitudinal dataset comprises 1,560 Digital Fundus Images (DFIs) of 325 patients examined at the Hillel Yaffe Medical Center (Hadera, Israel, Helsinki approval number 0048-24-HYMC) provid…

Published: Sept. 9, 2025. Version: 1.0.0 | DOI: 10.13026/ydf1-z238