Resources


Database Restricted Federated

MIMIC-IV-Ext-DiReCT

Source: PhysioNet

Large language models (LLMs) have recently demonstrated remarkable capabilities across a broad spectrum of tasks and applications, including the medical field. Models like GPT-4 excel in medical question answering but encounter challenges in interpr…

Published: Jan. 21, 2025. Version: 1.0.0 | DOI: 10.13026/yf96-kc87


Database Restricted Federated

Community-Acquired Pneumonia, Endotypes and Phenotypes (NACef): Prospective, observational cohort study of Translational Medicine

Source: PhysioNet

Community-Acquired Pneumonia (CAP) remains a prominent infectious process associated with elevated in-hospital morbidity and mortality rates. Through the exploration of phenotypes, endotypes, and biomarkers, it becomes feasible to identify individua…

Published: Jan. 21, 2025. Version: 1.0.0 | DOI: 10.13026/m71c-9345


Database Restricted Federated

Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information

Source: PhysioNet

The human voice contains complex acoustic markers which have been linked to important health conditions including dementia, mood disorders, and cancer. When viewed as a biomarker, voice is a promising characteristic to measure as it is simple to col…

bridge2ai voice

Published: Jan. 17, 2025. Version: 1.1 | DOI: 10.13026/249v-w155


Database Restricted Federated

A database of hand kinematics, high-density sEMG of forearm and wrist for motion intent recognition

Source: PhysioNet

Surface electromyography (sEMG) signals reflect spinal motor neuron activities and can be used as intuitive inputs for human-machine interaction (HMI) via movement intent recognition. The motor neuron potentials of far-field (wrist) and near-field (…

Published: Jan. 17, 2025. Version: 1.0.0 | DOI: 10.13026/ch3e-c195


Database Open Federated

SensSmartTech database of cardiovascular signals synchronously recorded by an electrocardiograph, phonocardiograph, photoplethysmograph and accelerometer

Source: PhysioNet

The SensSmartTech database comprises cardiovascular signals synchronously recorded by electrocardiograph (ECG), phonocardiograph (PCG), photoplethysmograph (PPG) and accelerometer (ACC) at heart rates at rest and after activity (HRs). It is composed…

Published: Dec. 19, 2024. Version: 1.0.0 | DOI: 10.13026/fy9p-n277


Database Credentialed Federated

MIMIC-IV-Ext-GPT-3_5-Generated-Discharge-Summaries-for-Low-Resource-Codes

Source: PhysioNet

This dataset comprises 9,606 synthetic discharge summaries generated by GPT-3.5 based on combinations of ICD-10 code descriptions associated with real discharge summaries in MIMIC-IV. As part of the generation process, GPT-3.5 was also tasked to cod…

icd coding, data augmentation, large language model

Published: Dec. 16, 2024. Version: 1.0.0 | DOI: 10.13026/09ng-2614


Database Restricted Federated

Endoscapes2023, A Critical View of Safety and Surgical Scene Segmentation Dataset for Laparoscopic Cholecystectomy

Source: PhysioNet

Minimally invasive image-guided surgery heavily relies on vision. Deep learning models for surgical video analysis can support surgeons in visual tasks such as assessing the critical view of safety (CVS) in laparoscopic cholecystectomy, potentially …

surgical safety, computer assisted interventions, semantic segmentation, surgical data science, medical imaging analysis

Published: Dec. 11, 2024. Version: 1.0.0 | DOI: 10.13026/czwq-jh81


Database Credentialed Federated

CovIdentify Dataset

Source: PhysioNet

This dataset supports the study "A method for intelligent allocation of diagnostic testing by leveraging data from commercial wearable devices: a case study on COVID-19," which developed an Intelligent Testing Allocation (ITA) method. The …

Published: Nov. 25, 2024. Version: 1.0.0 | DOI: 10.13026/ncq1-vp79


Database Credentialed Federated

Northwestern ICU (NWICU) database

Source: PhysioNet

Retrospective medical data collection is essential for advancing patient care, offering insights and supporting the development of health technology. The Medical Information Mart for Intensive Care (MIMIC)-III database has been instrumental in provi…

Published: Nov. 19, 2024. Version: 0.1.0 | DOI: 10.13026/s84w-1829


Database Credentialed Federated

MS-CXR: Making the Most of Text Semantics to Improve Biomedical Vision-Language Processing

Source: PhysioNet

We release a new dataset, MS-CXR, with locally-aligned phrase grounding annotations by board-certified radiologists to facilitate the study of complex semantic modelling in biomedical vision–language processing. The MS-CXR dataset provides 116…

localization, phrase grounding, vision-language processing, chest x-ray

Published: Nov. 15, 2024. Version: 1.1.0 | DOI: 10.13026/9g2z-jg61