Resources


Database Credentialed Federated

FDTooth: Intraoral Photographs and Cone-Beam Computed Tomography Images for Fenestration and Dehiscence Detection

Source: PhysioNet

FDTooth is a comprehensive dataset designed for the automated detection of fenestration and dehiscence (FD) in anterior teeth, combining intraoral photographs and corresponding cone-beam computed tomography (CBCT) images from 241 patients aged 9 to …

Published: May 5, 2025. Version: 1.0.0 | DOI: 10.13026/v9xk-dy61


Database Credentialed Federated

MeDiSumQA: Patient-Oriented Question-Answer Generation from Discharge Letters

Source: PhysioNet

While increasing patients' access to medical documents improves medical care, this benefit is limited by varying health literacy levels and complex medical terminology. Large language models (LLMs) offer solutions by simplifying medical information.…

Published: May 5, 2025. Version: 1.0.0 | DOI: 10.13026/f566-h049


Database Open Federated

Minute level step counts and physical activity data from the National Health and Nutrition Examination Survey (NHANES) 2011-2014

Source: PhysioNet

The National Health and Nutrition Examination Survey (NHANES) is a nationally representative study that collects demographic, socioeconomic, dietary, and health-related information from about 5,000 Americans annually. In Wave G (2011-2012) and Wave H (20…

steps accelerometry physical activity nhanes

Published: May 5, 2025. Version: 1.0.1 | DOI: 10.13026/9n0r-tv02


Database Credentialed Federated

Medical Expert Annotations of Unsupported Facts in Doctor-Written and LLM-Generated Patient Summaries

Source: PhysioNet

Large language models in healthcare can generate informative patient summaries while reducing the documentation workload of healthcare professionals. However, these models are prone to producing hallucinations, that is, generating unsupported inform…

Published: April 30, 2025. Version: 1.0.1 | DOI: 10.13026/gedc-j464


Database Restricted Federated

Swiss-Mammo: A physician-written, synthetic dataset of German mammography reports

Source: PhysioNet

This dataset, Swiss-Mammo, contains 28 manually constructed German mammography reports, each paired with an English translation. The reports are stratified across BI-RADS categories 0 through 6, with three reports per category. All reports were manu…

mammography radiology structured reporting bi-rads

Published: April 30, 2025. Version: 1.0.0 | DOI: 10.13026/mrg5-ja22


Database Restricted Federated

DREAMT: Dataset for Real-time sleep stage EstimAtion using Multisensor wearable Technology

Source: PhysioNet

Sleep is an intrinsic part of human life, and recent advancements in wearable technology and machine learning have promised continuous and non-invasive methods of tracking sleep health and patterns, providing an important facet to a more holistic un…

sleep disorders wearable biomedical time series classification

Published: April 30, 2025. Version: 2.1.0 | DOI: 10.13026/7r9r-7r24


Database Restricted Federated

MIMIC-IV-Ext-Apixaban-Trial-Criteria-Questions

Source: PhysioNet

Large language models (LLMs) show promise for extracting information from clinical notes. Deploying these models at scale can be challenging due to high computational costs, regulatory constraints, and privacy concerns. To address these challenges, …

clinical q and a evaluation set clinical trial eligibility

Published: April 30, 2025. Version: 1.0.0 | DOI: 10.13026/4p6q-vb04


Database Restricted Federated

MIMIC-III-Ext-Synthetic-Clinical-Trial-Questions

Source: PhysioNet

Large language models (LLMs) show promise for extracting information from clinical notes. Deploying these models at scale can be challenging due to high computational costs, regulatory constraints, and privacy concerns. To address these challenges, …

large language models synthetic data distillation clinical trial eligibility

Published: April 22, 2025. Version: 1.0.0 | DOI: 10.13026/30k0-av04


Database Open Federated

SHDB-AF: a Japanese Holter ECG database of atrial fibrillation

Source: PhysioNet

Saitama Heart Database Atrial Fibrillation (SHDB-AF) is a novel open-sourced Holter ECG database from Japan, containing data from 122 unique subjects with paroxysmal atrial fibrillation. Among the 128 recordings, 98 contain raw ECG data with rhythm …

atrial fibrillation ecg holters

Published: April 16, 2025. Version: 1.0.1 | DOI: 10.13026/n6yq-fq90


Database Credentialed Federated

Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information

Source: PhysioNet

The human voice contains complex acoustic markers which have been linked to important health conditions including dementia, mood disorders, and cancer. When viewed as a biomarker, voice is a promising characteristic to measure as it is simple to col…

bridge2ai voice

Published: April 16, 2025. Version: 2.0.0 | DOI: 10.13026/3xt6-rf05