Resources


Database Credentialed Federated

Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information

Source: Physionet

The human voice contains complex acoustic markers which have been linked to important health conditions including dementia, mood disorders, and cancer. When viewed as a biomarker, voice is a promising characteristic to measure as it is simple to col…

bridge2ai voice

Published: Dec. 16, 2025. Version: 3.0.0 | DOI: 10.13026/k81f-qr68


Database Credentialed Federated

MedVH: Towards Systematic Evaluation of Hallucination for Large Vision Language Models in the Medical Context

Source: Physionet

Large Vision Language Models (LVLMs) have recently achieved superior performance in various tasks on natural image and text data, which has inspired a large number of studies on LVLM fine-tuning and training. Despite their advancements, there has been…

Published: Dec. 10, 2025. Version: 1.0.1 | DOI: 10.13026/0qtp-3d10


Database Restricted Federated

Microbiological, Immunological and Biochemical Characteristics of the Development of Ventilator Associated Pneumonia

Source: Physionet

The respiratory microbiome plays a critical role in metabolism, immune system maturation, and protection against pathogens. Traditionally, respiratory microbiology in pneumonia focused on identifying a specific pathogen, often disregarding normal or…

Published: Dec. 5, 2025. Version: 1.1.1 | DOI: 10.13026/rtc6-cq72


Database Credentialed Federated

Antibiotic Resistance Microbiology Dataset Mass General Brigham (ARMD-MGB)

Source: Physionet

The Antibiotic Resistance Microbiology Dataset – MGB (ARMD-MGB) is a de-identified resource derived from electronic health records (EHR) that facilitates research in antimicrobial resistance (AMR). ARMD-MGB encompasses data collected from over…

antimicrobial resistance electronic health records medical informatics

Published: Dec. 5, 2025. Version: 1.0.0 | DOI: 10.13026/2r5k-b955


Database Credentialed Federated

EchoGraph-annotated ECHO-NOTE2NUM examples

Source: Physionet

This repository releases the EchoGraph-annotated ECHO-NOTE2NUM dataset, containing 45,794 echocardiography reports with comprehensive entity and relation annotations. Each report from the ECHO-NOTE2NUM dataset has been automatically annotated using …

Published: Dec. 4, 2025. Version: 1.0.0 | DOI: 10.13026/hb5q-9532


Database Contributor Review Federated

InReDD-Dataset-PAN924

Source: Physionet

InReDD-Dataset-PAN924 is a collection of 924 radiographic images annotated with mouth and teeth labels by specialists from the InReDD research group.

InReDD (Interdisciplinary Research Group in Digital Dentistry) is a collaborative research initiati…

Published: Nov. 23, 2025. Version: 1.0.0 | DOI: 10.13026/r5nt-we67


Database Credentialed Federated

MedVAL-Bench: Expert-Annotated Medical Text Validation Benchmark

Source: Physionet

MedVAL-Bench is a dataset containing physician evaluations of errors in language model (LM)-generated medical text. The dataset spans 6 diverse medical text generation tasks and includes annotations from 12 physicians on clinically significant error…

Published: Nov. 14, 2025. Version: 1.0.1 | DOI: 10.13026/653w-3038


Database Credentialed Federated

Predictors of Hospital Onset Infection: A Matched Retrospective Cohort Dataset

Source: Physionet

This repository contains a de-identified and curated patient-level dataset for modeling the impact of fine-grained environmental and patient-level factors on nosocomial acquisition of a wide range of drug-susceptible and drug-resistant pathogens. Th…

infection control clinical machine learning infectious diseases electronic health records hospital onset infection colonization pressure

Published: Nov. 4, 2025. Version: 1.0.0 | DOI: 10.13026/k70x-0m81


Database Credentialed Federated

MedVAL-Bench: Expert-Annotated Medical Text Validation Benchmark

Source: Physionet

MedVAL-Bench is a dataset containing physician evaluations of errors in language model (LM)-generated medical text. The dataset spans 6 diverse medical text generation tasks and includes annotations from 12 physicians on clinically significant error…

Published: Nov. 4, 2025. Version: 1.0.0 | DOI: 10.13026/8ga5-6661


Database Open Federated

HeartCycle: A comprehensive dataset of synchronized impedance cardiography and echocardiography for accurate hemodynamic predictions

Source: Physionet

The "HeartCycle" dataset offers a comprehensive collection of synchronized impedance cardiography (ICG) and echocardiography (ECHO) signals, supplemented with finger photoplethysmography (PPG), heart sounds, and electrocardiography (ECG) data from 1…

cardiovascular physiology electrophysiological study echocardiography machine learning impedance cardiography

Published: Nov. 3, 2025. Version: 1.0.0 | DOI: 10.13026/z865-eb23