Resources


Database Restricted Federated

DREAMT: Dataset for Real-time sleep stage EstimAtion using Multisensor wearable Technology

Source: Physionet

Sleep is an intrinsic part of human life, and recent advancements in wearable technology and machine learning have promised continuous and non-invasive methods of tracking sleep health and patterns, providing an important facet to a more holistic un…

biomedical time series classification wearable sleep disorders

Published: Feb. 5, 2025. Version: 2.0.0 | DOI: 10.13026/0vrv-nn81


Database Restricted Federated

LATTE-CXR: Locally Aligned TexT and imagE, Explainable dataset for Chest X-Rays

Source: Physionet

Local annotation of medical data is both expensive and time-consuming due to the high cost of expert annotators, the precision required for accurate annotation, and the inherent challenges of medical diagnosis. To address these problems, we develop…

chest x-ray dataset eye-tracking automatically generated dataset caption-guided object detection localization image captioning with region-level description grounded radiology report generation phrase grounding xai multi-modal learning local visual-language models

Published: Feb. 4, 2025. Version: 1.0.0 | DOI: 10.13026/0pw2-je90


Database Restricted Federated

Application of Med-PaLM 2 in the refinement of MIMIC-CXR labels

Source: Physionet

MIMIC-CXR is a large, open-source dataset that is widely used in medical AI research. One of the limitations of this dataset is the lack of ground-truth labels for the chest X-ray studies. Prior work has extracted structured labels from the MIMIC-CX…

mimic-cxr labels

Published: Feb. 4, 2025. Version: 1.0.0 | DOI: 10.13026/7wmp-jx90


Database Credentialed Federated

Medical-Diff-VQA: A Large-Scale Medical Dataset for Difference Visual Question Answering on Chest X-Ray Images

Source: Physionet

The task of Difference Visual Question Answering involves answering questions about the difference between a pair of main and reference images. This process is consistent with radiologists' diagnostic practice of comparing the current image …

vqa chest x-ray difference visual question answering difference vqa visual question answering

Published: Feb. 3, 2025. Version: 1.0.1 | DOI: 10.13026/e6dd-cn74


Database Credentialed Federated

MIMIC-IV-Ext-BHC: Labeled Clinical Notes Dataset for Hospital Course Summarization

Source: Physionet

This dataset presents a curated collection of preprocessed and labeled clinical notes derived from the MIMIC-IV-Note database. The primary aim of this resource is to facilitate the development and training of machine learning models focused on summa…

natural language processing brief hospital course text summarization machine learning clinical notes

Published: Feb. 3, 2025. Version: 1.2.0 | DOI: 10.13026/5gte-bv70


Database Open Federated

Synthetic Mention Corpora for Disease Entity Recognition and Normalization

Source: Physionet

Named Entity Recognition (NER) and Entity Normalization (EN) are fundamental tasks in information extraction, particularly in the biomedical and clinical domains. NER identifies textual mentions of entities, while EN maps these mentions to unique id…

named entity recognition data augmentation entity normalization machine learning nlp

Published: Feb. 3, 2025. Version: 1.0.0 | DOI: 10.13026/p5pn-ty93


Database Restricted Federated

CXRGraph: Using Information Extraction to Normalize the Training Data for Automatic Radiology Report Generation

Source: Physionet

CXRGraph is a dataset of structured radiology reports following the RadGraph format, which has been tailored for the Automatic Radiology Report Generation (ARRG) task. CXRGraph assorts clinical information from full-text radiology reports in…

natural language processing relation extraction named entity recognition information extraction structured radiology report

Published: Feb. 3, 2025. Version: 1.0.0 | DOI: 10.13026/p7kf-t860


Database Credentialed Federated

Symile-MIMIC: a multimodal clinical dataset of chest X-rays, electrocardiograms, and blood labs from MIMIC-IV

Source: Physionet

Symile-MIMIC is a multimodal clinical dataset derived from MIMIC-IV and MIMIC-CXR, consisting of chest X-rays (CXRs), electrocardiograms (ECGs), and blood laboratory tests. It was developed to evaluate Symile, a contrastive learning objective design…

contrastive learning model cxr multimodal chest x-ray mimic ecg electrocardiogram database

Published: Jan. 28, 2025. Version: 1.0.0 | DOI: 10.13026/3vvj-s428


Database Restricted Federated

Visual Question Answering evaluation dataset for MIMIC CXR

Source: Physionet

MIMIC CXR [1] is a large publicly available dataset of chest radiographs in DICOM format with free-text radiology reports. In addition, labels for the presence of 12 different chest-related pathologies, as well as of any support devices, and overall…

Published: Jan. 28, 2025. Version: 1.0.0 | DOI: 10.13026/cvsk-ny21


Database Open Federated

CGMacros: a scientific dataset for personalized nutrition and diet monitoring

Source: Physionet

We present CGMacros, a dataset containing multimodal information from two continuous glucose monitors (CGM), food macronutrients, food photographs, and physical activity, in addition to anonymized participant demographics, anthropometric measurement…

diabetes continuous glucose monitors obesity machine learning postprandial glucose response food macronutrients metabolic models food photographs personalized nutrition

Published: Jan. 28, 2025. Version: 1.0.0 | DOI: 10.13026/3z8q-x658