Resources


Database Credentialed Federated

PIFIR: PET-CT Invasive Fungal Infection Reports

Source: Physionet

Surveillance of invasive fungal infection (IFI) in clinical settings is a laborious process requiring a detailed review of patient medical history. One of the key sources of clinical information is imaging reports: radiologist-produced free-text rep…

clinical documentation invasive fungal infections nlp information extraction

Published: Feb. 27, 2025. Version: 1.0.0 | DOI: 10.13026/d51v-j343


Database Open Federated

A Multimodal Dataset for Investigating Working Memory in Presence of Music

Source: Physionet

We present the dataset accompanying the study "A Multimodal Dataset for Investigating Working Memory in Presence of Music". The experiment investigates the viability of music as an intervention to regulate c…

Published: Feb. 26, 2025. Version: 1.0.0 | DOI: 10.13026/6vh4-dk68


Database Restricted Federated

OpenOximetry Repository

Source: Physionet

The OpenOximetry Repository is a structured database designed to store clinical and laboratory pulse oximetry data and to allow consolidation of data sets held by collaborating organizations. Matched or independent readings of oxygen saturations, …

Published: Feb. 19, 2025. Version: 1.1.0 | DOI: 10.13026/yq6c-z215


Database Restricted Federated

DREAMT: Dataset for Real-time sleep stage EstimAtion using Multisensor wearable Technology

Source: Physionet

Sleep is an intrinsic part of human life, and recent advancements in wearable technology and machine learning have promised continuous and non-invasive methods of tracking sleep health and patterns, providing an important facet to a more holistic un…

biomedical time series classification wearable sleep disorders

Published: Feb. 5, 2025. Version: 2.0.0 | DOI: 10.13026/0vrv-nn81


Database Restricted Federated

Application of Med-PaLM 2 in the refinement of MIMIC-CXR labels

Source: Physionet

MIMIC-CXR is a large, open-source dataset that is widely used in medical AI research. One of the limitations of this dataset is the lack of ground truth labels for the chest X-ray studies. Prior work has extracted structured labels from the MIMIC-CX…

mimic-cxr labels

Published: Feb. 4, 2025. Version: 1.0.0 | DOI: 10.13026/7wmp-jx90


Database Restricted Federated

LATTE-CXR: Locally Aligned TexT and imagE, Explainable dataset for Chest X-Rays

Source: Physionet

Local annotation of medical data is both expensive and time-consuming due to the high cost of expert annotators, the precision required for accurate annotation, and the inherent challenges of medical diagnosis. To address these problems, we develop…

chest x-ray dataset eye-tracking automatically generated dataset caption-guided object detection localization image captioning with region-level description grounded radiology report generation phrase grounding xai multi-modal learning local visual-language models

Published: Feb. 4, 2025. Version: 1.0.0 | DOI: 10.13026/0pw2-je90


Database Credentialed Federated

MIMIC-IV-Ext-BHC: Labeled Clinical Notes Dataset for Hospital Course Summarization

Source: Physionet

This dataset presents a curated collection of preprocessed and labeled clinical notes derived from the MIMIC-IV-Note database. The primary aim of this resource is to facilitate the development and training of machine learning models focused on summa…

natural language processing brief hospital course text summarization machine learning clinical notes

Published: Feb. 3, 2025. Version: 1.2.0 | DOI: 10.13026/5gte-bv70


Database Open Federated

Synthetic Mention Corpora for Disease Entity Recognition and Normalization

Source: Physionet

Named Entity Recognition (NER) and Entity Normalization (EN) are fundamental tasks in information extraction, particularly in the biomedical and clinical domains. NER identifies textual mentions of entities, while EN maps these mentions to unique id…

named entity recognition data augmentation entity normalization machine learning nlp

Published: Feb. 3, 2025. Version: 1.0.0 | DOI: 10.13026/p5pn-ty93


Database Credentialed Federated

Medical-Diff-VQA: A Large-Scale Medical Dataset for Difference Visual Question Answering on Chest X-Ray Images

Source: Physionet

The task of Difference Visual Question Answering involves answering questions about the difference between a pair of main and reference images. This process is consistent with radiologists' diagnostic practice, which compares the current image …

vqa chest x-ray difference visual question answering difference vqa visual question answering

Published: Feb. 3, 2025. Version: 1.0.1 | DOI: 10.13026/e6dd-cn74


Database Restricted Federated

CXRGraph: Using Information Extraction to Normalize the Training Data for Automatic Radiology Report Generation

Source: Physionet

CXRGraph is a dataset of structured radiology reports following the RadGraph format, tailored for the Automatic Radiology Report Generation (ARRG) task. CXRGraph assorts clinical information from full-text radiology reports in…

natural language processing relation extraction named entity recognition information extraction structured radiology report

Published: Feb. 3, 2025. Version: 1.0.0 | DOI: 10.13026/p7kf-t860