Resources
Database Restricted Federated
LATTE-CXR: Locally Aligned TexT and imagE, Explainable dataset for Chest X-Rays
Local annotation of medical data is both expensive and time-consuming due to the high cost of expert annotators, the precision required for accurate annotation, and the inherent challenges of medical diagnosis. To address these problems, we develop…
chest x-ray dataset eye-tracking automatically generated dataset caption-guided object detection localization image captioning with region-level description grounded radiology report generation phrase grounding xai multi-modal learning local visual-language models
Published: Feb. 4, 2025. Version: 1.0.0 | DOI: 10.13026/0pw2-je90
Database Credentialed Federated
MIMIC-IV-Ext-BHC: Labeled Clinical Notes Dataset for Hospital Course Summarization
This dataset presents a curated collection of preprocessed and labeled clinical notes derived from the MIMIC-IV-Note database. The primary aim of this resource is to facilitate the development and training of machine learning models focused on summa…
natural language processing, brief hospital course, text summarization, machine learning, clinical notes
Published: Feb. 3, 2025. Version: 1.2.0 | DOI: 10.13026/5gte-bv70
Database Open Federated
Synthetic Mention Corpora for Disease Entity Recognition and Normalization
Named Entity Recognition (NER) and Entity Normalization (EN) are fundamental tasks in information extraction, particularly in the biomedical and clinical domains. NER identifies textual mentions of entities, while EN maps these mentions to unique id…
named entity recognition, data augmentation, entity normalization, machine learning, nlp
Published: Feb. 3, 2025. Version: 1.0.0 | DOI: 10.13026/p5pn-ty93
Database Credentialed Federated
Medical-Diff-VQA: A Large-Scale Medical Dataset for Difference Visual Question Answering on Chest X-Ray Images
The task of Difference Visual Question Answering involves answering questions about the difference between a pair of main and reference images. This process is consistent with the radiologist's diagnostic practice that compares the current image …
vqa, chest x-ray, difference visual question answering, difference vqa, visual question answering
Published: Feb. 3, 2025. Version: 1.0.1 | DOI: 10.13026/e6dd-cn74
Database Restricted Federated
CXRGraph: Using Information Extraction to Normalize the Training Data for Automatic Radiology Report Generation
CXRGraph is a dataset of structured radiology reports following the RadGraph format, which has been tailored for the Automatic Radiology Report Generation (ARRG) task. CXRGraph assorts clinical information from full-text radiology reports in…
natural language processing, relation extraction, named entity recognition, information extraction, structured radiology report
Published: Feb. 3, 2025. Version: 1.0.0 | DOI: 10.13026/p7kf-t860
Database Credentialed Federated
Symile-MIMIC: a multimodal clinical dataset of chest X-rays, electrocardiograms, and blood labs from MIMIC-IV
Symile-MIMIC is a multimodal clinical dataset derived from MIMIC-IV and MIMIC-CXR, consisting of chest X-rays (CXRs), electrocardiograms (ECGs), and blood laboratory tests. It was developed to evaluate Symile, a contrastive learning objective design…
contrastive learning model cxr multimodal chest x-ray mimic ecg electrocardiogram database
Published: Jan. 28, 2025. Version: 1.0.0 | DOI: 10.13026/3vvj-s428
Database Restricted Federated
Visual Question Answering evaluation dataset for MIMIC CXR
MIMIC CXR [1] is a large publicly available dataset of chest radiographs in DICOM format with free-text radiology reports. In addition, labels for the presence of 12 different chest-related pathologies, as well as of any support devices, and overall…
Published: Jan. 28, 2025. Version: 1.0.0 | DOI: 10.13026/cvsk-ny21
Database Open Federated
CGMacros: a scientific dataset for personalized nutrition and diet monitoring
We present CGMacros, a dataset containing multimodal information from two continuous glucose monitors (CGM), food macronutrients, food photographs, and physical activity, in addition to anonymized participant demographics, anthropometric measurement…
diabetes, continuous glucose monitors, obesity, machine learning, postprandial glucose response, food macronutrients, metabolic models, food photographs, personalized nutrition
Published: Jan. 28, 2025. Version: 1.0.0 | DOI: 10.13026/3z8q-x658
Database Open Federated
Minute level step counts and physical activity data from the National Health and Nutrition Examination Survey (NHANES) 2011-2014
The National Health and Nutrition Examination Survey (NHANES) is a nationally representative study that collects demographic, socioeconomic, dietary, and health-related information from 10,000 Americans annually. In Wave G (2011-2012) and Wave H (20…
nhanes, accelerometry, steps, physical activity
Published: Jan. 28, 2025. Version: 1.0.0 | DOI: 10.13026/ah0j-3z47
Database Credentialed Federated
SCRIPT X2B8 Dataset: per-day clinical features to model successful next-day extubation
Criteria to identify patients who are ready to be liberated from mechanical ventilation are imprecise, often resulting in prolonged mechanical ventilation or reintubation, both of which are associated with adverse outcomes. We sought to determine wh…
Published: Jan. 28, 2025. Version: 1.0.0 | DOI: 10.13026/235w-zn26