Submitting Datasets to the Health Data Nexus

We would love for you to share datasets with the research community on the Health Data Nexus. Some of the benefits of sharing your data include:

  • Broader visibility to researchers, including targeted workshops and courses
  • Cloud-native platform with no data egress possible, giving greater control over data governance
  • Data-use statistics providing useful reporting information for funding agencies
  • A centralized model allowing sharing of developed tools and curated derivations of the data
  • No cost for you; we are funded to cover all storage and platform fees

In order to submit your work, please review the submission guidelines and author guidelines.

Submission Guidelines

If you are planning to create a new de-identified dataset to submit to the Health Data Nexus, you must provide the following to the data steward:

  • A research plan outlining who is involved in the research and its scientific benefit, and
  • Approval from a Research Ethics Board (REB) for the research plan

The data steward can provide you with a research plan template and resources to support REB review, including a list of required clauses describing how the dataset will be hosted.

If you are planning on submitting an existing dataset which has already been de-identified, you must still provide REB approval, including an amendment that adds the clauses describing how the dataset will be hosted.

In addition to REB approval, you must also provide a completed Data Transfer Agreement. For more information, please review the Data Governance page.

Submission Process

The process for submitting a dataset to the Health Data Nexus is as follows:

  • If you are the submitting author, create a new dataset application.
  • Add your dataset details, including contact information for co-authors and a description of your project, and upload your ethical approvals.
  • Once you are satisfied, submit the project for review.
  • You will be notified of an editorial decision when the review process is complete, which may take 1-2 weeks depending on the content of the dataset.
  • Once a dataset is published, its content is fixed and cannot be changed. Dataset updates, including removal of information, are accomplished by publishing new versions through the same process outlined above.

Additional Submission Instructions

  • Please ensure that you have added all co-authors to the system and have read the Author Guidelines for a detailed description of how to generate project files and related metadata.
  • The structure of the dataset must be clearly described, including its content and how the files are organized.
  • Links to relevant supporting publications which describe the data's creation or usage may be provided. The authors may provide links to an external website as supplemental documentation. However, the documentation on the Health Data Nexus must be self-contained and sufficient for others to understand and use your data.

Author Guidelines

Creating the project metadata

To help the research community use your dataset, we require a detailed description. You should provide information on your project and how it can be used. Please include at least a title, abstract, and content description when uploading the files. Further details are provided below.

  • Title: Your title should be no longer than 200 characters. Do not include acronyms or abbreviations, and where possible avoid leading with "The". Only letters, numbers, spaces, underscores, and hyphens are allowed.
  • Abstract: Your abstract must be no longer than 250 words. The focus should be on the dataset being shared. Information relevant to the research that created the dataset should be provided to facilitate use. References should not be included. The abstract should also include a high-level description of the data as well as an overview of the key aims of dataset creation. The abstract may appear in search indexes independently of the full project metadata, so providing detailed information about the content is important.
  • Background: Your background should provide the reader with an introduction to the dataset. The section should offer context in which the dataset was created, outline your motivations for sharing, and highlight potential study questions or areas of interest.
  • Methods: Your description of methods should provide details of the procedures used to create the dataset, including how data was collected, measurement devices, experimental design, data acquisition and processing, etc.
  • Content description: Your dataset description should describe the resource in detail, outlining the structure of files, file formats, and a description of what the files contain. Where appropriate, please include summary statistics (e.g. sample size and time period of the study).
  • Usage notes: These should help researchers use your data. Some helpful notes include previous uses of the data (with citations), known limitations, and any complementary code or datasets that might be of value.
  • Acknowledgements: You can acknowledge people who helped with the research but are not credited co-authors. Include any information on funding.
  • Conflicts of interest: Include a statement on any potential conflicts of interest. If no authors have conflicts of interest, please include: “The author(s) have no conflicts of interest to declare.”
  • Version: Include the version number of the resource. Semantic versioning is encouraged. If unsure, put “1.0.0”.
  • References: Include all resources in Vancouver reference style. Citations should be numbered sequentially in the text in square brackets. Entries in the reference list should be in the following style: 1. Xu YZ, Geng DC, Mao HQ, Zhu XS, Yang HL (2010). "A comparison of the proximal femoral nail antirotation device and dynamic hip screw in the treatment of unstable pertrochanteric fracture". J Int Med Res. 38: 1266–1275. PMID 20925999.
  • Weblinks: Please include all external resources in the References section; do not include any URLs/hyperlinks in the main text.
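
The length and character rules above for the title, abstract, and version can be checked before you start an application. The following is a minimal Python sketch, not an official Health Data Nexus tool: the function name check_metadata is hypothetical, "letters" is assumed to mean unaccented A-Z and a-z, words are counted by splitting on whitespace, and the version pattern is a simplified MAJOR.MINOR.PATCH check rather than a full semantic versioning parser.

```python
import re

# Limits taken from the guidelines above.
TITLE_MAX_CHARS = 200
ABSTRACT_MAX_WORDS = 250

# Assumption: "letters" means unaccented A-Z and a-z.
TITLE_ALLOWED = re.compile(r"[A-Za-z0-9 _-]+")
# Simplified check for the MAJOR.MINOR.PATCH form, e.g. "1.0.0".
SEMVER_LIKE = re.compile(r"\d+\.\d+\.\d+")


def check_metadata(title, abstract, version):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    if not title:
        problems.append("title is empty")
    else:
        if len(title) > TITLE_MAX_CHARS:
            problems.append("title is longer than 200 characters")
        if not TITLE_ALLOWED.fullmatch(title):
            problems.append(
                "title contains characters other than letters, numbers, "
                "spaces, underscores, and hyphens"
            )
        if title.lower().startswith("the "):
            problems.append("title leads with 'The' (avoid where possible)")

    # Words are counted by splitting on whitespace (an assumption).
    if len(abstract.split()) > ABSTRACT_MAX_WORDS:
        problems.append("abstract is longer than 250 words")

    # Semantic versioning is encouraged rather than required.
    if not SEMVER_LIKE.fullmatch(version):
        problems.append("version is not in MAJOR.MINOR.PATCH form")

    return problems


if __name__ == "__main__":
    for problem in check_metadata(
        "Example Intensive Care Waveform Dataset",
        "A high-level description of the data and the aims of its creation.",
        "1.0.0",
    ):
        print(problem)
```

A check like this only covers the mechanical rules. Whether the abstract describes the dataset well enough to stand alone in a search index still needs a human read.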

Preparing project files

Please review the following guidelines before uploading any relevant data and files.

  • README File: Please include a README file alongside the files. It should include at minimum a title and a description of the content.
  • Protected Health Information: All data uploaded MUST be de-identified and have any protected health information removed. Please view the Data Governance page for more information. The T-CAIREM Data Steward can assist with de-identification as needed.
  • File naming: Files should be clearly named and cannot include spaces or special characters. File names should generally be lowercase, unless they are “special files”. Please keep file names brief.
  • File types: Data should be kept in open, machine-readable formats. CSVs are a good option for small datasets.
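
The file naming rule can be checked in bulk before upload. The sketch below is illustrative only and rests on two assumptions that the guidelines do not spell out: that the acceptable characters are letters, digits, underscores, hyphens, and dots, and that "special files" are conventionally uppercase names such as README and LICENSE. The function name check_file_name is hypothetical.

```python
import re
from pathlib import PurePosixPath

# Assumption: these conventional uppercase names count as "special files".
SPECIAL_FILES = {"README", "README.md", "README.txt", "LICENSE", "LICENSE.txt"}
# Assumption: letters, digits, underscores, hyphens, and dots are acceptable.
ALLOWED_NAME = re.compile(r"[A-Za-z0-9._-]+")


def check_file_name(path):
    """Return a list of naming problems for the final component of path."""
    name = PurePosixPath(path).name
    problems = []
    if not ALLOWED_NAME.fullmatch(name):
        problems.append("contains spaces or special characters")
    if name != name.lower() and name not in SPECIAL_FILES:
        problems.append("is not lowercase")
    return problems


if __name__ == "__main__":
    for candidate in ["data/patient_vitals.csv", "Patient Vitals.csv", "README.md"]:
        print(candidate, check_file_name(candidate))
```

If the data steward uses a different definition of special characters or special files, adjust ALLOWED_NAME and SPECIAL_FILES to match.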