Data Governance

This page will help users understand the Data Governance processes in place for the Health Data Nexus, and how we use and protect data as part of this exciting initiative.

Health Data Nexus

The Health Data Nexus is a data environment built to facilitate research and education initiatives. The platform is built on the principles of respecting individuals’ privacy and the security of the information we hold. It also provides researchers with transparent, inclusive, and timely access to de-identified data to facilitate flexible, frictionless and responsible data discovery and analysis.

The Health Data Nexus receives de-identified data from partner institutions, including hospitals, universities, research centres, and private companies, based in Ontario, Canada, and around the world. Some data has already been made publicly available and its holders have given permission to share it on the platform. Other data is newly collected and curated for the Health Data Nexus. The Temerty Centre for Artificial Intelligence Research and Education in Medicine (T-CAIREM) of the University of Toronto, which oversees the Health Data Nexus, receives no payment for placing data on the Health Data Nexus, nor does T-CAIREM charge any partner institutions a fee for data sharing and hosting.

Data Governance refers to the processes and policies we have put in place to maximize the effective use and security of data. We take a layered approach to protecting the data we hold, and have established a Data Governance Proposal, Data Transfer Agreement, Code of Conduct, Data Use Agreement, Data Governance Committee Terms of Reference, and De-identification Policy. The governance and use of data on the Health Data Nexus platform have also been independently evaluated in a Privacy Impact Assessment (PIA) conducted by external legal counsel, as well as a Threat Risk Assessment (TRA) conducted by an outside information security firm. For a copy of the PIA report, a summary of the TRA report, or any of the Health Data Nexus governing documents, please contact the Data Steward at

Personal Health Information (PHI)

PHI is defined under PHIPA (Ontario’s Personal Health Information Protection Act, 2004) as identifying information about an individual that relates to their physical or mental health, including diagnoses, testing, treatments, demographic information, and payment information related to health care.

The Health Data Nexus does not allow PHI to be stored on the platform. Data hosted on the Health Data Nexus is provided to us in de-identified form, according to strict protocols. This means that any information that could foreseeably be used to identify an individual, either alone or with other information, has been removed prior to transfer, in compliance with the De-identification Policy; the data is then reviewed by the Data Steward. In the rare event that PHI is missed or inadvertently placed on the platform, the Health Data Nexus has processes in place to immediately flag and delete this information. These protections are detailed in the Data Transfer Agreement, Code of Conduct, and Data Use Agreement.

Research Ethics Board (REB)

An REB ensures that research meets ethical standards to protect human participants in research, in compliance with PHIPA and the Tri-Council Policy Statement (TCPS2): Ethical Conduct for Research Involving Humans. Datasets that are not already publicly available include a research plan describing the creation of a de-identified dataset. This research plan is approved by an appropriate REB, either through the University of Toronto or one affiliated with the data holder. If the research plan was approved outside of the University of Toronto, it must be submitted for administrative review to the University of Toronto’s Human Research Ethics Unit (HREU).

Datasets that are already publicly available and for which no research plan exists (e.g., if the dataset was created prior to enactment of TCPS2) are reviewed by U of T’s HREU.

Using and Sharing Data

De-identified data on the Health Data Nexus is only available to credentialed users who have signed the Code of Conduct, completed specific research training, and signed the appropriate Data Use Agreement. Researchers can access the de-identified data only in a cloud-based analytics environment; technical and legal safeguards prevent data from being downloaded or removed from the platform.

In addition, the Health Data Nexus includes three zones of data access, each of which is subject to the safeguards set out above:

Zone 1 requires user credentialing, appropriate research training, and a signed Data Use Agreement. This is the baseline zone of access on the platform.

Zone 2 includes the Zone 1 requirements and additionally requires the submission of a research plan to be approved by the data holder, allowing for greater control over research using the data.

Zone 3 includes the same requirements as Zone 2 and additionally requires that the submitted research plan be approved by an REB. The zoned data access on the platform provides additional protection for data holders who wish to provide potentially sensitive data.
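The cumulative structure of the three zones can be illustrated with a short sketch. This is not the platform's implementation; it is a minimal model, assuming each requirement can be represented as a yes/no flag. All names in it (`Zone`, `AccessRequest`, `missing_requirements`, `may_access`) are hypothetical and chosen for illustration only. The Code of Conduct check is included at the baseline because the safeguards described above apply to every zone.

```python
from dataclasses import dataclass
from enum import IntEnum


class Zone(IntEnum):
    """Hypothetical labels for the three zones of data access."""
    ZONE_1 = 1
    ZONE_2 = 2
    ZONE_3 = 3


@dataclass
class AccessRequest:
    """Illustrative yes/no flags for one researcher and one dataset."""
    credentialed: bool = False
    code_of_conduct_signed: bool = False
    training_completed: bool = False
    data_use_agreement_signed: bool = False
    plan_approved_by_data_holder: bool = False
    plan_approved_by_reb: bool = False


def missing_requirements(request: AccessRequest, zone: Zone) -> list[str]:
    """Return the requirements not yet met for the given zone.

    Each entry names the lowest zone at which the requirement applies;
    higher zones inherit every requirement of the zones below them.
    """
    checks = [
        (Zone.ZONE_1, request.credentialed, "user credentialing"),
        (Zone.ZONE_1, request.code_of_conduct_signed, "signed Code of Conduct"),
        (Zone.ZONE_1, request.training_completed, "research training"),
        (Zone.ZONE_1, request.data_use_agreement_signed, "signed Data Use Agreement"),
        (Zone.ZONE_2, request.plan_approved_by_data_holder,
         "research plan approved by data holder"),
        (Zone.ZONE_3, request.plan_approved_by_reb,
         "research plan approved by an REB"),
    ]
    return [label for min_zone, met, label in checks
            if zone >= min_zone and not met]


def may_access(request: AccessRequest, zone: Zone) -> bool:
    """Access is allowed only when no requirement is outstanding."""
    return not missing_requirements(request, zone)


# A researcher who has completed only the baseline steps.
baseline = AccessRequest(
    credentialed=True,
    code_of_conduct_signed=True,
    training_completed=True,
    data_use_agreement_signed=True,
)
print(may_access(baseline, Zone.ZONE_1))
print(missing_requirements(baseline, Zone.ZONE_3))
```

In this sketch, a researcher who has met only the baseline requirements qualifies for Zone 1, while a Zone 3 check reports the two outstanding research plan approvals.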

The Health Data Nexus does not share data with, nor sell data to, third parties. Research using data on the Health Data Nexus cannot be for commercial purposes. Any intellectual property resulting from research using Health Data Nexus data must be negotiated directly between the data user and the data holder, and will not be facilitated by T-CAIREM nor the Health Data Nexus.

Data Governance Committee

In addition to the policy documents mentioned above, the Health Data Nexus is governed by a Data Governance Committee, which reviews the appropriateness of newly acquired datasets for placement on the platform and provides oversight of any new developments that engage the platform. To ensure that a diverse set of voices is heard, the Data Governance Committee includes patient representatives, members with ethical and legal backgrounds, and student and trainee representatives.

For any questions or concerns about data governance, use, and security, please contact the Data Steward at


Each dataset on the platform is uploaded under its own license, in this standard form:

The Health Data Nexus Contributor Review Health Data License
Version 1.0

Copyright (c) 2022 Temerty Centre for Artificial Intelligence Research and Education in Medicine

The Temerty Centre for Artificial Intelligence Research and Education in Medicine (T-CAIREM) wishes to make data available for research and educational purposes to qualified requestors, but only if the data are used and protected in accordance with the terms and conditions stated in this License.

It is hereby agreed between the data requestor, hereinafter referred to as the "LICENSEE", and T-CAIREM, that:

  1. The LICENSEE will not attempt to identify any individual or institution referenced in Health Data Nexus restricted data.
  2. The LICENSEE will exercise all reasonable and prudent care to avoid disclosure of the identity of any individual or institution referenced in Health Data Nexus restricted data in any publication or other communication.
  3. The LICENSEE will not share access to Health Data Nexus restricted data with anyone else.
  4. The LICENSEE will exercise all reasonable and prudent care to maintain the physical and electronic security of Health Data Nexus restricted data.
  5. If the LICENSEE finds information within Health Data Nexus restricted data that they believe might permit identification of any individual or institution, the LICENSEE will report the location of this information promptly by email to, citing the location of the specific information in question.
  6. The LICENSEE will use the data for the sole purpose of lawful use in scientific research and no other.
  7. The LICENSEE will be responsible for ensuring that they maintain up-to-date certification in human research subject protection.
  8. The LICENSEE agrees to contribute code associated with publications arising from this data to a repository that is open to the research community.
  9. This agreement may be terminated by either party at any time, but the LICENSEE's obligations with respect to Health Data Nexus data shall continue after termination.  


Privacy Policy

Researchers are also subject to the following Privacy Policy:

When you use the Health Data Nexus, we will require information about you that is necessary to provide you with computational resources, functions, and other services (hereinafter “Services”) to enable your activities. Information necessary for your use of the Health Data Nexus will also be generated and maintained as you use the Health Data Nexus. Together, the information that you provide and the information about your use of the Health Data Nexus are your User Information.

Your User Information is employed to:

  1. confirm your identity on login;
  2. confirm your privileges and provide the Services to you; and
  3. optimize system resources, security, and your experience with the Services.

Your User Information will be shared on a need-to-know basis with officials at T-CAIREM, at the University, partners, and service providers as necessary to provide you with the Services, and to support and maintain security.

For example, we collect information about your use including the Internet Protocol (IP) address from which you are connecting, and we track your use and actions in the Health Data Nexus to understand and optimize system resources, to improve the quality of your experience with the Services, and for security as noted above.

We do not trade or sell your User Information, nor do we share it without your consent, except as provided above or as required by law or University policy.

Your User Information is protected consistent with all privacy requirements that may apply to it, such as the Freedom of Information and Protection of Privacy Act.

We welcome your questions about your User Information or the Services, and invite you to contact:

Health Data Nexus Data Steward