Research using the Precision Medicine Analytics Platform [PMAP]

October 2021


This document is designed for researchers creating IRB applications for studies that propose to use the Precision Medicine Analytics Platform, or PMAP, for analysis or discovery. The document includes guidance on the type of IRB application to use in various circumstances, information to include in your eIRB application and protocol, and template language for consent. For information about PMAP, visit the PMAP portal.

A note about JHM and JHU Data

As indicated in policy (DATAP002), JHM Data is patient- or member-related data stored in JHM clinical enterprise systems. JHM Data includes any reports or data, and any individual-level research results, stored in JHM Participating Organizations' clinical, billing, administrative, and financial records and systems. JHM Data also includes data streams collected by the Participating Organizations for clinical purposes from patient monitoring devices, whether implanted, wearable, or mobile application-based, and without regard to whether such data streams are identifiable to specific patients.

Access to JHM data is not exclusive to a specific researcher or team of researchers. JHM data may be requested by JHU researchers with an approved IRB protocol.

Johns Hopkins University owns research data that is collected in systems other than clinical enterprise systems. Researchers are the stewards of these data and use them to complete research within their purview as faculty members. JHU data is not required to be broadly accessible, but there are opportunities for collaboration and scientific advancement that may be realized by sharing data with other JHU researchers. Consent from research participants may be required to enable such sharing.

Projects Best Suited for PMAP

Relatively small and uncomplicated analysis projects should use SAFE for data storage and analysis. For more complex projects, PMAP is designed to handle large-scale clinical data collection and analysis. Projects that require high-performance CPU/GPU resources need additional configuration and may incur additional costs. Contact the PMAP team for more information.

The following types of projects may be appropriate for PMAP:

  • Projects willing to contribute to and benefit from collaborative annotations
  • Projects involving the creation of a new registry (see definition below), particularly those using data from multiple sources (Requires an eForm R)
  • Projects that intend to migrate an existing registry to PMAP (Requires conversion of the project protocol to a new eForm R)
  • Projects involving large-scale data exploration (May require an eForm A or eForm S depending on the nature of the study – see JHM IRB Forms for information)

Different IRB eForms may be used depending on the project characteristics.

A registry is an organized system that uses observational study methods to collect granular data from a population defined by a particular disease, diagnosis, condition, exposure or experience, and that serves one or more predetermined scientific or clinical purposes. The eForm R was created specifically for describing proposals to create a registry or other research resource. We recommend that Precision Medicine Centers of Excellence (PMCOEs) submit an eForm R umbrella protocol to establish a PMAP resource relevant to their area of study. Departments and other research groups involved with the creation of a resource to enable future research should also utilize this approach. Subsequent protocols proposing to use that resource to answer specific research questions should use one of the other eForm types.

If you are unsure as to whether or not a project is appropriate for PMAP, visit the PMAP Portal for more information or to request a consult.

Researchers intending to utilize PMAP for a new or existing project are also encouraged to contact the IRB office at [email protected] for advice on appropriate submission type/protocol forms in advance of IRB submission.

When to include PMAP staff as members of the study team

PMAP staff should be included as members of the study team in eIRB if they are making substantial contributions to the work and meeting International Committee of Medical Journal Editors criteria for authors. Only those PMAP staff who are considered to be engaged in human subjects research (interacting with participants or their identifiable data for research purposes) should be listed as study team members in the eIRB application. Providing a data projection for a study does not, in general, meet the level of engagement required to be a member of the study team. Discuss any plans to include PMAP staff on the study team with the staff in question.

Describing Data Needs in your IRB Application

As with any research project, it is important to plan the resources that will be used. Because of the complex regulatory environment around health data, this is particularly important for PMAP projects. Risk can arise from use of clinical data, but can also be increased by new combinations, e.g. joining health record data to data gathered from personal electronic devices.

For each source of data that will be used, reference the source from which it will be obtained and the location where it will be stored during each phase of the project, from acquisition to long-term retention. Linking tables that can be stored in highly secure locations can facilitate de-identification of larger data aggregations; the de-identification process should generally be reviewed and/or overseen by the CCDA.
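The role of a linking table described above can be illustrated with a minimal Python sketch. It is not the CCDA's or PMAP's actual process: the field name `mrn`, the `study_id` scheme, and both function names are hypothetical, and the sketch replaces only one direct identifier. A real de-identification process covers far more (dates, free text, rare values) and should still be reviewed as described above.

```python
import secrets


def build_linking_table(identifiers):
    """Map each real identifier to a random study ID (hypothetical scheme).

    The returned table is the re-identification key, so it is the part
    that would be kept in a highly secure location, apart from the data.
    """
    table = {}
    used = set()
    for ident in identifiers:
        if ident in table:
            continue  # one study ID per person, however many records
        study_id = secrets.token_hex(8)
        while study_id in used:  # guard against an unlikely collision
            study_id = secrets.token_hex(8)
        used.add(study_id)
        table[ident] = study_id
    return table


def replace_identifier(records, linking_table, id_field="mrn"):
    """Return copies of the records with id_field swapped for a study ID."""
    cleaned = []
    for record in records:
        row = {k: v for k, v in record.items() if k != id_field}
        row["study_id"] = linking_table[record[id_field]]
        cleaned.append(row)
    return cleaned
```

Because the study IDs are random rather than derived from the identifier, records in the larger aggregation can only be linked back to a person by someone who holds the linking table.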

If any data will be imported to PMAP, the source of those data and the identifiers used to link records will need to be specified.

Data from Clinical Sources

To simplify requests for and review of planned use of clinical data in your research projects, standard data sets (medications, lab results, inpatient and outpatient encounters, recorded diagnoses, and surgical cases) can be selected from the PMAP data catalog. These standard datasets are described in detail in the data catalog.

When completing your IRB application, refer to the standard datasets and identify which datasets will be included in your research, then add any other data required for your study that is not included in those datasets. For example, a study of HIV complications might request the standard labs with the addition of HIV lab results. Specific justification may be needed, particularly if the request includes sensitive data. HIV-, mental health-, and substance use-related elements are considered sensitive data as they can have special protections under HIPAA and other applicable laws. See the section on Sensitive Data for more information.

Note that research uses of clinical data require IRB approval. If a PMAP data projection may be used for both quality improvement and research, the group requesting the projection must seek IRB approval for the development of the projection as a research resource.

Data from Existing Studies

Researchers may wish to combine clinical data with data collected in the course of research for new research purposes. Whether previously collected research data can be used in another study depends on the original IRB approval and is constrained by the consent under which those data were obtained.

For data collected under prior research studies, the IRB will need to evaluate the language of the original consent forms related to future sharing to assess whether the proposed activity is permissible.

IRB approval must be obtained for these new uses, and Data Trust review may also be required, for example if the new use of such combined data presents new institutional risks.

Sensitive Data

Sensitive data, viz. psychotherapy notes, substance abuse data, and HIV data, have special legal protections, and as such, are excluded from standard PMAP projections by default. The following data are classified as sensitive data:

  1. Notes from a mental health professional or substance abuse professional
  2. Mental health, substance abuse, or HIV diagnoses
  3. Mental health, substance abuse, or HIV medication orders (i.e. medications used exclusively for one of the following: mental health disorders, substance abuse disorders, HIV)
  4. HIV test results

You may include sensitive data in your PMAP data projection with appropriate scientific justification articulated in the IRB-approved study protocol or eForm. Note that sensitive data must be explicitly described in the protocol and explicitly requested in your data request. In addition, it is important to be aware that what is considered sensitive will evolve over time as new drugs are developed and new uses arise for existing drugs and tests. The PMAP team will continuously restrict data projections in alignment with the current version of the definition of sensitive data.
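The exclude-by-default rule described above can be sketched in Python as follows. This is an illustration only, not how PMAP builds projections: the category labels, the `category` field, and the `filter_projection` function are all hypothetical, chosen to mirror the four classes of sensitive data listed above.

```python
# Hypothetical labels mirroring the four classes of sensitive data above.
SENSITIVE_CATEGORIES = frozenset({
    "sensitive_note",              # mental health / substance abuse notes
    "sensitive_diagnosis",         # mental health, substance abuse, HIV
    "sensitive_medication_order",  # drugs used exclusively for those
    "hiv_test_result",
})


def filter_projection(rows, approved_sensitive=()):
    """Drop sensitive rows unless their category was explicitly approved.

    approved_sensitive stands in for the categories justified in the
    IRB-approved protocol and named in the data request.
    """
    approved = frozenset(approved_sensitive)
    unknown = approved - SENSITIVE_CATEGORIES
    if unknown:
        raise ValueError(f"not a sensitive category: {sorted(unknown)}")
    blocked = SENSITIVE_CATEGORIES - approved
    return [row for row in rows if row["category"] not in blocked]
```

With no approvals the filter removes every sensitive category, so a request has to name each category it needs. Keeping the category set in one place also reflects the point above that the definition changes over time.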

Describing Data Management in your IRB Application

When completing your eIRB forms and the associated Risk Tier Calculator (located in Section 36 of the eIRB application), indicate that the protocol will use PMAP and the associated SAFE environment/storage. Describe any other environments that will be used and indicate how they relate to one another. If data will be shared outside of the institution, provide details on the de-identification process if relevant, who will perform de-identification (using the CCDA is the best practice), and the technical mechanisms for ensuring security in transit. Note any best practices the project will follow. If your PMAP data projection will receive daily updates from the EHR, note that in your IRB application and eForm.

Keeping patient-level data within PMAP is a best practice. Include the following text in your protocol if applicable:

Patient-level data will not leave the PMAP environment. Only aggregate data, such as frequency tables, will leave the PMAP environment.

Note that any movement of data into or from PMAP will require special consideration of the status of consent for the use and sharing of the data. For example, previously collected data may have restrictions on future use and sharing that must be considered as part of the review process for any proposed future use.
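As a small illustration of the aggregate-only practice described in this section, the following Python sketch reduces patient-level rows to a frequency table. The row layout and the `frequency_table` function are hypothetical and do not represent a PMAP export mechanism.

```python
from collections import Counter


def frequency_table(rows, field):
    """Count how often each value of `field` occurs across the rows.

    The result holds only value -> count pairs; no row-level data or
    identifiers from the input are carried into the output.
    """
    return dict(Counter(row[field] for row in rows))
```

For example, calling `frequency_table(rows, "diagnosis")` on patient-level rows yields one count per diagnosis value, which is the kind of aggregate output the template text refers to.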

General Consent Considerations

All new studies planning to use PMAP should include the following language in the participant consent in the section titled “What happens to data/biospecimens that are collected during this study?”:

Your data from this research project may be added to a Johns Hopkins Medicine data repository where it may be combined with data from your clinical care or data from other research studies. This secure repository is designed to facilitate future quality improvement and research projects. Any future use will be subject to appropriate oversight and review.

This language enables data to be added to PMAP for use by JHM researchers. Any subsequent studies using data within PMAP will still require separate IRB approval.

Migrating an existing registry

If you are migrating an existing study to PMAP, the access to and sharing of data for that study must be aligned with what is described in the consent. If data sharing or contribution to a larger registry was not anticipated in the consent, then data may not be shared with other study teams. If the consent expressed limitations on how the data would be shared, for example, that it may be shared with other members of the Principal Investigator’s department, those limitations must be reflected in how the data is made available in PMAP. The IRB will review any potential limitations on future use/sharing of existing research data when considering your plans to migrate existing research data to PMAP. Any limitations will be communicated in the IRB approval letter and corresponding restrictions will be added to the data in PMAP, when migrated.

Use Case Specific Considerations

Delivery System Tool Development: Patient/Provider facing applications

In many cases, PMAP will be used to develop a novel tool, such as a predictive model or machine learning algorithm to detect a specific condition. There may be plans to use these tools for quality improvement or broader clinical purposes. The initial work to develop and evaluate the tool should be undertaken as research, as the tool may have general utility and contribute to generalizable knowledge. Once the tool has been developed through a research protocol, it may be piloted for quality improvement purposes. Improvements to the tool once it is in production may be considered quality improvement. The IRB should be consulted for any determinations as to whether a project qualifies as research or quality improvement.

Note that any transition of a PMAP project from research to operations requires an initial pilot phase, typically involving a single clinic or ward. Additional reviews and coordination are necessary before a project may be rolled out in operations. To learn more, contact Patrick Ostendarp and Aalok Shah and request a consult with the Technology Innovation Center.

Machine Learning/Predictive Modeling

Machine learning and predictive modeling are algorithmic techniques to model data and predict outcomes. These techniques are used to look across broad datasets to discover patterns. When doing machine learning or predictive modeling, request broad categories of data, such as the standard lab panel, and explicitly request any sensitive data should that be scientifically justified for your study. In general, a limited data set should be sufficient for machine learning and predictive modeling research.

PMAP offers different resources to support advanced analytic techniques with varying compute resource needs. Visit the PMAP portal to learn more about the different resources available and which is best suited for your study.

If you plan to use an algorithm created outside of JHM, work with the Office of Research Administration and the Data Trust to ensure appropriate legal agreements are in place.

Related information

Data Trust Research Data Subcouncil
Contains information on what research projects require Data Trust review, details about the subcouncil’s review process, and other guidance for researchers.

PMAP Cookbook
A textbook of recipes for analysis on the PMAP platform. The “recipes” are a navigable collection of computational Jupyter JSON notebooks that cover analysis of EMR data, medical imaging, physiological monitoring waveforms, and genomic sequencing.

IRB Policies
A list of guidelines and policies relevant to human subjects research.

PMAP Data Catalog
Contains information about the data that are available in the Precision Medicine Analytics Platform. The Data Catalog does not show the actual patient data. Rather, it provides information about the available data to guide subsequent requests for that data.

PMAP Portal
Information about the Precision Medicine Analytics Platform, describing the advancements that PMAP makes possible, the data available through PMAP, and how it works.

Instructional Information

Instructional Templates

These templates have been created to assist researchers who may be using PMAP in filling out protocol forms for IRB submission.

  • eForm R Template: This instructional template provides guidance and template language that can be used to develop an eForm R protocol for projects that will involve creating a new research resource in the Precision Medicine Analytics Platform (PMAP). The eForm R template should be downloaded from the Forms page of the IRB website. This tool serves as a guide for completing that form.
  • eForm S Template: This instructional template provides guidance and template language that can be used to develop an eForm S protocol for projects that will involve analysis of a data projection from an existing resource protocol (eForm R). The eForm S template should be downloaded from the Forms page of the IRB website. This tool serves as a guide for completing that form.