Many funding agencies, including the NIH, as well as publishers, now require investigators to state how they plan to make their data available for use by others, including the larger research community and the broader public. In many cases, this is done by submitting data to an established repository and having a persistent identifier such as a digital object identifier (DOI) assigned to the data. Refer to the guidance below and submit a consultation request if you’d like additional assistance.
Data curation for sharing
Sharing data involves more than making files available, so preparing the data is an important part of the process. It’s also important to compile metadata, or data about the data, as well as any other supporting information needed to understand the data (e.g., data dictionaries, study protocols, data collection forms, README files). The NIH and other funders stress the importance of sharing data according to the FAIR Principles (Findable, Accessible, Interoperable, and Reusable) to maximize the benefits of sharing. Many repositories have requirements and guidelines around metadata and associated data documentation to help you share your data with the FAIR Principles in mind.
To learn more about how to implement the FAIR Principles, please visit the GO FAIR website.
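As an illustration of the supporting documentation described above, the following is a minimal sketch of how a starter data dictionary might be generated from a tabular file. It is not a repository requirement or a Becker Library tool: the function names (`infer_type`, `build_data_dictionary`), the output fields, and the sample data are all hypothetical, and the type inference is deliberately simple. The description and units fields are left blank because only the investigator can supply them.

```python
import csv
import io


def infer_type(cells):
    """Guess a simple type label from the non-missing values of one column."""
    if not cells:
        return "unknown"
    try:
        for cell in cells:
            int(cell)
        return "integer"
    except ValueError:
        pass
    try:
        for cell in cells:
            float(cell)
        return "number"
    except ValueError:
        return "text"


def build_data_dictionary(csv_text):
    """Return one starter data-dictionary row per column in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []

    # Collect the non-missing values for each column.
    values = {name: [] for name in columns}
    for row in reader:
        for name in columns:
            cell = (row.get(name) or "").strip()
            if cell:
                values[name].append(cell)

    dictionary = []
    for name in columns:
        dictionary.append({
            "variable": name,
            "inferred_type": infer_type(values[name]),
            "non_missing": len(values[name]),
            "example": values[name][0] if values[name] else "",
            "description": "",  # to be completed by the investigator
            "units": "",        # to be completed by the investigator
        })
    return dictionary


# Hypothetical sample data, used only to demonstrate the function.
SAMPLE = (
    "participant_id,age,weight_kg,site\n"
    "P001,34,70.5,St. Louis\n"
    "P002,,82,Chicago\n"
)

if __name__ == "__main__":
    for entry in build_data_dictionary(SAMPLE):
        print(entry)
```

A draft produced this way is only a starting point. Check each variable against the study protocol and the target repository's metadata guidelines before deposit.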
Licenses for data
It’s best to choose a license for your dataset at the time of sharing to ensure you are following the FAIR Principles. The license will stipulate how others may use your data and indicate any requirements for attribution. Some domain-specific repositories will require a particular license, so refer to the repository documentation when selecting a license. Creative Commons licenses are commonly used, and there is a tool to help you select the license that is right for you when a specific license is not required.
Selecting a data repository
Determining the best repository for sharing your data can be challenging, but there are resources to help you get started. For example, when choosing a repository to manage and share data resulting from federally funded research, consult the Desirable Characteristics of Data Repositories on the NIH sharing site, which is based on the National Science and Technology Council (NSTC) report. The following repositories meet these characteristics and are recommended for storing your data to satisfy funder policies and publishers’ requirements for data availability statements.
NIH-supported domain-specific repositories
The NIH offers several domain-specific repositories that are associated with specific Institutes, Centers, and Offices (ICOs). Submitting data to an NIH domain-specific repository may be a requirement for certain ICOs or funding opportunity announcements. To find a repository designated by an NIH Institute or Center (IC)-specific sharing policy, consult the NIH’s listing of those policies. Check with your program officer to see if you will be required to submit your data to an NIH domain-specific repository.
Additional domain-specific repositories
If there is no NIH domain-specific repository that works for your data, you can search for additional domain-specific repositories on the Registry of Research Data Repositories (re3data.org). Browse by subject and click the descriptive words that match your data. Re3data will provide a list of repositories that may work for your data. Scientific Data also provides a variety of tables listing domain-specific repositories in the biological sciences, health sciences and more.
If no domain-specific repository exists, investigators have the option to deposit data in the School of Medicine’s institutional repository, Digital Commons@Becker.
Submit the Data Management and Sharing Consultation Form to start the process.
Human participant data considerations
Before sharing data from human participants, it’s important to make sure sharing is allowed based on the informed consent form that was used to enroll participants in the study. Contact the Human Research Protection Office (HRPO) to verify that you are allowed to share the data from your study and, if so, by which means. For additional information on reviewing data sharing language in informed consent forms, please review the Curation of Data Collected by Informed Consent Primer from the Data Curation Network.
Restrictions around data access are often associated with sharing data from human studies, and many repositories have restricted access capabilities to accommodate this situation. In some cases, a Data Use Agreement is needed. To begin the process, complete the Data Use Agreement Intake Form from the Joint Research Office for Contracts within the Office of the Vice Chancellor for Research.
De-identification is another important aspect of sharing human participant data. Refer to our Human Participant Data Consideration page to help you get started.
Where to submit genomic data
- Submitting Human Genomic Data
- Examples of Frequently Used Repositories for Human Genomic Data
- Submitting Non-Human Genomic Data
- Examples of Frequently Used Repositories for Non-Human Genomic Data
How to register and submit a study in dbGaP
All large-scale human genomic studies funded by NIH must register in dbGaP, an NIH repository for human genomic data. This page provides step-by-step instructions for registering an NIH-funded study in dbGaP.
Investigators working with large-scale human genomic data are required to submit an Institutional Certification to NIH. Learn about this important document and how to prepare it.
- About Institutional Certifications
- Completing an Institutional Certification Form: step-by-step instructions for completing the form.
The Institute of Clinical and Translational Sciences (ICTS) at Washington University also offers guidance and resources for genomic data sharing on the ICTS Precision Health website. Resources include guidance on consenting participants for genomic data sharing as well as databases where you can share genomic data.
The Case for Open Data
For more information on data sharing, check out the slides from Becker Library’s Data Sharing and FAIRness workshop.