Research Data Management (RDM)

Handling data is essential to the scientific research process. Research generates, processes, and uses data in many forms. As data volumes grow and digitalization opens up new opportunities for reuse, managing research data is becoming increasingly important.

What are research data?

Research data are any data generated in the course of scientific research. Because scientific methods, disciplines, and research interests are so diverse, research data come in many different types, most of which are now digital. In research, data are created, processed, and used in forms such as measurement data, laboratory values, survey data, interviews, texts, and audiovisual documents. Testing procedures, such as simulations or questionnaires, can also be counted as research data.

Publish your research data!

HAW Hamburg has established REPOSIT (https://reposit.haw-hamburg.de/), an institutional repository for the publication and long-term storage of research data. All datasets in REPOSIT are assigned persistent identifiers (DOI, Handle) and enriched with metadata. When publishing your data, you should regulate its reuse by assigning a license. In REPOSIT, you can choose between various licenses such as Creative Commons or the GNU General Public License. Because data publication is currently in beta mode, it must be activated individually for each user; to request activation, contact us at hibs.oa (at) haw-hamburg (dot) de. The maximum upload size is generally 512 MB per file, but larger files can be accommodated upon request.
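Before uploading, it can help to check which files in a dataset exceed the per-file limit mentioned above. The following is a minimal sketch, not a REPOSIT tool: the function names are hypothetical, and it assumes that "512 MB" means 512 × 1024 × 1024 bytes, which the repository may count differently.

```python
from pathlib import Path

# Assumption: "512 MB" is interpreted as 512 * 1024 * 1024 bytes.
# The repository itself may apply a different definition.
MAX_UPLOAD_BYTES = 512 * 1024 * 1024


def scan_directory(directory):
    """Map every regular file below `directory` to its size in bytes."""
    root = Path(directory)
    return {
        str(path.relative_to(root)): path.stat().st_size
        for path in root.rglob("*")
        if path.is_file()
    }


def oversized_files(sizes, limit=MAX_UPLOAD_BYTES):
    """Return the sorted names of files whose size exceeds `limit`."""
    return sorted(name for name, size in sizes.items() if size > limit)
```

Calling `oversized_files(scan_directory("my_dataset"))` (with `my_dataset` as a placeholder folder name) would list the files for which a larger upload would have to be requested.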

Discipline-specific repositories are also suitable for data publication, particularly for the direct exchange of data within a scientific community. The Registry of Research Data Repositories (re3data.org) lists a comprehensive selection of recognized repositories, such as Pangaea for Earth & Environmental Science, and categorizes them by discipline, license, access policy, and many other criteria.

Additionally, several large generic repositories have been established worldwide, open to researchers from all disciplines. The best known include Harvard Dataverse, Zenodo (operated by CERN), the European platform B2Share (founded as part of an EU Horizon 2020 project), OSF Home as part of the comprehensive Open Science Framework, and Dryad. At no cost, these repositories offer a choice of free licenses, persistent identifiers such as DOIs for published data, metadata descriptions, data versioning, storage capacities exceeding 50 GB in some cases, and many other features.

Why Research Data Management Matters

The goal of research data management is to develop methods, procedures, and strategies that ensure research data are handled systematically and remain usable in the long term. Research data management supports your research throughout the data lifecycle: from planning, creating, and preparing data, through data analysis, to archiving, securing, publishing, sharing, and reusing data.

Research data management supports:

  • The fundamental organization of the research process: Proactive planning of data handling in research projects is an essential part of scientific practice.
  • Compliance with funders’ requirements: Many funding agencies consider robust research data management a prerequisite for funding.
  • Data reusability: Metadata and licenses enable other researchers to reuse data.
  • Transparency and traceability of research processes: Documenting data and analysis steps ensures research traceability and forms the foundation for reproducibility of results.
  • Increasing the visibility of research and creating new collaboration opportunities: Published data, in particular, can enhance the visibility of your research projects and initiate collaborations.
  • Citability of data: By publishing your data as standalone publications or supplements and assigning persistent identifiers like DOIs, your research data become citable and can be referenced permanently.

FAIR – Shaping sustainable research!

With the four fundamental FAIR Data Principles, FORCE11, an international coalition of individuals from scientific research, libraries, archives, publishers, and research funding, has formulated key requirements that research data must meet to be usable sustainably. The German Research Foundation (DFG) also explicitly points out in its "Guidelines for Safeguarding Good Research Practice" that access to research data should comply with the FAIR principles. These four requirements for research data are:

  • Findable: Persistent and globally unique identifiers (e.g., DOI) and extensive metadata ensure optimal findability and citability of research data.
  • Accessible: Research data and metadata should be easy to retrieve via a standardized, open, and free communication protocol.
  • Interoperable: To link research data in a machine-readable way over the long term, data must be comparable, and metadata should be based on controlled vocabularies, classifications, etc.
  • Reusable: Research data and metadata should be comprehensively described and documented and released under a clear license so that they can be reused legally.

The complete FAIR principles were published in Scientific Data in 2016.
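As a rough illustration, the four requirements can be turned into a simple completeness check for a dataset's metadata record. This is only a sketch under stated assumptions: the field names and their grouping are hypothetical and do not follow a specific schema such as DataCite or Dublin Core, the DOI is a placeholder, and passing the check does not make a dataset FAIR.

```python
# Hypothetical field names grouped by FAIR principle (illustrative only;
# real repositories define their own metadata schemas).
REQUIRED_FIELDS = {
    "findable": ["identifier", "title"],
    "accessible": ["access_url"],
    "interoperable": ["format", "subject_vocabulary"],
    "reusable": ["license", "description"],
}


def missing_fair_fields(record):
    """Return {principle: [field, ...]} for fields that are absent or empty."""
    gaps = {}
    for principle, fields in REQUIRED_FIELDS.items():
        missing = [field for field in fields if not record.get(field)]
        if missing:
            gaps[principle] = missing
    return gaps


# Example record with placeholder values; the license has not been chosen yet.
record = {
    "identifier": "doi:10.1234/example",  # placeholder, not a real DOI
    "title": "Example measurement series",
    "access_url": "https://example.org/dataset/1",
    "format": "text/csv",
    "subject_vocabulary": "Dewey Decimal Classification",
    "description": "Hourly temperature readings from a test bench.",
    "license": "",
}

print(missing_fair_fields(record))  # {'reusable': ['license']}
```

A check like this only catches missing information; whether the identifier is persistent or the vocabulary is actually controlled still has to be verified by a person or by the repository.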

Data Management Plans

Data Management Plans (DMPs) are an important tool in research data management. They document how research data are handled within a research project and should ideally be created before the actual research process begins. Because funding agencies such as the DFG and the EU set mandatory requirements for managing research data, DMPs are becoming increasingly common.

There are many templates and questionnaires available for creating DMPs, which guide researchers through the relevant topics of data management in a structured way. Typical questions include:

  • What data will be collected and used?
  • Which software will be used for data collection or generation?
  • Where will the data be stored during the research process?
  • How will the data be protected?
  • In which formats (e.g., CSV or PDF) will the data be archived?
  • What metadata will be assigned to make your data identifiable?
  • How will the legal requirements for protecting personal data be met?
  • Where and for what duration will the data be archived?
  • How will reuse be regulated, and what licenses are planned?
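The questions above can also be kept as a simple machine-readable checklist, so that open points in a draft plan are easy to spot. The following is a minimal sketch and not an official DMP template: the keys and the function name are hypothetical, and the questions are copied from the list above.

```python
# Hypothetical keys mapped to the typical DMP questions listed above.
DMP_QUESTIONS = {
    "data": "What data will be collected and used?",
    "software": "Which software will be used for data collection or generation?",
    "storage": "Where will the data be stored during the research process?",
    "protection": "How will the data be protected?",
    "formats": "In which formats will the data be archived?",
    "metadata": "What metadata will be assigned to make your data identifiable?",
    "personal_data": "How will the legal requirements for protecting personal data be met?",
    "archiving": "Where and for what duration will the data be archived?",
    "licenses": "How will reuse be regulated, and what licenses are planned?",
}


def open_questions(answers):
    """Return the questions that have no answer or only a blank answer."""
    return [
        question
        for key, question in DMP_QUESTIONS.items()
        if not str(answers.get(key, "")).strip()
    ]


# Example draft with two answers filled in (illustrative values only).
draft = {
    "data": "Sensor measurements and survey responses",
    "formats": "CSV for tables, PDF/A for documents",
}

for question in open_questions(draft):
    print("Still open:", question)
```

Keeping the plan in a structured form like this makes it straightforward to revisit it as the project evolves, although funders usually expect their own templates to be used for the final document.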

Assistance and templates for Data Management Plans