Research Data Management

Data has always been central to research. Because research data is a valuable asset, data management is high on the agenda of governments, funders, and publishers in the context of Open Science. Good practices for managing research data have been developed to make data Findable, Accessible, Interoperable, and Reusable (FAIR).

Technological progress allows researchers to produce more data than ever before, but it also provides tools for managing that data. Research Data Management (RDM) includes planning, collecting, processing, storing, accessing, and reusing data. Put simply, RDM is about sound data management practices before, during, and after research.

Research data is the evidence that supports research. Examples include (but are not limited to): documents, spreadsheets, questionnaires, transcripts, codebooks, models, algorithms, scripts, software applications, photographs, films, audiotapes, videotapes, video, audio, text, and image content, protocols, diaries, field notebooks, eLANs, specimens, samples, and artifacts.

Research data can be generated or reused through different processes, so its sources may include: observations (survey data), experiments (gene sequences, social experiments), simulations (economic models), compilations (databases, text mining), referencing (datasets, databanks, papers), etc.

Research data comes in many forms: text (RTF, PDF, XML), numerical (Excel, SPSS, Stata), multimedia (JPEG, MPG), software (C, Java), etc.

Thinking about research data begins before the research starts, at the moment you decide what data will be generated and/or what data (if any) will be reused. That is when RDM starts as well.

You should care! Data management practices lead to better quality and integrity of research, greater research impact, and greater visibility and reuse of research data, all of which are at the heart of Open Science. RDM is also an essential part of good research practice, which benefits researchers, their research institutes, and the wider research community, as it:

  • Prevents data loss and data corruption;
  • Makes the research process run smoother;
  • Makes it clear which versions are the most up to date;
  • Ensures that data can be found and (re)used later;
  • Facilitates the sharing and reuse of data by future researchers to develop new research;
  • Helps identify which services are needed for research (e.g. ethical and/or legal advice);
  • Prevents fraud or bad science.
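
As one illustration of the first point in the list above, silent data corruption or loss can be detected by recording a checksum for every file and comparing the records later. The following is a minimal sketch in Python, not a prescribed RDM tool: the function names (sha256_of, build_manifest, changed_files) and the idea of a "manifest" dictionary are assumptions made for this example, and only the standard hashlib and pathlib modules are used.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(data_dir: Path) -> dict[str, str]:
    """Map each file's path (relative to data_dir) to its checksum."""
    return {
        p.relative_to(data_dir).as_posix(): sha256_of(p)
        for p in sorted(data_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """List files from the old manifest that are now missing or altered."""
    return sorted(
        name for name, checksum in old.items() if new.get(name) != checksum
    )
```

A possible workflow, under the same assumptions: call build_manifest on the data folder when the data is first stored, save the result alongside the data, and later pass the saved manifest and a freshly built one to changed_files. An empty list suggests that the recorded files are intact; any names returned point to files that were modified, corrupted, or deleted.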