The VBN Team

Open Data

Open Science also includes open, or partly open, access to research data. Sharing your data can help maximize its potential, e.g. through other researchers’ use, and can also strengthen your profile.

Read more about how to register data sets in VBN/Pure.

Opportunity or requirement

The demand or wish for open access to your research data, or parts of it, can come from many different quarters. It may be required by the funder, e.g. the Open Data Pilot in Horizon 2020, or the publisher may require that you publish the underlying data. Note that in the latter case there may be special requirements regarding platforms (repositories).

 

Get the contractual basis in place

If you are working with others on a project, whether they are fellow AAU staff, collaboration partners from other universities, or commercial companies, the decision regarding publication of data should be made as early as possible to avoid any disputes. This can be done by contract, or by drawing up, and agreeing upon, a data management plan.

Be clear about who is responsible for publishing the data. This includes keeping the contact information on datasets up to date, in case someone would like to know more about the accessible data.

 

Choice of platform

You can choose between different types of platforms when you are going to publish your data. In your choice of platform, you should consider whether there are requirements that may dictate your choice, and how you can secure the best possible reuse of your data.

Furthermore, you should pay attention to whether you hand over some of your rights to your data in connection with your choice of repository. Your data may be placed behind a paywall, thus preventing direct access.

In general, you can choose between the following:

  • Specialized repositories:
    Some academic societies, or other groups, run specialized repositories for specific areas of research. This offers the advantage of grouping datasets logically within the individual sciences.
  • Generic repositories:
    Some repositories have no requirements regarding types of dataset, but handle datasets across disciplines.
  • Institutional repositories:
    Some universities have their own repository for data that often works on the basis of the same principles as the generic repositories.
  • Publisher repositories:
    Some publishing houses have their own repositories, and in some cases demand that you use these when you publish an article submitted to, and accepted by, them.

Remember that putting all of your data in the same repository does not necessarily secure maximum use and potential. Different datasets may go in different repositories, and some may even be split up or combined in order to fit into specific repositories.

You can get a good overview of these repositories at re3data.org.

 

Choice of license

If a dataset is made publicly available, it is automatically protected by copyright with regard to other people’s use of the data. It is therefore important that, when you make your data available to others, it is always accompanied by a license that states the conditions for further use. When you select a platform, you should always check which licenses it supports. Most of them offer Creative Commons, and in some cases other standards as well.

Note that your dataset may contain elements from other datasets that prevent you from publishing it. For example, you may be using pictures for analysis that are protected by copyright or by a license that does not permit sharing. In such cases, it may be possible to split your dataset and publish the parts you are permitted to publish.

 

FAIR thinking

The term FAIR is an abbreviation of Findable, Accessible, Interoperable and Reusable. It is not a standard for quality, but a set of principles for securing maximum potential for reuse of your data. In this connection, it should be noted that FAIR does not necessarily aim at free access to data.

Public use (manual search and evaluation of datasets) is not the only focus of FAIR. FAIR has an equally important focus on how computers, via algorithms etc., can retrieve, collect and use datasets without human intervention. FAIR has a particularly strong focus on securing uniformity and precision in data.

The FAIR principles are implemented differently depending on research practice. What is FAIR in one research group may not be FAIR in another.

Briefly, the principles concern:

Findable

  • Make sure that your data is retrievable by recording it in an acknowledged and searchable repository, that it is assigned a persistent identifier (e.g. DOI) for reference, and that you use metadata that ensures the best conditions for data retrieval.
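
As a small illustration of why a persistent identifier helps with reference, the sketch below turns differently written DOI strings into one resolver link of the form https://doi.org/<DOI>. The pattern used for checking is a simplified assumption, the function name doi_to_url is hypothetical, and the DOI in the usage note is made up for illustration.

```python
import re

# Simplified, assumed pattern: "10.", a 4-9 digit registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")


def doi_to_url(doi: str) -> str:
    """Return the https://doi.org/ resolver link for a DOI string."""
    cleaned = doi.strip()
    # Strip common prefixes so that different spellings map to one form.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    if not DOI_PATTERN.match(cleaned):
        raise ValueError(f"not a recognisable DOI: {doi!r}")
    return "https://doi.org/" + cleaned
```

For example, doi_to_url("doi:10.1234/example.5678") and doi_to_url("https://doi.org/10.1234/example.5678") both give the same link, which is the form you would typically put in a reference list or in dataset metadata.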

Accessible

  • Data must be accessible via standard protocols (e.g. the Internet protocol HTTP) and, if the data is protected, support standard login options. Accessibility may also concern correct and clearly stated contact information on how to get access to the data. Furthermore, if the data is to be deleted, there must be metadata stating that the data was previously accessible and the reasons for making it inaccessible.

Interoperable

  • Data must be interoperable, which means that it should be easy to understand your data and combine it with other datasets. This is why both data and metadata should use representation standards, e.g. shared and acknowledged vocabularies, taxonomies etc. that are clearly defined. This may also include a shared standard for naming, e.g. Au instead of the words "Gold", "Oro" etc., if you are working with chemical substances. Likewise, the use of measuring scales such as Celsius and Fahrenheit should be agreed upon and stated.
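
The naming and measuring-scale examples above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the vocabulary ELEMENT_SYMBOLS, the field names (substance, temperature, unit) and the choice of Celsius as the agreed scale are hypothetical, not a prescribed standard.

```python
# Hypothetical shared vocabulary: element names in several languages -> symbol.
ELEMENT_SYMBOLS = {
    "gold": "Au",
    "oro": "Au",
    "guld": "Au",
    "silver": "Ag",
    "plata": "Ag",
}


def to_symbol(name: str) -> str:
    """Map a free-text element name to its agreed symbol."""
    key = name.strip().lower()
    if key not in ELEMENT_SYMBOLS:
        raise KeyError(f"no agreed symbol for {name!r}")
    return ELEMENT_SYMBOLS[key]


def to_celsius(value: float, unit: str) -> float:
    """Convert a temperature to Celsius, the scale assumed to be agreed upon."""
    unit = unit.strip().lower()
    if unit in ("c", "celsius"):
        return float(value)
    if unit in ("f", "fahrenheit"):
        return (value - 32) * 5 / 9
    raise ValueError(f"unknown temperature unit: {unit!r}")


def normalize(record: dict) -> dict:
    """Return a record that uses the shared symbol and states its scale."""
    return {
        "substance": to_symbol(record["substance"]),
        "temperature_c": round(to_celsius(record["temperature"], record["unit"]), 2),
    }
```

With these definitions, a record such as {"substance": "Oro", "temperature": 212, "unit": "Fahrenheit"} is normalized to the symbol Au and a temperature of 100.0 in a field whose name states the scale, so it can be combined with records from groups that wrote "Gold" and measured in Celsius.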

Reusable

  • In order for data to be reused by others (both humans and machines), there must be sufficient information to assess the context in which the data was collected. Therefore, there must be a full standardized description of the data context that conforms with best practice within the particular academic field. To this must be added a clear indication of license terms and agreements for further use, a description of who is responsible for the data, and its origin.

It may take some time to adapt and implement the FAIR principles in your own research practice. Complying with the FAIR principles is not something you do when you finish your work, but something you do from the beginning. The use of standardized metadata etc. should be taken into consideration from the moment the data is generated to avoid a time-consuming conversion task towards the end of the project.

You can read more about the FAIR principles at FORCE11.
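
A simple way to keep track of the elements listed above is to check a metadata record for them before publishing. The sketch below is only an illustration: the field names in REQUIRED_FIELDS are assumptions chosen to mirror the text (description of context, license, responsible person, origin), and a real repository or metadata standard will define its own required fields.

```python
# Assumed field names mirroring the elements named in the text above.
REQUIRED_FIELDS = ("title", "context", "license", "responsible", "origin")


def missing_fields(metadata: dict) -> list:
    """Return the required fields that are absent or empty in a record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical record, for illustration only.
example = {
    "title": "Temperature measurements, lab series A",
    "context": "Collected with a calibrated probe during spring 2020.",
    "license": "",
    "responsible": "Jane Doe",
    "origin": "Own measurements",
}
```

Calling missing_fields(example) reports that the license is still missing. Running such a check from the moment the data is generated makes it less likely that information about context, responsibility or origin has to be reconstructed at the end of the project.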

Contact

If you have questions regarding a specific project, or would like to know more about the publication of datasets, please contact the VBN Team. Research data is handled by CLAAUDIA – Research Data Service, through which the library and IT Services cooperate as advisors.