The CNRS's ambitious plan for accessible and reusable data


Following the CNRS's 'Roadmap for Open Science' published in 2019, the organisation's 'Research Data' plan encourages researchers to make their data accessible and reusable. Alain Schuhl, the CNRS Deputy CEO for Science, gives details of the new plan.

Why is opening up research data so important?
Alain Schuhl:
Making the data linked to a scientific publication available is essential for understanding, reproducing and validating scientific results. When such data are shared, other teams can re-use them instead of generating them again, which saves time and makes better use of research funds. Above all, new knowledge can emerge from the cross-fertilisation of data from very diverse communities, as long as these data are disseminated with high levels of quality and contextualisation. This is why it's important to make data 'FAIR' – Findable, Accessible, Interoperable and Reusable – right from the design stage of a research project.

So the CNRS has just implemented its 'Research Data' plan. What are its ambitions?
This plan and the initiatives it proposes deal with data that should, as the European Community expression goes, "be open as much as possible, closed as much as necessary". The term covers raw or reprocessed data in all formats, texts and documents, software, algorithms, protocols and so on. The CNRS is one of the major European producers of research data, particularly through its involvement in very large instruments, observation systems and data infrastructures. The emergence of new technologies, increasing automation and, for example, the new analytical possibilities offered by artificial intelligence all mean that the volume and diversity of research data will increase significantly in the near future. This new plan is therefore a response to the current need to speed up the shift towards open science and make sure these data are preserved and reused.

However, the CNRS covers all disciplinary fields, and these are at different stages of maturity when it comes to opening data. So the main ideas are to work with each community to disseminate best practices, to promote the existing services and tools created by more advanced communities like astrophysics, particle physics and the humanities and social sciences, and to support the creation of new practices, services and tools that meet the requirements of communities that are less advanced in this area.

How was this plan thought out?
The plan is based on the thought process that led our Computing–Data Mission (MICADO) to publish a white paper on data at the CNRS in January 2018. A detailed analysis carried out in the CNRS's ten Institutes concluded it was important to promote a genuine 'data culture', develop a strong CNRS strategy in response to our communities' need for large-scale data analysis platforms, and implement a policy for managing, exploiting and perpetuating data.

In July 2018 the Ministry of Higher Education, Research and Innovation (Mesri) published its National Plan for Open Science, which aims to make "scientific research results open to all" with no obstacles or delay and free of charge. Following this, in November 2019 the CNRS adopted its Roadmap for Open Science, which included a 'Research Data' chapter. With this dedicated plan, we are now getting down to essentials.

What are the proposed actions in practical terms?
We want to develop a clear proactive strategy and policy. Our Research Data Plan is above all driven by researchers' needs and also takes the full diversity of disciplinary contexts into account. Sometimes a real shift in culture is required so we need to change practices and attitudes while developing tools for research data management, sharing, long-term preservation and dissemination that comply with the FAIR principles.

To achieve this, we are encouraging scientists to deposit their data in open access in trusted repositories. The CNRS will keep a list of these repositories to help drive their certification. It should be possible to set up proprietary or embargo periods according to the discipline involved. We will also encourage our communities to re-use the data available in these repositories.

The CNRS will also set up a coordinated response to new requirements in terms of expertise, training, human resources and recognition – particularly in research assessment – and finally new transdisciplinary activities that support the FAIRisation and sharing of data. In particular, we have refocused the activities of the Institute for Scientific and Technical Information (Inist) on these issues, the aim being to make the Inist an essential pillar of the CNRS's open research data policy. This dedicated CNRS support and research unit is already supporting laboratories in developing the data management plans required for European contracts and helping communities structure their data and make them accessible.

The CNRS is also going to ramp up its involvement in national, European and international forums for the discussion of open science, computing and research data policies like the Research Data Alliance and the European Open Science Cloud.

The plan also implements new governance for research data at the CNRS.
Yes, to have a complete overview of open science as a whole, we're setting up a new functional Open Research Data Department (DDOR) attached to the CNRS Scientific Office. The DDOR will be tasked with proposing and supporting the implementation of open data policy and strategy at the CNRS. It was created from the merger of the former Scientific and Technical Information Department and the MICADO and covers the full continuum on the subject – from computing to scientific and technical information. As data covers all the issues to be dealt with, we chose to call it the Open Research Data Department as a reminder of our commitment to open science. This choice is somewhat avant-garde as we consider publications themselves to be research data.

Specific questions on the differentiation between open data and those that need to be protected will be dealt with by a unit made up of DDOR management, the CNRS Defence and Security Officer and representatives of the CNRS's Innovation Office, Data Protection Office and Security Department.

Finally, we'll need to align the CNRS's research data strategy and policy with those of its partners and of the Mesri. We will shortly appoint a data administrator to represent the CNRS in the network the Ministry is currently setting up.