The FAIR principles (Findable, Accessible, Interoperable, Reusable) describe how data should be organised so that they are easier to access, understand, exchange and reuse. They are mainly applied to research data, but they extend to any open access digital resource related to a scientific activity.
The growing availability of these online resources requires the platforms that host them to implement protocols and standards so that humans and machines can exploit them, now and in the future. For the HAL open archive and for Episciences, the CCSD has worked since their creation to ensure that publications and the metadata describing them fully comply with the guiding principles of open science.
The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for automatic discovery of datasets and services.
- each deposited file is described with rich metadata (bibliographic metadata, author affiliations, ANR or European project metadata)
- the metadata of each deposit are assigned a unique and persistent HAL identifier; the metadata of articles published in an Episciences journal are assigned a DOI
- metadata are indexed in a searchable resource
- the access is open and free
Standards and protocols: URI, Dublin Core, TEI, RDF, DataCite (for Episciences)
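To make the idea of machine-readable metadata concrete, here is a minimal sketch of a Dublin Core description serialised as XML with the Python standard library. The function name `build_dc_record`, the identifier `hal-00000000` and the bibliographic values are placeholders invented for illustration; they do not reproduce the actual records exposed by HAL or Episciences.

```python
import xml.etree.ElementTree as ET

# Namespace URIs of the Dublin Core element set and of the oai_dc container.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Ask ElementTree to use readable prefixes when serialising.
ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)


def build_dc_record(identifier, title, creators, date):
    """Return a Dublin Core description as an XML string (illustrative helper)."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")

    def add(tag, text):
        ET.SubElement(root, f"{{{DC_NS}}}{tag}").text = text

    add("identifier", identifier)   # persistent identifier of the deposit
    add("title", title)
    for creator in creators:        # one dc:creator element per author
        add("creator", creator)
    add("date", date)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    print(build_dc_record(
        "hal-00000000",
        "An example title",
        ["Doe, Jane", "Martin, Paul"],
        "2020-01-01",
    ))
```

Because every field is carried by a named element in a public namespace, a harvester can locate the identifier or the authors without any knowledge of the hosting platform.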
The Accessible principle encourages long-term preservation of data and easy access to them:
- metadata are accessible via open standards and protocols
- metadata are accessible via open APIs (no prior registration required), via OAI-PMH and in a triplestore
- the contents of the documents are available in open and free access
- data are stored in a secure environment (IN2P3 Computing Center) and accessible via open protocols
- documents are sent to CINES to preserve their long-term accessibility and readability
Standards and protocols: OAI-PMH, API, RDF Triplestore
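As an illustration of access through an open protocol, the sketch below builds an OAI-PMH request URL and extracts record identifiers from a response. The verb `ListIdentifiers` and the `metadataPrefix` argument come from the OAI-PMH 2.0 specification; the base URL `https://example.org/oai`, the helper names and the sample response are assumptions made for the example and must be replaced with the endpoint documented by the repository being harvested.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"

# Placeholder endpoint, NOT the real address of any repository.
BASE_URL = "https://example.org/oai"


def oai_request_url(base_url, verb, **params):
    """Build an OAI-PMH request URL: the verb first, then its arguments."""
    query = {"verb": verb, **params}
    return f"{base_url}?{urlencode(query)}"


def record_identifiers(response_xml):
    """Return the header identifiers found in an OAI-PMH response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iter(f"{{{OAI_NS}}}identifier")]


# Hand-written sample shaped like a ListIdentifiers response (hypothetical data).
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:example.org:record-1</identifier></header>
    <header><identifier>oai:example.org:record-2</identifier></header>
  </ListIdentifiers>
</OAI-PMH>
"""

if __name__ == "__main__":
    print(oai_request_url(BASE_URL, "ListIdentifiers", metadataPrefix="oai_dc"))
    print(record_identifiers(SAMPLE_RESPONSE))
```

No credentials appear anywhere in the request, which is what "no prior registration" means in practice: any client able to issue an HTTP GET can harvest the metadata.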
The Interoperable principle encourages the use of open, widely shared languages and formats, which enable exchanges between computer systems and make metadata easier to combine:
- use of identifiers: DOI, PMID, SWHID, arXiv ID
- alignment with IdRef, ORCID, RNSR
- vocabularies: DC, RDF, FOAF, SKOS, BILBO, FaBiO
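One reason shared identifiers help interoperability is that each can be turned into a resolvable URI that other systems can link to. The sketch below shows this under stated assumptions: the `RESOLVERS` table and the `to_uri` helper are hypothetical names, the URL patterns are the public resolvers commonly used for each scheme rather than anything prescribed by HAL or Episciences, and the sample identifier is a placeholder.

```python
# Commonly used public resolver patterns (assumed for illustration).
RESOLVERS = {
    "doi": "https://doi.org/{}",
    "orcid": "https://orcid.org/{}",
    "arxiv": "https://arxiv.org/abs/{}",
    "pmid": "https://pubmed.ncbi.nlm.nih.gov/{}/",
    "swhid": "https://archive.softwareheritage.org/{}",
}


def to_uri(scheme, value):
    """Turn a (scheme, identifier) pair into a resolvable URI."""
    key = scheme.strip().lower()
    if key not in RESOLVERS:
        raise ValueError(f"unknown identifier scheme: {scheme!r}")
    return RESOLVERS[key].format(value.strip())


if __name__ == "__main__":
    # Placeholder identifier, not a real publication.
    print(to_uri("DOI", "10.1234/example"))
```

Unknown schemes raise an error instead of producing a guessed URL, so a malformed record is noticed rather than silently linked to the wrong place.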
The Reusable principle requires metadata that describe the origin of the data and the conditions for their reuse:
- a distribution license can be added for the publications deposited in HAL
- Episciences metadata are accessible under a CC0 license
Episciences: full-text files are hosted on the repository that the authors chose for submitting their preprint. This can be HAL, but also arXiv or Zenodo.