Is your ANR project mentioned in the submitted file? The functionality for extracting metadata from a PDF file has now been extended to include funding information, allowing the deposit form to be completed automatically. Initially available for ANR projects, this functionality will soon be extended to European projects.

Funding or Acknowledgement sections are among the elements that structure the presentation of research in a scientific publication: they tell readers how the research was funded and allow funders to track the work they support. They include the name of the funder and a project name or number. Because this information is coded, it can be processed automatically by algorithms, which is what HAL has been doing since 5 March for projects funded by the French National Research Agency (ANR).

Objective: to make depositing easier…

The first step in automatically adding an ANR project to the form is to extract the information from the submitted PDF file. The application, which already extracts authors, titles, abstracts, journal titles, etc., has been enriched with funding metadata.

HAL then processes the retrieved information by checking it against the auréHAL reference data. If a match is found, the form is completed automatically. The depositor is nevertheless invited to check the result, just as for the other automatically extracted metadata.
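The extract-then-match logic described above can be illustrated with a minimal Python sketch. It is not HAL's implementation: the real extraction relies on dedicated text-mining tools, whereas this example swaps in a simple regular expression. The code pattern (`ANR-`, a two-digit year, a four-character programme code, a four-digit number), the function names and the small reference table are all assumptions made for illustration, and the project codes and titles are invented.

```python
import re

# Assumed shape of an ANR project code, e.g. "ANR-19-CE45-0012":
# "ANR", two-digit year, four-character programme code, four-digit number.
ANR_CODE = re.compile(r"\bANR-\d{2}-[A-Z0-9]{4}-\d{4}\b")

# Hypothetical stand-in for the reference data; codes and titles are invented.
REFERENCE_PROJECTS = {
    "ANR-19-CE45-0012": "Example project A",
    "ANR-21-ESRE-0001": "Example project B",
}


def extract_anr_codes(text):
    """Return the distinct ANR-style codes found in text, in order of appearance."""
    seen = []
    for code in ANR_CODE.findall(text):
        if code not in seen:
            seen.append(code)
    return seen


def match_projects(text, reference):
    """Split extracted codes into those present in the reference data and those absent."""
    matched, unmatched = {}, []
    for code in extract_anr_codes(text):
        if code in reference:
            matched[code] = reference[code]
        else:
            unmatched.append(code)
    return matched, unmatched


funding_section = (
    "This work was supported by the French National Research Agency "
    "(grant ANR-19-CE45-0012) and by project ANR-99-XXXX-0000."
)
matched, unmatched = match_projects(funding_section, REFERENCE_PROJECTS)
```

In this sketch, only the codes in `matched` would be used to pre-fill a form; those in `unmatched` are left for the depositor to handle by hand.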

… with reliable and high-quality metadata

That said, the proposed metadata is consolidated in two ways.

First, the extracted information is checked against the funder registry managed by Crossref (Open Funder Registry), which assigns DOIs to these organisations.

Secondly, the auréHAL reference data for ANR projects come from the datasets deposited by the agency on the national platform.

By using this official data source, HAL guarantees the reliability and quality of the data. As a result, the deposit is well referenced in the ANR’s HAL portal, which aims to facilitate access to all scientific publications resulting from projects funded by the agency.

This development of HAL is part of the Equipex+ HALiance project, and more specifically of Work Package 3, which aims to recover metadata and identifiers from submitted files and automatically enrich the HAL database. The CCSD is working in collaboration with the company Science-Miner, which develops open-source tools for exploring scientific texts. The next step for HAL is to extend the service to European projects.

