NUSL Technical Solution
The NUSL software solution is built on two basic elements: Invenio for the
NUSL digital repository and the Elasticsearch indexing and search system for
the NUSL central search interface. The same solution architecture has been
run successfully at CERN in Switzerland for several years. The individual
activities of the Invenio digital repository and the Elasticsearch indexing
and search system, and the cooperation between them, are depicted in the
figure.
NUSL central search interface in Elasticsearch
The NUSL central search interface aims to provide an integrated search
platform for grey literature repositories. This integrating function used to
be provided by the FAST ESP indexing and search system, which was replaced
with Elasticsearch in 2016. Elasticsearch provides secure, relevant and
scalable search across the linked repositories. This
solution should allow users to access the data from both the digital
repository and the selected grey literature repositories in a single
interactive environment. The search is primarily performed according to
navigations including document types, authors, keywords, linked bases and also
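A search combined with navigations of this kind can be sketched as an Elasticsearch request body. The example below is only a minimal illustration: every field name in it (`title`, `abstract`, `document_type`, `authors`, `keywords`, `source_repository`) is a hypothetical placeholder and is not taken from the actual NUSL index.

```python
import json


def build_faceted_query(text, filters=None, facet_size=10):
    """Build an Elasticsearch request body with full-text search and facets.

    All field names are hypothetical placeholders, not the real NUSL mapping.
    """
    facet_fields = ["document_type", "authors", "keywords", "source_repository"]
    # Each navigation value chosen by the user becomes an exact-match filter.
    filter_clauses = [
        {"term": {field: value}} for field, value in (filters or {}).items()
    ]
    return {
        "query": {
            "bool": {
                "must": [
                    {"multi_match": {"query": text, "fields": ["title", "abstract"]}}
                ],
                "filter": filter_clauses,
            }
        },
        # One terms aggregation per navigation lists its most frequent values.
        "aggs": {
            field: {"terms": {"field": field, "size": facet_size}}
            for field in facet_fields
        },
    }


body = build_faceted_query("grey literature", {"document_type": "thesis"})
print(json.dumps(body, indent=2))
```

The resulting dictionary could be sent as the body of a search request; the `aggs` part of the response would then supply the value counts shown next to each navigation.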
NUSL digital repository in the Invenio system
Invenio is open source software. It
may be freely installed, used and modified, which makes it possible to set it
up for storing grey literature and to distribute it among partner
organizations. In 2010 the system was debugged on the basis of continuous
testing of system operation, entry of data into the system and harvesting of
data from partner repositories. The Invenio system was modified in all parts,
from the format structure through templates and the setup of collections to
the search setup. At the same time, the digital repository was given its
graphic design, it was fully localized into Czech and the record search was
adjusted.
The software solution for the NUSL project was selected on the basis of a
public tender that took place in 2009. The software functionality requirements
were defined in such a way as to include the requirements necessary for pilot
implementation of the system as well as to help choose a modern, well-
supported technology with good developmental prospects. Software functionality
requirements may be found in the
The preparation for the selection of the software solution included an analysis of selected Open Source software for digital libraries. The following Open Source software was analyzed: DSpace, Fedora, CDS Invenio, EPrints, and Greenstone. The results of the analysis may be found in the repository.
A format for storing metadata is an essential part of building a
repository. A custom metadata format was defined for the needs of the
National Repository of Grey Literature (NUSL). The NUSL metadata format was
designed specifically for processing records of digital grey literature documents. The basic requirements for the NUSL format are maximum simplicity and
compatibility with the Dublin Core standard. The NUSL metadata format uses
elements of Dublin Core, Dublin Core Terms, EVSKP-MS, ETD-MS and some
The first draft of the NUSL metadata format, version 0.1, was defined in 2008, and in 2009 it was tested on in-house data at the NTK and at the University of Economics in Prague. The test results and expert feedback were incorporated into beta version 0.2 of the NUSL metadata format. In 2010 the metadata format was optimized on the basis of practical experience with entering metadata and full texts into the repository and with harvesting metadata and full-text files from partner organizations, and to meet the requirements for compliance with the OpenGrey system. This resulted in the verified version 1.0 of the NUSL metadata format, which may be found in the repository (in Czech only).
The implementation of the NUSL metadata format in the selected software solution, Invenio, which uses MARC 21 as its native format, was accompanied by the creation of a conversion table.
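How such a conversion table works can be illustrated with a short sketch. The mapping below is not the actual NUSL conversion table; it is a hypothetical excerpt that loosely follows the commonly used Library of Congress Dublin Core to MARC 21 crosswalk, and the sample record is made up for illustration.

```python
# Hypothetical excerpt of a conversion table:
# Dublin Core element -> (MARC 21 tag, subfield code).
# This is NOT the real NUSL table; it loosely follows the
# Library of Congress Dublin Core to MARC 21 crosswalk.
DC_TO_MARC21 = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "publisher": ("260", "b"),
    "date": ("260", "c"),
    "language": ("546", "a"),
    "rights": ("540", "a"),
}


def convert_record(dc_record):
    """Convert {element: [values]} into sorted (tag, subfield, value) tuples.

    Elements missing from the table are returned separately so that they
    can be reviewed instead of being silently dropped.
    """
    fields = []
    unmapped = []
    for element, values in dc_record.items():
        target = DC_TO_MARC21.get(element)
        if target is None:
            unmapped.append(element)
            continue
        tag, subfield = target
        for value in values:
            fields.append((tag, subfield, value))
    return sorted(fields), unmapped


# Made-up sample record, for illustration only.
sample = {
    "title": ["Annual report 2009"],
    "creator": ["Novak, Jan"],
    "contributor": ["Example Institute"],
}
fields, unmapped = convert_record(sample)
print(fields)
print(unmapped)
```

Keeping the table as plain data, separate from the conversion routine, means that the mapping can be revised together with the metadata format without touching the code.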
The primary purpose of digital archives is to store digital information and
make it accessible. Persistent identifiers ensure permanent access to
digital documents. Here, persistence of the identifier means the permanence of
identification irrespective of the permanence of the identified document.
Therefore, it is important that a resource marked with a persistent identifier
is never relocated or deleted unless the information on its location is
updated in the persistent identifier registry. The solution concerning the
use of persistent identifiers is described
The original intention was to use a persistent identifier such as URN:NBN or Handle. Unfortunately, there is currently no working URN:NBN resolver for grey literature in the Czech Republic. As a solution, a URI identifier is generated in the Invenio system in this format:
www.nusl.cz/ntk/nusl-ID. The identifier nusl-ID contains the number assigned to the record by Invenio.
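The identifier format can be illustrated with a small sketch that builds and parses such URIs. The function names are hypothetical and this is not code from Invenio. The sketch assumes that the record number is a positive integer, and because the format above is given without a scheme, the builder leaves the scheme out while the parser tolerates an optional `http://` or `https://` prefix.

```python
import re

# Prefix as given in the text; the scheme is not stated there.
NUSL_PREFIX = "www.nusl.cz/ntk/nusl-"

# Assumption: the record number consists of decimal digits only.
_NUSL_URI = re.compile(r"^(?:https?://)?www\.nusl\.cz/ntk/nusl-(\d+)$")


def build_nusl_uri(record_id):
    """Return the NUSL URI for a record number (hypothetical helper)."""
    if isinstance(record_id, bool) or not isinstance(record_id, int):
        raise ValueError("record_id must be an integer")
    if record_id <= 0:
        raise ValueError("record_id must be positive")
    return f"{NUSL_PREFIX}{record_id}"


def parse_nusl_uri(uri):
    """Extract the record number from a NUSL URI (hypothetical helper)."""
    match = _NUSL_URI.match(uri.strip())
    if match is None:
        raise ValueError(f"not a NUSL identifier: {uri!r}")
    return int(match.group(1))


uri = build_nusl_uri(12345)  # 12345 is an arbitrary example number
print(uri)
print(parse_nusl_uri("http://" + uri))
```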
The criteria for selecting the persistent identifier for the NUSL are described in the repository. Resources used in this work are cited in another document attached to the same record.