EPIC provides PID Services for the European Research Community
EPIC was founded in 2009 by a consortium of European partners in order to provide PID services for the European research community, based on the Handle System (TM, http://www.handle.net/), for the allocation and resolution of persistent identifiers. The consortium signed a Memorandum of Understanding aiming to provide long-term reliability for the PID services.
The purpose of persistent identifiers (PID)
In all areas of science, the amount of stored data is growing rapidly, and relations between these data and other resources, for instance references to scientific publications, are becoming essential. Scientific institutions therefore need to develop a strategy for the long-term preservation of their scientific resources in order to ensure their long-lasting accessibility.
In the scientific community it is therefore increasingly necessary that resources are registered in well-kept repositories with content that never changes, so that it can be referenced and cited. Furthermore, these references themselves have to be stable, whereas the underlying repositories are more like "living organisms" that frequently migrate on various levels, through changes in hardware, software, physical location or format. Because of these alterations, the URLs in common use today, with their physical paths and semantic content, are frequently outdated after a couple of years and are no longer suitable.
Science needs new methods for referencing primary and secondary scientific data, so that these data can be named in a unique and timeless way, like the ISBN numbers for books, which are permanent and citable references to the related books. Resolving such unique and persistent identifiers (PIDs) requires a commonly agreed process. Because many transactions depend on resolving the references to actual URLs, the resolution service must be highly robust and reliable in the long term.
The Data Creation Cycle
As a result of increasingly powerful sensors, worldwide crowdsourcing activities and computer simulations, data objects are being created in ever larger numbers. They are enriched through scientific activities, be it by manual intervention or by computer algorithms. Almost all of these data objects have a life cycle: they are created, validated, used, re-used, modified, moved and copied for various reasons, gathered into complex collections, etc. Data objects are annotated with content information, documentation and provenance information.
For many reasons it is important to keep track of these data objects together with their annotations, to verify their integrity and authenticity, and to quickly see the context of their creation and their provenance. By associating an identity with each data object and each collection early in its life cycle, and certainly at the moment when scientific publications refer to data collections, we improve data life cycle management and access to data for true scientific purposes.
What PID for what Purpose?
As said above, EPIC is an identifier system using the Handle infrastructure. Its focus is the registration of data at an early stage of the scientific process, where lots of data is generated and has to become referable for collaboration with other scientific groups or communities, but where it is still unclear which small part of the data should be available over a long period of time. Even when some of this data has to be cited, an EPIC PID is reliable, because its resolution is guaranteed.
Another identifier system using the Handle infrastructure, widely used by the publishing industry for the persistent identification of journal articles and for tracking cross-linking through citations, is the Digital Object Identifier (DOI). If a later registration with a DOI is wanted for some reason, the PIDs can easily be transferred, because the identifier systems use the same underlying Handle software and similar database schemata.
Why is Identifier Resolution Important?
Resolution systems enable client software to go from an identifier to current state information about the identified object, such as where and how to access the object. Such identifiers can persist over changes to the identified object, such as changes in its location(s), ownership, and other attributes. This persistence is vital for maintaining data integrity over time.
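The idea of going from a fixed identifier to current state information can be illustrated with a toy in-memory resolver. This is only a sketch of the concept, not the protocol used by any real resolution service: the class `ToyResolver`, its methods, the identifier `12345/example-dataset` and the `example.org` URLs are all hypothetical.

```python
class ToyResolver:
    """In-memory stand-in for a resolution service (illustrative only)."""

    def __init__(self):
        # Maps each identifier to its current state information.
        self._records = {}

    def register(self, pid, url, owner):
        """Create a record holding the current state of the identified object."""
        self._records[pid] = {"url": url, "owner": owner}

    def update(self, pid, **changes):
        """Change state information; the identifier itself never changes."""
        self._records[pid].update(changes)

    def resolve(self, pid):
        """Return a copy of the current state information for an identifier."""
        return dict(self._records[pid])


resolver = ToyResolver()
pid = "12345/example-dataset"  # hypothetical identifier

# Initial registration: the object lives in an old repository.
resolver.register(pid, url="https://old-repository.example.org/data/42",
                  owner="Institute A")

# The repository migrates; only the record behind the identifier is updated.
resolver.update(pid, url="https://new-repository.example.org/data/42")

# A citation that uses the identifier still leads to the current location.
print(resolver.resolve(pid)["url"])
```

A citation stores only the identifier, so the move to a new repository requires no change to anything that references the object.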
What is the Handle System?
The Handle System is a general-purpose identifier resolution system that has been in place for many years. Identifiers in the Handle System are made up of a prefix and a suffix separated by a slash, e.g., 10.1594/PANGAEA.667386 or 4263537/5030. The prefix is used by client software to find the specific servers within the widely distributed resolution system that will be able to resolve the identifier. Existing identifiers can easily be structured as handles in order to take advantage of the resolution system.
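The prefix/suffix structure described above can be sketched in a few lines of Python. The helper names `split_handle` and `proxy_url` are hypothetical. Splitting at the first slash assumes that the prefix itself contains no slash, and the use of the public proxy at https://hdl.handle.net/ to form a resolvable URL is an assumption that does not come from the text above.

```python
from typing import Tuple
from urllib.parse import quote


def split_handle(handle: str) -> Tuple[str, str]:
    """Split a handle into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = handle.strip().partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix handle: {handle!r}")
    return prefix, suffix


def proxy_url(handle: str, proxy: str = "https://hdl.handle.net/") -> str:
    """Build a URL that asks a proxy (assumed here) to resolve the handle."""
    prefix, suffix = split_handle(handle)
    # Percent-encode characters in the suffix that are unsafe in a URL path.
    return f"{proxy}{prefix}/{quote(suffix, safe='/')}"


for handle in ("10.1594/PANGAEA.667386", "4263537/5030"):
    prefix, suffix = split_handle(handle)
    print(prefix, suffix, proxy_url(handle))
```

Only the prefix is needed to locate the responsible servers, which is why the sketch keeps the two parts separate rather than treating the handle as an opaque string.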
Who Manages the Handle System?
CNRI designed, implemented, and currently administers the root level of the system, but the bulk of the resolution services are managed by the thousands of organizations, communities, government agencies, and businesses around the world currently using the system on a daily basis. This includes many academic and national libraries, scholarly journal publishers, scientific institutes, and other information management groups. CNRI is working with other major user groups, including EPIC and DataCite, to create DONA (Digital Object Numbering Authority) to manage the Handle System into the future. The new organization will be governed by the DONA Board, which will include experts and stakeholders from around the world.