ePIC Quality of Service and Policies

1. General Information

1.1. Goal of this Document

Transparency about the reliability of ePIC PIDs is the major goal of this document. This reliability is achieved by a number of measures taken on both the organizational and the technical level, as described below.
For a PID infrastructure, transparency about the uniqueness and persistency of the identifiers is of particular importance. ePIC PIDs are always unique, which is easy to achieve in the Handle system. Persistency is part of the definition of a PID, but there are cases where deleting PIDs is necessary or useful, for example when PIDs are minted for testing purposes. To make a clear distinction, ePIC PIDs that are not persistent in this sense can always be recognized by their prefix, which has a capital ‘T’ after the first delimiter dot ‘.’. Transparency about other questions, for instance whether the persistency of the identified digital object is guaranteed, is provided by certain PID properties, as described in the chapter ‘Policies for PID Minting and Update’.

1.2. Scope

This Quality of Service and Policies (QoS&P) document describes general Service Level Agreements (SLA), quality of service, policies and workflows for the ePIC services. These services are provided by ePIC members or providers of ePIC certified PID service, externally to their customers and internally among ePIC members and providers of ePIC certified PID service.
ePIC members are all institutions that have signed the contract for full membership in ePIC; providers of ePIC certified PID service are external institutions that provide certified ePIC PID services. ePIC PID services should be, but need not necessarily be, certified ePIC PID services. Service providers for ePIC PIDs that are not certified or are not ePIC members are called external partners.
In general, the services provided by ePIC members or providers of ePIC certified PID service are PID services, services of the DONA MPA in the Handle system, and other services. This document provides Quality of Service and Policies agreements for all of these kinds of services. In addition, the rules of operation for the DONA MPA in the Handle system are determined in more detail by the contract of the ePIC consortium, represented by GWDG, with DONA.
In this document, a provider of a service (or service provider) is always the party that is responsible for the maintenance and availability of the service, and a client is the party that uses the service. The concrete services provided internally or externally by ePIC are announced on the ePIC web pages. A first overview can be found in the Appendix: Services of ePIC below.
This document defines

  • the minimal QoS to be maintained by an ePIC member or provider of ePIC certified PID service,
  • the QoS and rules of operation that have to be fulfilled for a service to become and stay an ePIC service,
  • the processes between provider and client with respect to the provisioning of the services as well as the quality of service that the provider offers to the client,
  • the rules for certification of ePIC PID services.

Detailed descriptions are placed in an appendix so that they can be adapted by decisions of the ePIC technical board or ePIC management board if necessary.
This agreement about rules of operation, policy, QoS and SLA is an addendum to the ePIC Memorandum of Understanding and the contract for full membership in ePIC and part of the ePIC PID service certificate and will be made public to the ePIC customers. If it is in conflict with any of the regulations in the contract for full membership in ePIC, the regulations in the contract take precedence.

1.3. Goals

The implementation of this policy and SLA agreement realizes the following goals:

  • The processes between provider and client are described in as unified a manner as possible and differ only in rare concrete use cases.
  • In case of agreements between service provider and client, the contract design is simplified, because only the exceptions from this policy and SLA agreement need to be specified explicitly.
  • In case of certificates given for PID services, the service provider gets a clear understanding of the service level required for certification, and the end users of these services get a unified view of the service levels they can expect.

Because the ePIC PID service and the global Handle services are of particular importance for the reliability of the ePIC and Handle infrastructure, the rules for these services are described more explicitly below.

1.4. Acronyms

PID Persistent Identifier
SLA Service Level Agreement
QoS Quality of Service
HA High Availability
MPA Multi Primary Administrator
LHS Local Handle Server
GHR Global Handle Registry
QoS&P Quality of Service and Policies

2. Operation of Services

2.1. Conditions of Operation

As an ePIC member or provider of ePIC certified PID service, the provider operates the IT resources providing the services according to the general technical and legal rules that apply in the country of the provider. The provider ensures that only authorized persons have direct access to the physical IT resources, and that the IT resources are located in centers with state of the art infrastructure to guarantee uninterrupted service provisioning.

2.2. Privacy Protection and Security

In particular, the provider operates the IT resources providing the services according to the general rules of privacy protection and security that apply in the country of the provider and that are described in international regulations. The provider is obliged to grant access to servers and databases of ePIC services only to those employees who are aware of the appropriate rules of privacy protection and security.

2.2.1. User Management and Responsibilities

Furthermore, the user management has to fulfil the demands for privacy protection and secrecy described above, because it contains private information.
The ePIC PID service is a service for institutions, projects and scientists to maintain the resolution path of a PID. These institutions are external customers of the ePIC PID service provider. To be usable by external clients, these PID services need a user management that fulfils two purposes: to provide the user with access to the PID and to provide the provider with the contact information of the user. Both the service provider and the user are responsible for informing each other about changes and necessary actions, in order to keep the user management up to date. The service provider has to provide the administrative and technical interfaces through which the user can report changes.

3. Availability of Services

A service is available if it can be accessed by network communication and if it fulfills its functionality according to the service description. The provider operates the service continuously (24 hours a day on all days of the year) and assures availability during the working hours of its institution; public holidays of the institution's country are not part of the working hours. Maintenance and incidents are exceptions that are subject to the special regulations defined below. To distinguish between different levels of expected availability, different classes of services can be specified. Currently the following classes with the following levels of availability are defined:

  • PID test, with no or insufficient PID replication
  • PID production, with sufficient PID replication for HA resolution
  • Handle infrastructure, HA infrastructure for services like MPA-GHR, Handle Proxy Server etc. (not LHS)
  • ePIC additional services, like monitoring, audits, web and portal pages, software repository etc.

3.1. High Availability and Replication of Services

3.1.1. Productive PID Services and LHS Replication Rules

In order to provide a high quality of service for PID resolution, every PID service of availability class ‘PID production’ has to have at least two mirror LHS, at least one of which has to be located outside the local network of the PID service.
Each ePIC member or provider of ePIC certified PID service has to provide at least as many mirror LHS, or options for mirror LHS, to other ePIC members or external ePIC PID service providers as it uses from other ePIC members or external ePIC PID service providers for its own services.
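
The two replication rules above lend themselves to a mechanical check. The following is a minimal sketch only, assuming a hypothetical `MirrorLHS` record with an `external` flag; neither the record nor the function names are part of any ePIC software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MirrorLHS:
    """Hypothetical description of one mirror LHS (not an ePIC data model)."""
    host: str
    external: bool  # True if located outside the local network of the PID service


def meets_production_replication(mirrors: list[MirrorLHS]) -> bool:
    """At least two mirror LHS, at least one of them outside the local network."""
    return len(mirrors) >= 2 and any(m.external for m in mirrors)


def meets_reciprocity(mirrors_offered: int, mirrors_used: int) -> bool:
    """A provider offers at least as many mirror LHS as it uses from others."""
    return mirrors_offered >= mirrors_used
```

A service with two mirrors in its own network would fail the first check, because no mirror is located outside the local network.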

3.1.2. Handle Infrastructure

The rules of operation for the non-LHS Handle infrastructure are essentially determined by the rules and the contract of the ePIC consortium, represented by GWDG, with DONA. Since these services form the backbone of the Handle infrastructure, they have to be integrated in a high availability infrastructure and should be mirrored if the service allows this.

3.1.3. ePIC Additional Services

In general, all services with a direct impact on services used by ePIC customers should have a high level of availability. Since the roles and dependencies of these services differ, this document defines no unified rules of operation for them. Instead, the ePIC Technical Board determines which of these services have to be operated at which service level.

4. Monitoring, Accounting

4.1. Monitoring

On behalf of ePIC one or more ePIC members run monitoring services for the ePIC services.
All productive PID services, Handle infrastructure services and most of the ePIC additional services have to be monitored by the central ePIC monitoring system. This also includes all certified ePIC PID services run by external partners and mirror LHS that are run by ePIC members for external institutions. PID services of external partners can also be monitored by the central ePIC monitoring service.
The ePIC monitoring generates incident statements that contain as much information about the incident as is available to the monitoring system, in order to reduce the provider's administrative work for the incident. These incident statements generate tickets in the support system and are sent to the responsible provider of the affected service.
The provider of ePIC PID services has to implement the necessary monitoring sensors and indicators in its PID service, including the “EPIC_HEALTHCHECK” PID and any other indicators defined by the technical board of ePIC, and has to provide the monitoring system with all information relevant for a qualified incident statement. In particular, the provider of a PID service always has to have a reliable contact point, so that incidents are recognized and the responsible provider can start recovery as soon as possible.
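
As an illustration of how a monitoring sensor might evaluate the healthcheck PID, the following sketch builds a lookup URL and interprets the answer. It rests on assumptions that this document does not confirm: that the healthcheck PID has the form `<prefix>/EPIC_HEALTHCHECK`, and that it is queried through the JSON interface of the public Handle proxy, where a `responseCode` of 1 indicates a successful lookup. The prefix used in the usage note is a placeholder.

```python
import json
from urllib.parse import quote

# Public Handle proxy JSON endpoint; using it for monitoring is an assumption.
PROXY_API = "https://hdl.handle.net/api/handles/"


def healthcheck_url(prefix: str) -> str:
    # Assumption: the healthcheck PID is "<prefix>/EPIC_HEALTHCHECK".
    return PROXY_API + quote(f"{prefix}/EPIC_HEALTHCHECK", safe="/")


def is_healthy(response_body: str) -> bool:
    """Interpret a JSON body from the proxy: responseCode 1 means the handle resolved."""
    try:
        payload = json.loads(response_body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("responseCode") == 1
```

For the placeholder prefix `21.T12345`, `healthcheck_url` yields `https://hdl.handle.net/api/handles/21.T12345/EPIC_HEALTHCHECK`. Fetching the URL and reporting the result to the monitoring system are left out of the sketch.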

4.2. Accounting

On behalf of ePIC, one ePIC member runs a central and public accounting service for the number of PIDs created by the ePIC PID services and provides a sensor that enables the accounting of these numbers for a PID service. The ePIC members and providers of ePIC certified PID service have to provide all necessary information to the central accounting service. Both the sensor and the central accounting service have to comply with the rules of privacy protection and security specified above.
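
What an accounting sensor reports is not specified here. As a minimal sketch, assuming the sensor has a list of handle strings of the form `<prefix>/<suffix>` and reports only aggregated counts per prefix, the counting step could look as follows. The function name and the input format are illustrative.

```python
from collections import Counter


def pids_per_prefix(handles: list[str]) -> Counter:
    """Count created PIDs per prefix; a handle is '<prefix>/<suffix>'."""
    # Only the prefix is kept, so no suffix or record content leaves the service.
    return Counter(h.split("/", 1)[0] for h in handles)
```

Reporting counts rather than individual PIDs is one way to keep the sensor within the privacy rules; it is a design assumption of this sketch, not a stated requirement.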

5. Incident and Support Management

5.1. Incident Management

5.1.1. Announcement of Incidents and their Recovery

An incident is an unplanned failure, interruption or serious quality reduction of a service. An incident can be detected by the ePIC monitoring, by internal monitoring tools of an ePIC member or provider of ePIC certified PID service or by a ticket or any other kind of announcement of an external partner or user of the services.
Each ePIC member, provider of ePIC certified PID service and external partner that uses the central ePIC monitoring system to monitor its services has to provide reliable contact points, so that it is informed about an incident and can start failure recovery as soon as possible.
In case of an incident, the responsible service provider has to generate a qualified failure announcement that contains the available information about the expected seriousness (incident class, see Appendix), the error diagnosis and the expected down time of the service. Failure announcements are currently published at
https://projects.gwdg.de/projects/epic-infrastructure/wiki/Incidents
The failure announcement should contain the following fields:

  • date / time
  • prefix(es)
  • server
  • short incident description
  • affected services
  • incident class (see ‘Incident Classes’ below)
  • expected down time

The service provider has to take all actions necessary for failure recovery, document them and update the failure announcement accordingly.
After recovery from the incident, the provider has to generate a final report that explains the failure and that, instead of the expected down time, should contain the following additional fields:

  • actual down time (h:m)
  • actions taken
  • date / time of solution
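
The fields of the failure announcement and of the final report can be held in one record. The following is a minimal sketch; the attribute names are illustrative and not prescribed by ePIC, and all values in the usage note are invented.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class FailureAnnouncement:
    """Fields of the failure announcement; attribute names are illustrative."""
    reported_at: datetime
    prefixes: list[str]
    server: str
    description: str
    affected_services: list[str]
    incident_class: str  # LOW, MEDIUM, HIGH or CRITICAL
    expected_down_time: Optional[timedelta] = None
    # The following fields are filled in for the final report.
    actual_down_time: Optional[timedelta] = None
    actions_taken: list[str] = field(default_factory=list)
    solved_at: Optional[datetime] = None

    def is_final(self) -> bool:
        return self.actual_down_time is not None and self.solved_at is not None

    def down_time_hm(self) -> str:
        """Actual down time formatted as h:m, as requested for the final report."""
        if self.actual_down_time is None:
            raise ValueError("no actual down time recorded yet")
        minutes = int(self.actual_down_time.total_seconds() // 60)
        return f"{minutes // 60}:{minutes % 60:02d}"
```

An announcement becomes a final report once the actual down time and the date and time of the solution are set; an actual down time of 95 minutes is then rendered as `1:35`.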

5.1.2. Incident Classes

Currently four incident classes describe the severity of an incident; for the important case of PID services, special conditions of severity are defined in addition. The four classes are:
LOW: parts of the functionality are affected, but services are still available
MEDIUM: parts of the functionality are affected and services are available only with restrictions
HIGH: main parts of the functionality are affected and services are available only with major restrictions
CRITICAL: the system is down and the services are not available
Appendix 9.3 describes which failure of an ePIC service belongs to which incident class.

5.2. Maintenance

Maintenance of ePIC services that has an impact on the availability of a service has to be performed in a weekly maintenance window that is determined and announced specifically for each ePIC member or provider of ePIC certified PID service. Maintenance has to be announced at least five working days in advance. Maintenance of a service and of the mirror services belonging to it has to be scheduled at different times. The contact point for incidents has to be available during the maintenance; otherwise other contact information has to be given with the announcement.
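
The announcement rule can be checked with a simple working-day count. The sketch below assumes that working days are Monday to Friday minus a caller-supplied set of public holidays, and that the day of the announcement counts while the day of the maintenance does not. Both are assumptions, since this document does not define the counting.

```python
from datetime import date, timedelta


def working_days_between(start: date, end: date, holidays: frozenset = frozenset()) -> int:
    """Count working days d with start <= d < end (Monday to Friday, minus holidays)."""
    count = 0
    day = start
    while day < end:
        if day.weekday() < 5 and day not in holidays:
            count += 1
        day += timedelta(days=1)
    return count


def announced_in_time(announced: date, maintenance: date, holidays: frozenset = frozenset()) -> bool:
    """True if the announcement precedes the maintenance by at least five working days."""
    return working_days_between(announced, maintenance, holidays) >= 5
```

Under these assumptions, a maintenance on a Monday has to be announced on the Monday of the preceding week at the latest, or earlier if a public holiday falls in between.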

5.3. Support of Services

5.3.1. The Support System of ePIC

As a support channel ePIC uses the email address support@pidconsortium.eu.
This support channel feeds a ticketing system with a set of queues. Each queue has its own specification and responsibilities. There is a special ‘general’ queue, which receives all tickets that do not have another assignment. The queues and their detailed descriptions are under review and can be found below in the Appendix: Queues of the Ticket System. If necessary, the ePIC management board will adapt these queues and their related responsibilities to meet the needs of an efficient and responsive support service.

5.3.2. Obligations for Support Requests of ePIC Members

All ePIC members are responsible in particular for the ‘general’ queue and for all other queues that are not dedicated to specific members or partners.
As a rule, first contact with the customer who submitted the ticket has to be established within four working hours. A ticket that has not been answered or has been set on hold will be escalated after three working days. The queue assignment of and the responsibility for a ticket should be clear within these four working hours and has to be clear after three working days at the latest. A ticket with an incorrect queue assignment can be moved to the ‘general’ queue with a short comment on the reason and, if applicable, a suggestion for a better assignment.
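
The two deadlines above can be expressed as simple predicates on a ticket. This is a minimal sketch with a hypothetical `Ticket` record; it assumes that elapsed working hours and completed working days are computed elsewhere, and that "after three working days" means that three full working days have passed.

```python
from dataclasses import dataclass

FIRST_CONTACT_WORKING_HOURS = 4
ESCALATION_WORKING_DAYS = 3


@dataclass
class Ticket:
    """Hypothetical ticket state; working time is assumed to be computed elsewhere."""
    working_hours_open: float
    working_days_open: int
    answered: bool
    on_hold: bool = False


def first_contact_overdue(t: Ticket) -> bool:
    """True if no first contact was made within four working hours."""
    return not t.answered and t.working_hours_open > FIRST_CONTACT_WORKING_HOURS


def needs_escalation(t: Ticket) -> bool:
    """Escalate after three working days if the ticket is unanswered or on hold."""
    return t.working_days_open >= ESCALATION_WORKING_DAYS and (not t.answered or t.on_hold)
```

A ticket that was answered but then set on hold is still escalated after three working days, which follows the wording of the rule.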

5.3.3. The Queue Assignment

The ePIC technical board determines which ePIC member is responsible, for a certain time period (currently one month), for defining the queue assignment of tickets and for deciding the ticket responsibility in cases of uncertainty.

6. Rules for Prefix Assignment

The prefix assignment is done by the ePIC DONA MPA, according to the regulations of DONA. As already determined in the Contract for full membership in ePIC, each ePIC member can obtain prefixes in a certain range of prefixes under the prefix 21 from the ePIC DONA MPA. The assignment of these prefix ranges to the ePIC members is done by the Management Board.
ePIC members can deliver or sell prefixes of their range, which are assigned by the ePIC DONA MPA, to external institutions, but they are not allowed to sell to third parties the right to assign prefixes from parts of their prefix range.
A prefix request to the ePIC DONA MPA is made via a ticket to the ePIC support system and has to contain all information relevant for assigning the prefix, including the contact and user information of the responsible institution, in accordance with the Transparency Agreement of the Contract for full membership in ePIC.
As already determined in the Contract for full membership in ePIC, an ePIC member is free in its choice of technical services for maintaining its prefixes and may also provide additional services. A member may either provide the technical services to maintain its prefixes directly or have a requesting entity maintain them.

7. Policies for PID Minting and Update

All policies for minting and update are rules that apply inside the namespace of a specific prefix. In general, the maintainer of a prefix has a great amount of freedom in deciding which policies apply to the PIDs minted under the prefix. But there are some topics related to the interoperability and reliability of the service, and to the reputation of the ePIC PID services as a whole, for which certain restrictions are defined. Some of these restrictions are mandatory for all prefixes, others are mandatory only for production prefixes, and others are recommended.
Since the following policies become effective together with this document, PIDs minted earlier cannot be affected, due to their persistency. Furthermore, these policies are requirements that in general are not technically enforced, but that have to be fulfilled for a service to be ePIC compliant.

7.1. Character Set of Suffixes

A wide range of standards is available for character sets. The Handle system in general allows UTF-8 encoding for Handle PIDs. The following sections discuss several mandatory and recommended restrictions for ePIC PIDs.

7.1.1. ASCII numbering

To avoid confusion caused by different character encodings, the character set for PID suffixes is restricted to printable ASCII characters, because identical characters can have a different meaning in a different encoding. This policy is mandatory for ePIC PIDs and is called ASCII numbering.
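
A check for ASCII numbering takes only a few lines. The sketch below assumes that "printable" means the graphic range 0x21 to 0x7E, that is, without the space character; the policy text does not settle this point, so the range is an assumption.

```python
def is_ascii_numbering(suffix: str) -> bool:
    """True if the suffix is non-empty and uses printable ASCII only.

    Assumption: 'printable' means the graphic range 0x21-0x7E, i.e. the
    space character is excluded; the policy text does not settle this.
    """
    return bool(suffix) and all(0x21 <= ord(ch) <= 0x7E for ch in suffix)
```

Under this reading, a suffix such as `abc-123_X` passes, while a suffix containing an umlaut or a space is rejected.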

7.1.2. DOI numbering compatibility

The International DOI Foundation has different, and in general more restrictive, rules for the syntax of names used for DOIs than apply to ordinary Handles, and less restrictive rules than the ePIC PID ASCII numbering.
The restrictions and changes of DOI with respect to the Handle policies are published at
http://www.doi.org/DOI_handbook/2_Numbering/
The most important differences are the case insensitivity of the PID strings and the different character encoding, which is Unicode-2 for DOIs, in contrast to UTF-8 for Handle PIDs in general and printable ASCII for ePIC PIDs.
Restricted to ASCII, the character sets of Unicode-2 and UTF-8 coincide. For compatibility of ePIC PIDs with DOI handles it is therefore sufficient to further restrict the ePIC PID numbering scheme to printable upper case ASCII (or alternatively printable lower case ASCII) characters. A less restrictive rule, which is more difficult to implement, is to guarantee that no two ASCII PID numbers map to the same string under the upper case mapping. Both ways are possible and allowed for achieving DOI compatibility, because DOI numbering strings are always mapped to upper case.
This policy is recommended for ePIC PIDs and is called DOI numbering compatibility.
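
The less restrictive variant, in which no two PIDs map to the same string under the upper case mapping, can be tested over the set of PIDs of a prefix. The following is a minimal sketch with illustrative function names; it assumes that the PIDs already satisfy ASCII numbering, so that Python's `str.upper()` acts as the plain ASCII upper case mapping.

```python
from collections import defaultdict


def uppercase_collisions(pids: list[str]) -> list[list[str]]:
    """Groups of distinct PID strings that map to the same upper-case string."""
    groups: dict[str, set[str]] = defaultdict(set)
    for pid in pids:
        groups[pid.upper()].add(pid)
    return [sorted(g) for g in groups.values() if len(g) > 1]


def is_doi_compatible(pids: list[str]) -> bool:
    """True if no two distinct PIDs collide under the upper-case mapping."""
    return not uppercase_collisions(pids)
```

In practice such a check would have to run at minting time against all existing PIDs of the prefix, which is why this variant is harder to implement than restricting suffixes to one case.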

7.1.3. Encoding Issues for PID used in URL

Characters that are not allowed in a URL or URN, or that have another meaning there, must be hexadecimal (%) encoded, because PIDs are strings used in URLs for resolution. This policy is mandatory.
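
Percent-encoding is available in most standard libraries. As a minimal sketch, assuming resolution through the public Handle proxy at hdl.handle.net and keeping ‘/’ unencoded as the delimiter between prefix and suffix, a resolution URL could be built as follows. The PID in the usage note is a placeholder.

```python
from urllib.parse import quote

RESOLVER = "https://hdl.handle.net/"  # public Handle proxy, used here as an assumption


def resolution_url(pid: str) -> str:
    """Percent-encode a PID for use in a resolver URL, keeping '/' as delimiter."""
    return RESOLVER + quote(pid, safe="/")
```

For the placeholder PID `21.T12345/a b#c?d`, the space, ‘#’ and ‘?’ are encoded as `%20`, `%23` and `%3F`, so that they are not misread as URL syntax.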

7.2. The Uniqueness of PID

ePIC PIDs are always unique. This policy is mandatory.
Remark: for ePIC PIDs there is currently no policy in place that maps different suffix strings to the same ePIC PID name, as for instance a rule of case insensitivity would do. Any such policy would have to fulfill the PID uniqueness rule.
On the other hand, different PIDs may refer to the same resource location.

7.3. The Deletion of PID

7.3.1. Production Prefixes

All ePIC prefixes are production prefixes, unless they are of the form 21.Txyz, where xyz can be any allowed string.

7.3.2. Persistency Rule

ePIC PIDs that are minted with a production prefix cannot be deleted. This policy is called persistency rule and is mandatory for ePIC PIDs of a production prefix.

7.3.3. Prefixes of the form 21.Txyz with Non-Persistent PID

The deletion of PIDs might be allowed and necessary, for example for testing purposes. In this case the possibility to delete a PID is a property of the whole prefix, which can be clearly distinguished from the ePIC prefixes for production by the leading letter ‘T’ after the first delimiter. This policy is mandatory.
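
The distinction between production prefixes and prefixes of the form 21.Txyz can be made by inspecting the prefix string alone. The following is a minimal sketch; the function names are illustrative and the prefixes in the usage note are placeholders.

```python
def is_test_prefix(prefix: str) -> bool:
    """True for prefixes of the form 21.Txyz (capital 'T' after the first dot)."""
    naming_authority, dot, rest = prefix.partition(".")
    return naming_authority == "21" and dot == "." and rest.startswith("T")


def deletion_allowed(pid: str) -> bool:
    """PIDs may be deleted only under a non-persistent (test) prefix."""
    prefix = pid.split("/", 1)[0]
    return is_test_prefix(prefix)
```

With these definitions `21.T12345` is a test prefix, while `21.12345` and `21.t12345` are treated as production prefixes, because only a capital ‘T’ marks non-persistent PIDs.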

7.4. The Reference to the Digital Object

For ePIC in general, the deletion or unreachability of the digital object to which the PID refers is possible.

7.4.1. Deletion of Digital Objects

ePIC PIDs that refer to a deleted object and that were minted with a production prefix remain persistently available under the persistency rule above. They need an expiration date that lies in the past (for instance by the use of an appropriate date type, registered at a data type registry). These PIDs still provide the metadata of the former digital object in the PID info types as a permanent reference. This policy is mandatory.
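
How the expiration date is stored depends on the types in use. As a sketch only, assuming a PID record represented as a mapping from type names to string values and a hypothetical type name `EXPIRATION_DATE` holding an ISO 8601 date, a check for a PID that marks a deleted object could look like this. Neither the type name nor the record layout is defined by ePIC.

```python
from datetime import date

EXPIRATION_TYPE = "EXPIRATION_DATE"  # hypothetical type name, not a registered type


def marks_deleted_object(record: dict[str, str], today: date) -> bool:
    """True if the record carries an expiration date that lies in the past."""
    value = record.get(EXPIRATION_TYPE)
    if value is None:
        return False
    return date.fromisoformat(value) < today
```

The remaining entries of the record stay untouched, in line with the requirement that the metadata of the former digital object remains available.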

7.4.2. Unreachable Digital Objects

Unreachable digital objects require the specification of a reason for the inaccessibility, such as a description of an embargo period or other access restrictions (for instance by the use of an appropriate date type, registered at a data type registry). This policy is mandatory.

7.4.3. Update Rules for PID

The maintainer of the PID is responsible for updating the PID reference in case of a permanent failure of the resolvability of the PID. For a production prefix, ePIC members and certified ePIC PID providers have to ensure that all maintainers of PIDs of the given prefix can be contacted and take responsibility for an update of the PID if necessary.

7.5. Migration of PID and Prefixes in the Case of Institutional Failure

If an ePIC PID service provider becomes unable to continue providing the service for whatever reason, the ePIC consortium applies the regulations on the migration of the PID service to another partner, as described in the contract for full membership in ePIC.

7.6. The Use of Types and Templates

At the moment this section only contains suggestions and will be made explicit later.

7.6.1. Template Delimiter

Restriction on the delimiter used in the template rules. Issues might occur for instance with ‘@’.

7.6.2. Template Services

In many cases templates are used to refer to services available to the digital object referred to. It is recommended to ensure a long term perspective for these services.

7.6.3. Types

All types used in PIDs of an ePIC prefix have to be registered at a data type registry, once such a service becomes available and an appropriate type is registered there. Types in use for which no appropriate registered type exists should be registered as soon as possible.

8. Certification of ePIC PID Services

8.1. An ePIC service has to fulfil the rules as they are defined in this QoS&P document. An external institution can ask for certification of the PID service it provides as an ePIC PID Service.

  • To be a certified ePIC PID Service, the service has to fulfil the service requirements and policies of this document. In particular, compliance with the following rules of operation and policies has to be proven:

8.2. Operation of Services

  • The IT resources for the PID service are located in centers with state of the art infrastructure to guarantee uninterrupted service provisioning.

8.3. Availability of Services

  • The PID service for a production prefix has to have at least two mirror LHS, where at least one has to be located outside the local network of the PID service.
  • The PID service provider has to provide at least as many mirror LHS, or options for mirror LHS, to other ePIC members or providers of ePIC certified PID service as it uses from other ePIC members or providers of ePIC certified PID service for its own services.

8.4. Monitoring, Accounting

  • The PID service has all necessary monitoring sensors running and is part of the ePIC monitoring system.
  • The PID service provider has to provide all the necessary information to the central accounting service.
  • At least one of the mirror LHS also has to be monitored by the ePIC monitoring system.

8.5. Incident and Support Management

  • The PID service provider has to fulfil the rules for incident management, incident recovery and service maintenance, as well as all other rules for QoS as they are described in this document.
  • The PID service provider is responsible for answering tickets in its ‘LHS xxyyy within EPIC’ queue in the ePIC support system.

8.6. Rules for Prefix Assignment

  • The prefix is assigned by the ePIC DONA MPA or in case of legacy prefixes by CNRI (Corporation for National Research Initiatives, see http://www.cnri.reston.va.us/).

8.7. Policies for PID Minting and Update

  • The PID service has to fulfill all mandatory policies of ePIC for PID minting and update.
  • The PID service provider has to ensure that all maintainers of PIDs of the prefix can be contacted to take the responsibility for an update of a PID if necessary.

9. Appendixes

9.1. Appendix: Queues of the Ticket System

Currently the ePIC support system provides the following queues:

Name | Description | Responsibility
prefix request | tickets that ask for a new prefix | DONA-MPA (currently GWDG on behalf of ePIC)
PID services and requests | tickets that ask for PID services and single PIDs | all ePIC members
errors and RFCs | tickets related to errors or updates of the ePIC software | all ePIC members
LHS xxyyy within EPIC | tickets that are directly coupled to a specific prefix (xxyyy stands for the prefix) and LHS | ePIC member or provider of ePIC certified PID service that provides the specific prefix and LHS
general | all tickets that do not fit into the other queues | all ePIC members

9.2. Appendix: Services of ePIC

Currently ePIC provides or will provide the following internal or external services (HA: High Availability Service, Mon.: Service needs to be monitored):

  • ePIC PID services (HA, Mon.)
  • Handle infrastructure (HA, Mon.)
  • ePIC additional services
    • Support System of ePIC (HA, Mon.)
    • ePIC web pages (HA, Mon.)
    • ePIC documentation (HA?)
    • ePIC monitoring service (HA)
    • ePIC accounting service
    • ePIC audit service
    • ePIC DONA MPA prefix service (Mon.)
    • ePIC PID service certification service
    • ePIC Data Type Registry (?)

9.3. Appendix: Incident Classes of ePIC Services

Incidents of ePIC services belong to the following incident classes:
LOW:

  • minting and resolution of PIDs of a prefix are not disturbed
  • minting of PIDs of a prefix disturbed

MEDIUM:

  • no minting of PIDs of a prefix possible
  • resolution of PIDs of a prefix disturbed
  • no generation of a new prefix possible

HIGH:

  • resolution of PIDs of a prefix disturbed
  • minting and resolution of PIDs of a prefix disturbed

CRITICAL:

  • no resolution of PIDs of a prefix possible
  • neither resolution nor minting PIDs of a prefix possible
  • no resolution of PIDs of a couple of prefixes possible
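
For a single prefix, the lists above can be read as a mapping from the state of minting and resolution to an incident class. The following sketch is one possible reading, not an authoritative classifier: the state names are illustrative, and because ‘resolution of PIDs of a prefix disturbed’ is listed under both MEDIUM and HIGH, the sketch conservatively assumes HIGH. Incidents affecting several prefixes or the generation of new prefixes are not covered.

```python
from enum import Enum, IntEnum


class IncidentClass(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


class State(Enum):
    OK = "ok"
    DISTURBED = "disturbed"
    DOWN = "down"


# Per-function severity for a single prefix, read off the lists above.
# Assumption: 'resolution disturbed' appears under both MEDIUM and HIGH,
# so this sketch conservatively uses HIGH.
_MINTING = {
    State.OK: IncidentClass.LOW,
    State.DISTURBED: IncidentClass.LOW,
    State.DOWN: IncidentClass.MEDIUM,
}
_RESOLUTION = {
    State.OK: IncidentClass.LOW,
    State.DISTURBED: IncidentClass.HIGH,
    State.DOWN: IncidentClass.CRITICAL,
}


def classify(minting: State, resolution: State) -> IncidentClass:
    """Incident class for one prefix: the more severe of the two findings."""
    return max(_MINTING[minting], _RESOLUTION[resolution])
```

Taking the maximum of the two findings reproduces the listed combinations, for example HIGH when both minting and resolution are disturbed and CRITICAL when neither is possible.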