Abstract

This document describes the background and methodology for the design of the DataCite profile of DCAT-AP (CiteDCAT-AP), as well as the defined mappings and RDF vocabulary.

Introduction

This document describes the background and methodology for the design of the DataCite profile of DCAT-AP (CiteDCAT-AP), as well as the defined mappings.

The motivation for investigating the possibility of aligning DataCite metadata with DCAT-AP is twofold:

  1. To identify how to create a DCAT-AP-compliant representation of DataCite metadata, in order to enable their sharing across DCAT-AP-enabled data catalogues. This analysis is not meant to provide a complete representation of all DataCite metadata elements, but only of those included in DCAT-AP.
  2. To identify how to create a DataCite-compliant representation of DCAT-AP metadata, in order to enable their publishing on the DataCite infrastructure. This analysis is meant to develop an extension of DCAT-AP, covering all DataCite metadata elements.

The following sections first illustrate the background and the methodology for the design of CiteDCAT-AP, followed by an analysis of related work. Then, a high-level overview of the metadata elements supported in DCAT-AP and DataCite is provided, along with a summary of mapping issues.

The defined mappings are introduced in the mapping summary, and grouped as follows:

The terms (classes and properties) defined in the CiteDCAT-AP namespace and the XSLT-based implementation of the defined mappings are included in the appendices.

Background

The DCAT Application Profile for Data Portals in Europe (DCAT-AP)

DCAT-AP [[DCAT-AP]] is a metadata profile developed in the framework of the EU Programme Interoperability Solutions for European Public Administrations (ISA), and based on and compliant with the W3C Data Catalog vocabulary (DCAT) [[VOCAB-DCAT]] - currently, one of the most widely used Semantic Web vocabularies for describing datasets and data catalogues.

The purpose of DCAT-AP is to define a common interchange metadata format for data portals of the EU and of EU Member States. In order to achieve this, DCAT-AP defines a set of classes and properties, grouped into mandatory, recommended and optional. Such classes and properties correspond to information on datasets and data catalogues that are shared by many European data portals, aiding interoperability. Although DCAT-AP is designed to be independent from its actual implementation, RDF [[RDF-CONCEPTS]] and Linked Data [[LD-BOOK]] are the reference technologies.

DataCite

DataCite is an international initiative meant to enable citation for scientific datasets. To achieve this, DataCite operates a metadata infrastructure, following the same approach used by CrossRef for scientific publications. As such, the DataCite infrastructure is responsible for issuing persistent identifiers (in particular, DOIs) for datasets, and for registering dataset metadata. Such metadata are to be provided according to the DataCite metadata schema [[DataCite]] - which is basically an extension to the DOI one.

Currently, DataCite is the de facto standard for data citation. Therefore, the ability to transform metadata records from and to the DataCite metadata schema would enable, respectively, the harvesting of DataCite records, and the publication of metadata records in the DataCite infrastructure (thus enabling their citation).

Aligning DataCite with DCAT-AP

The motivation for investigating the possibility of aligning DataCite metadata with DCAT-AP is twofold:

  1. To identify how to create a DCAT-AP-compliant representation of DataCite metadata, in order to enable their sharing across DCAT-AP-enabled data catalogues. This analysis is not meant to provide a complete representation of all DataCite metadata elements, but only of those included in DCAT-AP.
  2. To identify how to create a DataCite-compliant representation of DCAT-AP metadata, in order to enable their publishing on the DataCite infrastructure. This analysis is meant to develop an extension of DCAT-AP, covering all DataCite metadata elements.

Regarding point (2), the DataCite-based extension of DCAT-AP is also meant to integrate into DCAT-AP all the information required for data citation.

Based on these considerations, two versions of a DataCite profile of DCAT-AP have been defined, namely, CiteDCAT-AP Core (addressing the requirements of point (1)) and CiteDCAT-AP Extended (addressing the requirements of point (2)). More precisely, the core version includes alignments only for the subset of DataCite metadata elements included in the DCAT-AP specification, whereas the extended version tries to define alignments for all the DataCite metadata elements using DCAT-AP and other Semantic Web vocabularies (whenever DCAT-AP does not offer suitable candidates). As such, CiteDCAT-AP Extended is a superset of CiteDCAT-AP Core, and both are conformant with DCAT-AP.

Methodology

The reference DCAT-AP and DataCite specifications on which CiteDCAT-AP is based are the following ones:

For the mappings, existing work has been taken into account concerning the mapping of DataCite to other metadata standards. In particular:

CiteDCAT-AP builds upon these specifications to provide as complete a mapping as possible of all the metadata elements in version 4.4 of the DataCite metadata schema [[DataCite-20210330]]. Moreover, the defined mappings are backward compatible with earlier versions of the DataCite metadata schema.

The resulting mappings have been grouped into two classes, corresponding to two different CiteDCAT-AP profiles:

As far as the extended profile is concerned, the reference vocabularies have been chosen based on the following criteria:

  1. They have clear persistence and versioning policies.
  2. Preferably, they should be used across domains and data communities.

These criteria determine the main differences from the mappings defined in [[?DC2AP]] and [[?DataCite2RDF]], which are illustrated in the following section.

Comparison with DC2AP & DataCite2RDF

[[DC2AP]] and [[DataCite2RDF]] provide a full mapping of version 3.1 of the DataCite metadata schema. To achieve this, they re-use a number of vocabularies, which can be grouped into two main classes:

  1. General purpose and widely used vocabularies, such as Dublin Core, FOAF, GeoSPARQL, and SKOS.
  2. A set of vocabularies developed specifically to model the publishing and academic domains. They include PRISM (Publishing Requirements for Industry Standard Metadata) [[PRISM]] and FRBR (Functional Requirements for Bibliographic Records) [[FRBR]], plus a set of ontologies developed in the framework of the SPAR (Semantic Publishing and Referencing Ontologies) project.

The current version of CiteDCAT-AP follows the mappings based on the former group of vocabularies, but not the ones based on the latter group. The reason is twofold. First, the persistence and versioning policies of the latter vocabularies are unclear, so it has been considered safer to reconsider their use once CiteDCAT-AP is more consolidated. Second, the mappings defined in [[DC2AP]] and [[DataCite2RDF]] are not compliant with the general requirements of DCAT-AP. For instance, although the resource types defined in DataCite include datasets and metadata records, neither [[DC2AP]] nor [[DataCite2RDF]] makes use of DCAT.

It is worth noting that the second group of ontologies, and in particular the ones developed in the SPAR project, provide interesting solutions to modelling some aspects not explicitly addressed in DCAT-AP - e.g., the possibility of associating a temporal dimension to agent roles - but also alternative solutions for specifying the same information - one of the examples being resource identifiers. Complementing and aligning the different approaches would be mutually beneficial.

DataCite and DCAT-AP at a glance

The following sections provide a high-level comparison of the metadata elements defined in DataCite and DCAT-AP.

DataCite metadata elements supported in DCAT-AP

The following table provides the complete list of DataCite metadata elements, and shows whether they are supported in DCAT-AP.

For each of the DataCite metadata elements, the table specifies whether they are mandatory (M), recommended (R), or optional (O).

| DataCite element | Obligation | Supported in DCAT-AP | Comments |
|---|---|---|---|
| Identifier | M | Partially | DataCite requires this to be a DOI, whereas DCAT-AP has no such requirement. |
| Creator | M | Yes | |
| Title | M | Yes | |
| Publisher | M | Yes | |
| Publication year | M | Yes | |
| Subject | R | Yes | |
| Contributor | R | Partially | DCAT-AP supports only 1 out of the 21 DataCite contributor types (namely, contact point / person). GeoDCAT-AP supports 1 additional DataCite contributor type, namely, rights holder, plus a set of roles from [[?ISO-19115]]. |
| Date | R | Partially | DCAT-AP supports only 2 out of the 9 DataCite date types (namely, issue date and last modified date). GeoDCAT-AP also supports an additional date type, namely, creation date. |
| Resource type | R | Partially | DCAT-AP supports 2 resource types, namely, dcat:Dataset and dcat:DataService. All DataCite resource types fall under the definition of dcat:Dataset, with the exception of event, physical object, and service (dcat:DataService has a more specific semantics compared with the service resource type in DataCite). |
| Related identifier | R | Yes | |
| Description | R | Yes | |
| Geolocation | R | Yes | |
| Language | O | Yes | |
| Alternate identifier | O | Yes | |
| Size | O | Yes | In DCAT-AP, this is a property of the dataset distribution, and not of the dataset itself. |
| Format | O | Yes | In DCAT-AP, this is a property of the dataset distribution, and not of the dataset itself. |
| Version | O | Yes | |
| Rights | O | Yes | DataCite does not use specific elements for use conditions (i.e., licences) and access rights. In DCAT-AP, use conditions are a property of the dataset distribution, whereas access rights are associated with the dataset. |
| Funding Reference | O | No | This element specifies: (a) title, identifier and, possibly, URI of the funding project, and (b) name and identifier of the organisation that awarded that project. |

DCAT-AP classes and properties supported in DataCite

The following table provides the list of classes and properties defined in DCAT-AP, and shows whether they are supported in DataCite.

NB: The list of DCAT-AP classes and properties is here limited to those that are either mandatory (M) or recommended (R).

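| DCAT-AP class | Class obligation | DCAT-AP property | Property obligation | Supported in DataCite | Comments |
|---|---|---|---|---|---|
| Agent | M | name | M | Yes | |
| | | type | R | No | |
| Catalogue | M | dataset | M | No | |
| | | description | M | No | |
| | | publisher | M | No | |
| | | title | M | No | |
| | | homepage | R | No | |
| | | language | R | No | |
| | | licence | R | No | |
| | | release date | R | No | |
| | | themes | R | No | |
| | | spatial / geographic coverage | R | No | |
| | | update / modification date | R | No | |
| Dataset | M | description | M | Yes | In DataCite, this property is recommended, not mandatory. |
| | | title | M | Yes | |
| | | contact point | R | Yes | |
| | | dataset distribution | R | No | |
| | | keyword / tag | R | Yes | |
| | | publisher | R | Yes | |
| | | spatial / geographic coverage | R | Yes | |
| | | temporal coverage | R | No | |
| | | theme / category | R | Yes | |
| Category | R | preferred label | M | Yes | |
| Category scheme | R | title | M | Yes | |
| Distribution | R | accessURL | M | No | |
| | | availability | R | No | |
| | | description | R | No | |
| | | format | R | Yes | In DataCite, this is always a property of the resource itself, even when such resource is a dataset. |
| | | licence | R | Yes | In DataCite, this is always a property of the resource itself, even when such resource is a dataset. |
| Licence document | R | type | R | No | |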

Summary of alignment issues

As shown in the previous section, DCAT-AP is able to represent all DataCite mandatory elements.

On the other hand, DataCite includes all the DCAT-AP mandatory classes and related properties, with the only notable exception of dcat:Catalog. However, this does not pose particular compliance issues, since the catalogue description could be obtained separately from the relevant DataCite records. Actually, since DataCite records are supposed to be all available via the DataCite catalogue, the catalogue description can potentially be the same for all DataCite records. Of course, this does not apply for those records following the DataCite schema but not registered in the DataCite infrastructure.

There are, however, some key differences between the DCAT-AP and DataCite data models that need to be addressed. The following sections outline the solutions adopted in CiteDCAT-AP, as well as open issues.

Resource types

DataCite supports 28 different resource types, 12 of which correspond to the classes included in the DCMI Type vocabulary [[DCTERMS]].

The definition of dcat:Dataset is broad enough to cover most of the DataCite resource types, the exceptions being event, physical object, and service, which are not supported in DCAT-AP (class dcat:DataService is semantically more specific than the DataCite service resource type). For these resource types, it is possible to re-use the DCMI Type vocabulary, which includes classes for event (dctype:Event), physical object (dctype:PhysicalObject), and service (dctype:Service). Moreover, for compliance with [[VOCAB-DCAT-2]], they are also typed as dcat:Resource.

CiteDCAT-AP re-uses the approach outlined above. Moreover, in order to preserve the original information, it uses dct:type with the relevant classes of the DCMI Type vocabulary to denote the DataCite resource type.

The DCMI Type vocabulary does not include classes for data paper, model, and workflow, and no suitable candidates have been found in the reference vocabularies. In order to address this issue, CiteDCAT-AP defines specific classes for these resource types (citedcat:DataPaper, citedcat:Model, citedcat:Workflow), which are specified in metadata records by using property dct:type.
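The typing rules described in this section can be sketched in Python as follows. This is a minimal illustration derived from the mapping table in this document, not the reference XSLT implementation. The function name and the return shape are assumptions, and dct:type values are given only for resource types that map to a single DCMI Type or citedcat class (the additional bibo classes used for, e.g., Book or Report are left out).

```python
# Resource types that cannot be typed as dcat:Dataset.
NOT_DATASETS = {"Event", "PhysicalObject", "Service"}

# Resource types typed as dcat:Dataset in the mapping table.
DATASET_TYPES = {
    "Audiovisual", "Book", "BookChapter", "Collection", "ComputationalNotebook",
    "ConferencePaper", "ConferenceProceeding", "DataPaper", "Dataset",
    "Dissertation", "Journal", "JournalArticle", "Image", "InteractiveResource",
    "Model", "OutputsManagementPlan", "PeerReview", "Preprint", "Report",
    "Software", "Sound", "Standard", "Text", "Workflow",
}

# dct:type values (subset): DCMI Type classes plus the CiteDCAT-AP classes.
DCT_TYPES = {
    "Audiovisual": "dctype:MovingImage",
    "Collection": "dctype:Collection",
    "Dataset": "dctype:Dataset",
    "Event": "dctype:Event",
    "Image": "dctype:Image",
    "InteractiveResource": "dctype:InteractiveResource",
    "PhysicalObject": "dctype:PhysicalObject",
    "Service": "dctype:Service",
    "Software": "dctype:Software",
    "Sound": "dctype:Sound",
    "Text": "dctype:Text",
    "DataPaper": "citedcat:DataPaper",
    "Model": "citedcat:Model",
    "Workflow": "citedcat:Workflow",
}


def map_resource_type(resource_type_general):
    """Return (rdf:type values, dct:type values) for a DataCite resource type."""
    dct_type = DCT_TYPES.get(resource_type_general)
    dct_types = [dct_type] if dct_type else []
    if resource_type_general in NOT_DATASETS:
        # Typed as dcat:Resource and as the matching DCMI Type class.
        return ["dcat:Resource", dct_type], dct_types
    if resource_type_general in DATASET_TYPES:
        return ["dcat:Dataset"], dct_types
    # "Other", missing, or unrecognised values fall back to the default mapping.
    return ["dcat:Resource"], dct_types
```

For instance, `map_resource_type("Workflow")` yields `dcat:Dataset` as the RDF type and `citedcat:Workflow` as the value of dct:type.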

Identifiers

The requirements are basically the following ones:

DCAT-AP already provides a mechanism to specify primary and secondary identifiers, as well as the identifier type. More precisely:

Such solutions basically reflect the DataCite approach to specifying identifiers. However, identifiers specified in this way are of no use for effectively linking the relevant resources. For this purpose, an option would be to encode identifiers as HTTP URIs, whenever possible. This is the case, e.g., for ORCIDs, ISNIs, and DOIs. As for the ability to distinguish between primary and secondary/alternative identifiers, the resource URI can denote the primary identifier, whereas URIs corresponding to alternative identifiers can be specified by using owl:sameAs.
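As an illustration of the option discussed above, the following Python sketch encodes identifiers as HTTP URIs and links alternative identifiers to the primary one with owl:sameAs. The resolver prefixes, the function names, and the sample identifiers are assumptions made for this example, and are not taken from the CiteDCAT-AP implementation.

```python
import re

# Assumed resolver prefixes for identifier schemes that have an HTTP URI form.
RESOLVERS = {
    "DOI": "https://doi.org/",
    "ORCID": "https://orcid.org/",
    "ISNI": "https://isni.org/isni/",
}


def to_http_uri(scheme, value):
    """Return an HTTP URI for the identifier, or None if no URI form is known."""
    value = value.strip()
    if re.match(r"^https?://", value, re.IGNORECASE):
        return value  # already an HTTP URI
    prefix = RESOLVERS.get(scheme.upper())
    if prefix is None:
        return None
    if scheme.upper() == "ISNI":
        value = value.replace(" ", "")  # ISNIs are often written in 4-digit groups
    return prefix + value


def identifier_statements(primary, alternates):
    """Use the primary identifier as resource URI; link alternates via owl:sameAs.

    Both arguments are (scheme, value) pairs; alternates without a URI form are skipped.
    """
    subject = to_http_uri(*primary)
    triples = []
    for scheme, value in alternates:
        uri = to_http_uri(scheme, value)
        if uri is not None:
            triples.append((subject, "owl:sameAs", uri))
    return subject, triples
```

Identifiers that cannot be turned into a URI are simply skipped here; in a full mapping they would still need to be preserved as literals.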

Based on the above, CiteDCAT-AP specifies identifiers as follows:

Agent roles

DataCite supports three main types of agent roles, namely, creator, publisher, and contributor. The last can be further specialised by specifying a contributor "type". DataCite supports 22 contributor types, including, e.g., "contact person", "editor", "funder", "producer", "rights holder", "sponsor", "other".

DCAT-AP supports only three agent roles, namely, creator, publisher and contact point (corresponding to contributor type "contact person" in DataCite). GeoDCAT-AP includes another one of the DataCite agent roles - namely, rights holder - plus a set of agent roles defined in [[?ISO-19115]], whose semantics is not aligned with DataCite.

As a result, together, DCAT-AP and GeoDCAT-AP cover publisher, creator, and 2 contributor types, namely, contact point and rights holder. For the other ones, CiteDCAT-AP includes the following mappings:

It is worth noting that some of the DataCite contributor types cannot be specified with a direct relationship. This is the case of roles "project leader", "project manager", "project member", "researcher", "supervisor", and "workpackage leader". Such roles do not directly describe the relationship between a resource and an agent, but rather the role of the individual in the "activity" that created the resource. E.g., "project leader" can be described as "the leader of the project that created the resource".

In such cases, the approach used in CiteDCAT-AP is as follows:

In case of roles "project leader", "project manager", and "project member", the activity is additionally typed as a foaf:Project.

The following code snippet shows how contributor type "project member" is specified in CiteDCAT-AP:
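A minimal Python sketch of this pattern is given below. It is assembled from the mapping table in this document (dct:contributor, prov:wasGeneratedBy, foaf:Project, citedcat:projectMember) and is not a copy of a snippet from the CiteDCAT-AP specification. The function names, the blank node label, and the sample URIs are hypothetical, and typing the activity as prov:Activity is an assumption based on the range of prov:wasGeneratedBy.

```python
def project_member_triples(resource_uri, agent_uri, activity="_:project"):
    """Return (subject, predicate, object) triples for a "project member" contributor."""
    return [
        # Default mapping: the agent is a generic contributor to the resource.
        (resource_uri, "dct:contributor", agent_uri),
        # The resource is linked to the activity (project) that generated it.
        (resource_uri, "prov:wasGeneratedBy", activity),
        # Assumption: the activity is typed as prov:Activity.
        (activity, "rdf:type", "prov:Activity"),
        # For project roles, the activity is additionally typed as foaf:Project.
        (activity, "rdf:type", "foaf:Project"),
        # The role itself is a property of the project, not of the resource.
        (activity, "citedcat:projectMember", agent_uri),
    ]


def to_turtle_line(triple):
    """Serialise one triple, wrapping absolute URIs in angle brackets."""
    terms = [f"<{t}>" if t.startswith(("http://", "https://")) else t for t in triple]
    return " ".join(terms) + " ."


if __name__ == "__main__":
    for triple in project_member_triples(
        "https://doi.org/10.1234/example",        # hypothetical DOI
        "https://orcid.org/0000-0000-0000-0000",  # hypothetical ORCID
    ):
        print(to_turtle_line(triple))
```

The printed lines use the prefixes listed in the namespace table of this document; prefix declarations are omitted for brevity.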

Distributions

The DataCite data model does not distinguish between a dataset and its embodiment(s) ("distribution(s)", in the DCAT terminology).

As a consequence, attributes that in DCAT/DCAT-AP are specific to distributions (such as format, licence, and size) are associated with the dataset in DataCite. Moreover, in DataCite there is no attribute equivalent to dcat:accessURL or dcat:downloadURL. Actually, the only information that can be used to access the dataset, and, possibly, its distribution(s), is the resource DOI.

Based on this, the approach used in CiteDCAT-AP to map DataCite records is as follows:

  1. If the described resource is an event, physical object, or service (i.e., if it cannot be typed as a dataset), the notion of "distribution" does not apply. Therefore, all DataCite elements are used in CiteDCAT-AP to describe the resource. Otherwise:
  2. Each record is described in CiteDCAT-AP as a dataset (dcat:Dataset), having exactly 1 distribution.
  3. The resulting distribution gets the relevant DataCite elements (such as format, licence, and size), as per the DCAT/DCAT-AP schema, whereas the remaining ones are used to describe the dataset.
  4. The dataset DOI is used both as the dataset identifier / URI and as the distribution access URL.
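The four steps above can be sketched in Python as follows. This is an illustrative outline only: the record layout (a dictionary with `doi`, `resource_type` and `elements` keys), the element names, and the function name are assumptions made for this example, not part of the DataCite schema or of the CiteDCAT-AP implementation.

```python
# DataCite elements that describe the distribution when the resource is a dataset.
DISTRIBUTION_ELEMENTS = {"format", "rights", "size"}

# Resource types to which the notion of "distribution" does not apply (step 1).
NOT_DATASETS = {"Event", "PhysicalObject", "Service"}


def split_record(record):
    """Return (resource, distribution); distribution is None for non-datasets."""
    doi_uri = "https://doi.org/" + record["doi"]
    elements = record["elements"]

    # Step 1: events, physical objects and services keep all their elements.
    if record["resource_type"] in NOT_DATASETS:
        resource = {"uri": doi_uri, "type": "dcat:Resource", "properties": dict(elements)}
        return resource, None

    # Steps 2 and 3: one dataset with exactly one distribution.
    dataset_props = {k: v for k, v in elements.items() if k not in DISTRIBUTION_ELEMENTS}
    dist_props = {k: v for k, v in elements.items() if k in DISTRIBUTION_ELEMENTS}

    # Step 4: the DOI is both the dataset URI and the distribution access URL.
    dist_props["dcat:accessURL"] = doi_uri

    dataset = {"uri": doi_uri, "type": "dcat:Dataset", "properties": dataset_props}
    distribution = {"type": "dcat:Distribution", "properties": dist_props}
    return dataset, distribution
```

Keeping the split in a single function makes it easy to check that no element is lost: every input element ends up either on the dataset or on its distribution.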

Use and access conditions

DataCite includes a single element, namely, "rights", to specify use and access conditions. This element is also supported in DCAT-AP (dct:rights), but, in addition, specific properties are used for use conditions (dct:license) and access rights (dct:accessRights). Moreover, in DCAT-AP use conditions are associated with distributions, whereas access rights with datasets.

Based on this, CiteDCAT-AP maps by default DataCite "rights" to dct:rights. In addition, they are mapped to dct:license and dct:accessRights when DataCite rights make explicit reference to some known licences and access rights vocabularies. More precisely, the recognised vocabularies are the following ones:

Keywords and controlled vocabularies

DataCite supports the specification of both free-text keywords and keywords from controlled vocabularies.

For the latter case, DCAT-AP recommends the use of URIs, but in DataCite only textual labels are used.

To comply with the DCAT-AP recommendation, an option is to implement mappings from textual labels to URIs. However, this poses two main issues:

  1. DataCite does not require / recommend the use of specific vocabularies, nor a particular format for the textual labels.
  2. It is often the case that no URIs are available for the used vocabularies.

This situation makes it difficult to implement vocabulary mapping effectively.

For this reason, CiteDCAT-AP preserves keywords from controlled vocabularies as textual labels, unless attribute @valueURI is specified, or the textual label is a URI.
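The resulting decision logic for the DataCite subject element can be sketched in Python as follows, based on the rows for Subject in the mapping table of this document. The function name, the shape of the returned values, and the keys `prefLabel` and `inScheme` are assumptions made for illustration; the theme prefix is the code list URI given in this document.

```python
# Code list URI for themes, as listed in this document.
EU_THEMES = "http://publications.europa.eu/resource/authority/theme"


def is_uri(text):
    """Rough check for an HTTP(S) URI; a simplification for this sketch."""
    return text.startswith(("http://", "https://"))


def map_subject(label, value_uri=None, subject_scheme=None):
    """Return (property, value) for one DataCite subject."""
    # A URI is available if @valueURI is set, or if the label itself is a URI.
    uri = value_uri or (label if is_uri(label) else None)
    if uri is not None:
        prop = "dcat:theme" if uri.startswith(EU_THEMES + "/") else "dct:subject"
        return prop, {"uri": uri}
    if subject_scheme:
        # Keyword from a controlled vocabulary, preserved as a textual label.
        concept = {"type": "skos:Concept", "prefLabel": label, "inScheme": subject_scheme}
        return "dct:subject", concept
    # Free-text keyword.
    return "dcat:keyword", label
```

For example, a subject with no scheme and no URI, such as `map_subject("Oceanography")`, is mapped to dcat:keyword with the label kept as a plain literal.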

Mapping summary

The following section summarises the alignments defined in CiteDCAT-AP.

The alignments are grouped as follows:

The alignments supported only in the extended profile of CiteDCAT-AP are in bold.

Used namespaces

Prefix Namespace URI Schema & documentation
adms http://www.w3.org/ns/adms# [[VOCAB-ADMS]]
bibo http://purl.org/ontology/bibo/ [[BIBO]]
citedcat https://w3id.org/citedcat-ap/ CiteDCAT-AP
dcat http://www.w3.org/ns/dcat# [[?VOCAB-DCAT]]
dct http://purl.org/dc/terms/ [[DCTERMS]]
dctype http://purl.org/dc/dcmitype/ [[DCTERMS]]
foaf http://xmlns.com/foaf/0.1/ [[FOAF]]
gsp http://www.opengis.net/ont/geosparql# [[GeoSPARQL]]
locn http://www.w3.org/ns/locn# [[LOCN]]
org http://www.w3.org/ns/org# [[VOCAB-ORG]]
owl http://www.w3.org/2002/07/owl# [[OWL-REF]]
prov http://www.w3.org/ns/prov# [[PROV-O]]
rdf http://www.w3.org/1999/02/22-rdf-syntax-ns# [[RDF-CONCEPTS]]
rdfs http://www.w3.org/2000/01/rdf-schema# [[RDF-SCHEMA]]
skos http://www.w3.org/2004/02/skos/core# [[SKOS-REFERENCE]]
vcard http://www.w3.org/2006/vcard/ns# [[vCARD-RDF]]
xsd http://www.w3.org/2001/XMLSchema# [[XMLSCHEMA11-2]]
wdrs http://www.w3.org/2007/05/powder-s# [[POWDER-S]]

Reference code lists for metadata elements

DataCite metadata elements Code list URI Code lists Status
Date (type: withdrawn) http://publications.europa.eu/resource/authority/dataset-status [[!EUV-DS]] stable
Format http://publications.europa.eu/resource/authority/file-type [[!EUV-FT]] stable
http://www.iana.org/assignments/media-types [[!IANA-MEDIA-TYPES]] testing
Language http://publications.europa.eu/resource/authority/language [[!EUV-LANG]] stable
Subject http://publications.europa.eu/resource/authority/theme [[!EUV-THEMES]] stable

1st-level mappings

The mappings illustrated in this section concern the 1st-level elements in the DataCite metadata schema.

These elements specify properties / relationships that, in some cases, can be further specialised with an attribute denoting their sub-type (e.g., the "type" of resource, the "type" of contributor, the "type" of related resource). For this reason, elements having a "type" attribute have both a default mapping for the element, and a specific mapping for the type. The default mapping is used in the following cases:

As a rule, the domain of the mappings is the one corresponding to the ResourceType element (i.e., dcat:Resource, dcat:Dataset, dctype:Service, or dctype:Event). However, "starred" elements - i.e., elements whose name is preceded by an asterisk ("*") - are those having as domain dcat:Distribution when the resource is typed as a dcat:Dataset.

Element Type Mappings Mapping status Comments
Property or RDF/XML attribute Range
Identifier @rdf:about rdfs:Resource (URI reference) testing
dct:identifier xsd:anyURI testing
dcat:landingPage rdfs:Resource (URI reference) testing If the resource is typed as a dcat:Dataset
foaf:page rdfs:Resource (URI reference) testing If the resource is not typed as a dcat:Dataset
dcat:accessURL rdfs:Resource (URI reference) testing If the resource is typed as a dcat:Dataset, the domain is dcat:Distribution
Creator dct:creator foaf:Agent testing
Title default dct:title rdf:PlainLiteral testing
AlternativeTitle dct:alternative rdf:PlainLiteral testing
Subtitle ??:?? rdf:PlainLiteral unstable TBD
TranslatedTitle dct:title rdf:PlainLiteral testing
Publisher dct:publisher foaf:Agent testing
PublicationYear dct:issued xsd:gYear testing
Subject dcat:theme skos:Concept (URI reference) testing If the subject is a URI from [[EUV-THEMES]]
dct:subject skos:Concept (URI reference) testing If the subject is a URI
skos:Concept testing If the subject is associated with a subject scheme
dcat:keyword rdf:PlainLiteral testing If the subject is not associated with a subject scheme
Contributor default dct:contributor foaf:Agent testing Only for the extended profile
ContactPerson dcat:contactPoint vcard:Individual testing
DataCollector citedcat:dataCollector foaf:Agent testing Only for the extended profile
DataCurator citedcat:dataCurator foaf:Agent testing Only for the extended profile
DataManager citedcat:dataManager foaf:Agent testing Only for the extended profile
Distributor bibo:distributor foaf:Agent testing Only for the extended profile
Editor bibo:editor foaf:Agent testing Only for the extended profile
Funder citedcat:funder foaf:Agent testing Only for the extended profile. This element has been deprecated in [[?DataCite-20160916]], in favour of new element FundingReference.

HostingInstitution citedcat:hostingInstitution foaf:Agent testing Only for the extended profile
Producer bibo:producer foaf:Agent testing Only for the extended profile
ProjectLeader dct:contributor foaf:Agent testing Only for the extended profile
citedcat:projectLeader testing Only for the extended profile. The domain of property citedcat:projectLeader is class foaf:Project. The resource is linked to foaf:Project with property prov:wasGeneratedBy.
ProjectManager dct:contributor foaf:Agent testing Only for the extended profile
citedcat:projectManager testing Only for the extended profile. The domain of property citedcat:projectManager is class foaf:Project. The resource is linked to foaf:Project with property prov:wasGeneratedBy.
ProjectMember dct:contributor foaf:Agent testing Only for the extended profile
citedcat:projectMember testing Only for the extended profile. The domain of property citedcat:projectMember is class foaf:Project. The resource is linked to foaf:Project with property prov:wasGeneratedBy.

RegistrationAgency citedcat:registrationAgency foaf:Agent testing Only for the extended profile
RegistrationAuthority citedcat:registrationAuthority foaf:Agent testing Only for the extended profile
RelatedPerson ??:?? foaf:Agent unstable TBD
Researcher citedcat:researcher foaf:Agent testing Only for the extended profile
ResearchGroup citedcat:researchGroup foaf:Agent testing Only for the extended profile
RightsHolder dct:rightsHolder foaf:Agent testing Only for the extended profile
Sponsor citedcat:sponsor foaf:Agent testing Only for the extended profile
Supervisor citedcat:supervisor foaf:Agent testing Only for the extended profile
WorkPackageLeader citedcat:workPackageLeader foaf:Agent testing Only for the extended profile
Other dct:contributor foaf:Agent testing Only for the extended profile
Date default dct:date xsd:date testing Only for the extended profile
Accepted dct:dateAccepted xsd:date testing Only for the extended profile
Available dct:available xsd:date testing Only for the extended profile
Copyrighted dct:dateCopyrighted xsd:date testing Only for the extended profile
Collected dct:temporal dct:PeriodOfTime testing The start and end of the period of time are specified by using properties dcat:startDate and dcat:endDate, respectively.
Created dct:created xsd:date testing Only for the extended profile
Issued dct:issued xsd:date testing
Other dct:date xsd:date testing Only for the extended profile. Added in [[?DataCite-20171023]].
Submitted dct:dateSubmitted xsd:date testing Only for the extended profile
Updated dct:modified xsd:date testing
Valid dct:valid xsd:date testing Only for the extended profile
Withdrawn dct:modified xsd:date testing Only for the extended profile. Added in [[?DataCite-20190320]].
dct:type http://publications.europa.eu/resource/authority/dataset-status/WITHDRAWN
Language dct:language dct:LinguisticSystem testing
ResourceType default rdf:type dcat:Resource testing
Audiovisual rdf:type dcat:Dataset testing
dct:type dctype:MovingImage testing Only for the extended profile
Book rdf:type dcat:Dataset testing Added in [[DataCite-20210330]]
dct:type dctype:Text bibo:Book testing Only for the extended profile
BookChapter rdf:type dcat:Dataset testing Added in [[DataCite-20210330]]
dct:type dctype:Text bibo:Chapter testing Only for the extended profile
Collection rdf:type dcat:Dataset testing
dct:type dctype:Collection testing Only for the extended profile
ComputationalNotebook rdf:type dcat:Dataset testing Added in [[DataCite-20210330]]
dct:type dctype:InteractiveResource ??:?? unstable TBD. Only for the extended profile.
ConferencePaper rdf:type dcat:Dataset testing Added in [[DataCite-20210330]]
dct:type dctype:Text ??:?? unstable TBD. Only for the extended profile.

ConferenceProceeding rdf:type dcat:Dataset testing Added in [[DataCite-20210330]]
dct:type dctype:Text bibo:Proceedings testing Only for the extended profile
DataPaper rdf:type dcat:Dataset testing Added in [[?DataCite-20171023]]
dct:type citedcat:DataPaper testing Only for the extended profile
Dataset rdf:type dcat:Dataset testing
dct:type dctype:Dataset testing Only for the extended profile
Dissertation rdf:type dcat:Dataset testing Added in [[DataCite-20210330]]
dct:type dctype:Text bibo:Thesis testing Only for the extended profile
Event rdf:type dcat:Resource testing Only for the extended profile
dctype:Event
dct:type dctype:Event testing Only for the extended profile
Journal rdf:type dcat:Dataset testing Added in [[DataCite-20210330]]
dct:type dctype:Text bibo:Journal testing Only for the extended profile
JournalArticle rdf:type dcat:Dataset testing Added in [[DataCite-20210330]]
dct:type dctype:Text ??:?? unstable TBD. Only for the extended profile.

Image rdf:type dcat:Dataset testing
dct:type dctype:Image testing Only for the extended profile
InteractiveResource rdf:type dcat:Dataset testing
dct:type dctype:InteractiveResource testing Only for the extended profile
Model rdf:type dcat:Dataset testing
dct:type citedcat:Model testing
OutputsManagementPlan rdf:type dcat:Dataset testing Added in [[DataCite-20210330]]
dct:type dctype:Text ??:?? unstable TBD. Only for the extended profile.
PeerReview rdf:type dcat:Dataset testing Added in [[DataCite-20210330]]
dct:type dctype:Text ??:?? unstable TBD. Only for the extended profile.
PhysicalObject rdf:type dcat:Resource testing Only for the extended profile
dctype:PhysicalObject
dct:type dctype:PhysicalObject testing Only for the extended profile
Preprint rdf:type dcat:Dataset testing Added in [[DataCite-20210330]]
dct:type dctype:Text ??:?? unstable TBD. Only for the extended profile.

Report rdf:type dcat:Dataset testing Added in [[DataCite-20210330]]
dct:type dctype:Text bibo:Report testing Only for the extended profile
Service rdf:type dcat:Resource testing Only for the extended profile
dctype:Service
dct:type dctype:Service testing Only for the extended profile
Software rdf:type dcat:Dataset testing
dct:type dctype:Software testing Only for the extended profile
Sound rdf:type dcat:Dataset testing
dct:type dctype:Sound testing Only for the extended profile
Standard rdf:type dcat:Dataset testing Added in [[DataCite-20210330]]
dct:type dct:Standard bibo:Standard testing Only for the extended profile
Text rdf:type dcat:Dataset testing
dct:type dctype:Text testing Only for the extended profile
Workflow rdf:type dcat:Dataset testing
dct:type citedcat:Workflow testing
Other rdf:type dcat:Resource testing
AlternateIdentifier owl:sameAs URI reference testing
adms:identifier adms:Identifier testing
RelatedIdentifier default dct:relation rdfs:Resource testing
IsCitedBy bibo:citedBy rdfs:Resource testing
Cites bibo:cites rdfs:Resource testing Only for the extended profile
IsSupplementTo citedcat:isSupplementTo rdfs:Resource testing Only for the extended profile
IsSupplementedBy citedcat:isSupplementedBy rdfs:Resource testing Only for the extended profile
IsContinuedBy citedcat:isContinuedBy rdfs:Resource testing Only for the extended profile
Continues citedcat:continues rdfs:Resource testing Only for the extended profile
HasMetadata foaf:isPrimaryTopicOf dcat:CatalogRecord (URI reference) testing
IsMetadataFor foaf:primaryTopic rdfs:Resource (URI reference) testing
IsNewVersionOf prov:wasRevisionOf rdfs:Resource testing Only for the extended profile
IsPreviousVersionOf prov:hadRevision rdfs:Resource testing Only for the extended profile
IsPartOf dct:isPartOf rdfs:Resource testing Only for the extended profile
HasPart dct:hasPart rdfs:Resource testing Only for the extended profile
IsReferencedBy dct:isReferencedBy rdfs:Resource testing
References dct:references rdfs:Resource testing Only for the extended profile
IsDocumentedBy foaf:page rdfs:Resource testing
Documents foaf:topic rdfs:Resource testing Only for the extended profile
IsCompiledBy citedcat:isCompiledBy rdfs:Resource testing Only for the extended profile
Compiles citedcat:compiles rdfs:Resource testing Only for the extended profile
IsVariantFormOf citedcat:isVariantFormOf rdfs:Resource testing Only for the extended profile
IsOriginalFormOf citedcat:isOriginalFormOf rdfs:Resource testing Only for the extended profile
IsIdenticalTo owl:sameAs rdfs:Resource testing Only for the extended profile
IsReviewedBy citedcat:isReviewedBy rdfs:Resource testing Only for the extended profile
Reviews bibo:reviewOf rdfs:Resource testing Only for the extended profile
IsDerivedFrom dct:source rdfs:Resource testing
IsSourceOf prov:hadDerivation rdfs:Resource testing Only for the extended profile
Describes citedcat:describes rdfs:Resource testing Only for the extended profile. Added in [[?DataCite-20171023]].
IsDescribedBy wdrs:describedby rdfs:Resource testing Only for the extended profile. Added in [[?DataCite-20171023]].
HasVersion dct:hasVersion rdfs:Resource testing Added in [[?DataCite-20171023]].
IsVersionOf dct:isVersionOf rdfs:Resource testing Added in [[?DataCite-20171023]].
Requires dct:requires rdfs:Resource testing Only for the extended profile. Added in [[?DataCite-20171023]].
IsRequiredBy dct:isRequiredBy rdfs:Resource testing Only for the extended profile. Added in [[?DataCite-20171023]].
Obsoletes dct:replaces rdfs:Resource testing Only for the extended profile. Added in [[?DataCite-20190320]].
IsObsoletedBy dct:isReplacedBy rdfs:Resource testing Only for the extended profile. Added in [[?DataCite-20190320]].
IsPublishedIn dct:isPartOf rdfs:Resource testing Only for the extended profile. Added in [[DataCite-20210330]].

* Size dct:extent dct:SizeOrDuration testing

If the resource is typed as a dcat:Dataset, the domain is dcat:Distribution.

Only for the extended profile.

* Format dct:format dct:MediaTypeOrExtent testing

If not specified with an IANA media type

If the resource is typed as a dcat:Dataset, the domain is dcat:Distribution.

dcat:mediaType dct:MediaTypeOrExtent (URI reference) testing

If specified with an IANA media type

If the resource is typed as a dcat:Dataset, the domain is dcat:Distribution.

Version owl:versionInfo rdf:PlainLiteral testing
* Rights dct:rights dct:RightsStatement testing If the resource is typed as a dcat:Dataset, the domain is dcat:Distribution.
Description default dct:description rdf:PlainLiteral testing
Abstract dct:description rdf:PlainLiteral testing
Methods dct:provenance dct:ProvenanceStatement testing
SeriesInformation bibo:locator rdf:PlainLiteral testing Only for the extended profile.
TableOfContents dct:tableOfContents rdf:PlainLiteral testing Only for the extended profile.
Other rdfs:comment rdf:PlainLiteral testing Only for the extended profile.
GeoLocation dct:spatial dct:Location testing
FundingReference citedcat:isFundedBy foaf:Project testing

Added in [[?DataCite-20160916]].

Only for the extended profile.
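
The relation type rows above amount to a lookup from the DataCite relationType value to an RDF property. The following Python fragment is a minimal sketch of that lookup, not the XSLT implementation: it covers only a subset of the rows, and the names `RELATION_TYPE_MAPPINGS` and `relation_property` are hypothetical. Two behaviours are assumptions made for illustration: an unlisted relation type falls back to the default dct:relation, and a mapping marked "Only for the extended profile" yields no property when the core profile is requested.

```python
# Subset of the relationType rows of the table above.
# Each entry is (property, extended_profile_only).
RELATION_TYPE_MAPPINGS = {
    "IsCitedBy": ("bibo:citedBy", False),
    "Cites": ("bibo:cites", True),
    "HasMetadata": ("foaf:isPrimaryTopicOf", False),
    "IsMetadataFor": ("foaf:primaryTopic", False),
    "IsNewVersionOf": ("prov:wasRevisionOf", True),
    "IsPartOf": ("dct:isPartOf", True),
    "IsReferencedBy": ("dct:isReferencedBy", False),
    "IsDerivedFrom": ("dct:source", False),
    "HasVersion": ("dct:hasVersion", False),
    "Obsoletes": ("dct:replaces", True),
}

# The "default" row of the table.
DEFAULT_PROPERTY = "dct:relation"


def relation_property(relation_type, extended=False):
    """Return the property for a DataCite relationType, or None if unmapped.

    Assumption: unlisted relation types use the default property, and
    extended-only mappings are skipped in the core profile.
    """
    if relation_type not in RELATION_TYPE_MAPPINGS:
        return DEFAULT_PROPERTY
    prop, extended_only = RELATION_TYPE_MAPPINGS[relation_type]
    if extended_only and not extended:
        return None
    return prop
```

For instance, `relation_property("Cites")` returns `None` under these assumptions, while `relation_property("Cites", extended=True)` returns `"bibo:cites"`.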

2nd-level mappings

The mappings illustrated in this section concern the 2nd-level elements in the DataCite metadata schema.

These elements, and the corresponding mappings, are grouped in the following classes:

Elements with child elements

Element Child elements Mappings Mapping status Comments
Domain Property or RDF/XML attribute Range
Creator creatorName foaf:Agent foaf:name rdf:PlainLiteral testing
givenName foaf:givenName rdf:PlainLiteral testing
familyName foaf:familyName rdf:PlainLiteral testing
nameIdentifier @rdf:about URI reference testing
affiliation org:memberOf foaf:Organization testing
Contributor contributorName foaf:Agent foaf:name rdf:PlainLiteral testing
vcard:Individual vcard:fn rdf:PlainLiteral testing If the contributor type is "ContactPerson"
givenName foaf:Agent foaf:givenName rdf:PlainLiteral testing
vcard:Individual vcard:given-name rdf:PlainLiteral testing If the contributor type is "ContactPerson"
familyName foaf:Agent foaf:familyName rdf:PlainLiteral testing
vcard:Individual vcard:family-name rdf:PlainLiteral testing If the contributor type is "ContactPerson"
nameIdentifier foaf:Agent @rdf:about URI reference testing
vcard:Individual testing If the contributor type is "ContactPerson"
affiliation foaf:Agent org:memberOf foaf:Organization testing
vcard:Individual vcard:organization-name rdf:PlainLiteral testing If the contributor type is "ContactPerson"
GeoLocation geoLocationPlace dct:Location skos:prefLabel rdfs:Literal testing
geoLocationPoint dcat:centroid gsp:gmlLiteral testing

In [[?DataCite-20160916]], this information is specified by using 2 child elements - namely, pointLatitude and pointLongitude.

Earlier versions of DataCite use a literal instead.

gsp:wktLiteral
geoLocationBox dcat:bbox gsp:wktLiteral testing

In [[?DataCite-20160916]], this information is specified by using 4 child elements - namely, northBoundLatitude, eastBoundLongitude, southBoundLatitude, and westBoundLongitude.

Earlier versions of DataCite use a literal instead.

gsp:gmlLiteral
geoLocationPolygon locn:geometry gsp:wktLiteral testing

Added in [[?DataCite-20160916]].

The polygon vertices are specified by using child element geoPolygonPoint. The coordinates of each vertex are specified by using two child elements - respectively, pointLatitude and pointLongitude.

[[?DataCite-20171023]] introduces an additional element inPolygonPoint to mark the inside of the polygon for "any bound area that is larger than half the earth". This element is not needed in the transformed geometry encoding, so it is not mapped.

gsp:gmlLiteral
FundingReference awardNumber foaf:Project dct:identifier xsd:string | xsd:anyURI testing
awardTitle dct:title rdf:PlainLiteral testing
* funderName foaf:Organization foaf:name rdf:PlainLiteral testing

The "funding project" (foaf:Project) is linked to the "funder" (foaf:Organization) by using property citedcat:isAwardedBy.

The domain is foaf:Organization.

* funderIdentifier @rdf:about URI reference testing
dct:identifier xsd:string | xsd:anyURI testing
relatedItemIdentifier rdfs:Resource dct:identifier xsd:string | xsd:anyURI testing Added in [[DataCite-20210330]]
creator Same mapping as element Creator. testing Added in [[DataCite-20210330]]
contributor Same mapping as element Contributor. testing Added in [[DataCite-20210330]]
title Same mapping as element Title. testing Added in [[DataCite-20210330]]
publicationYear dct:issued xsd:gYear testing Added in [[DataCite-20210330]]
volume bibo:volume rdfs:Literal testing Added in [[DataCite-20210330]]
issue bibo:issue rdfs:Literal testing Added in [[DataCite-20210330]]
number The property depends on the value of attribute @numberType. rdfs:Literal testing Added in [[DataCite-20210330]]
firstPage bibo:pageStart rdfs:Literal testing Added in [[DataCite-20210330]]
lastPage bibo:pageEnd rdfs:Literal testing Added in [[DataCite-20210330]]
publisher dct:publisher foaf:Agent testing Added in [[DataCite-20210330]]
edition bibo:edition rdfs:Literal testing Added in [[DataCite-20210330]]
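
The GeoLocation rows above turn DataCite coordinate child elements (pointLatitude, pointLongitude, the four box bounds, and geoPolygonPoint vertices) into geometry literals. The following Python fragment is a minimal sketch of how the gsp:wktLiteral values could be built; it is not taken from the XSLT implementation. The function names are hypothetical, and the longitude/latitude axis order is an assumption based on the default coordinate reference system of GeoSPARQL WKT literals.

```python
def point_to_wkt(latitude, longitude):
    """Build a WKT point from pointLatitude / pointLongitude.

    Assumption: coordinates are written in longitude/latitude order.
    """
    lat, lon = float(latitude), float(longitude)
    return f"POINT({lon} {lat})"


def box_to_wkt(west, east, south, north):
    """Build a closed WKT polygon from the four bounding box elements."""
    w, e, s, n = (float(v) for v in (west, east, south, north))
    return f"POLYGON(({w} {n}, {e} {n}, {e} {s}, {w} {s}, {w} {n}))"


def polygon_to_wkt(points):
    """Build a WKT polygon from (latitude, longitude) vertex pairs.

    The ring is closed by repeating the first vertex when needed.
    """
    ring = [(float(lat), float(lon)) for lat, lon in points]
    if ring[0] != ring[-1]:
        ring.append(ring[0])
    coords = ", ".join(f"{lon} {lat}" for lat, lon in ring)
    return f"POLYGON(({coords}))"
```

The inputs are passed through `float()` so that the textual content of the DataCite elements can be used directly; a value that is not a number raises `ValueError`.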

Elements with attributes

Element Textual content & attributes Mappings Mapping status Comments
Domain Property or RDF/XML attribute Range
creatorName @nameType = Organizational foaf:Organization foaf:name rdf:PlainLiteral testing Added in [[?DataCite-20171023]]
@nameType = Personal foaf:Person
contributorName @nameType = Organizational foaf:Organization foaf:name rdf:PlainLiteral testing Added in [[?DataCite-20171023]]
@nameType = Personal foaf:Person
affiliation textual content foaf:Organization foaf:name rdf:PlainLiteral testing
@affiliationIdentifier @rdf:about URI reference testing

Added in [[DataCite-20190816]]

dct:identifier xsd:string | xsd:anyURI testing
Date @dateInformation ??:?? ??:?? rdf:PlainLiteral unstable

TBD

Added in [[?DataCite-20171023]]

AlternateIdentifier textual content adms:Identifier skos:notation rdfs:Literal testing
@alternateIdentifierType adms:schemeAgency rdfs:Literal
@resourceTypeGeneral rdfs:Resource rdf:type, dct:type The same set of types of element ResourceType testing

Added in [[?DataCite-20171023]]

* @relatedMetadataScheme dct:Standard dct:title rdf:PlainLiteral testing

Only when @relationType is HasMetadata.

The domain is dct:Standard

Added in [[?DataCite-20171023]]

* @schemeURI dct:Standard @rdf:about URI reference testing

Only when @relationType is HasMetadata.

The domain is dct:Standard

Added in [[?DataCite-20171023]]

Subject textual content skos:Concept skos:prefLabel rdf:PlainLiteral testing
@schemeURI skos:inScheme skos:ConceptScheme (URI reference) testing
@valueURI @rdf:about URI reference testing Added in [[?DataCite-20160916]].
@classificationCode skos:notation rdfs:Literal testing

Only for the extended profile

Added in [[DataCite-20210330]]

* @subjectScheme skos:ConceptScheme dct:title rdf:PlainLiteral testing The domain is skos:ConceptScheme
Rights textual content dct:RightsStatement rdfs:label rdf:PlainLiteral testing
@rightsURI @rdf:about URI reference testing
* @rightsIdentifier adms:Identifier skos:notation rdfs:Literal testing

The domain is adms:Identifier

Added in [[?DataCite-20190320]]

* @rightsIdentifierScheme adms:schemeAgency rdfs:Literal testing
* @rightsIdentifierSchemeURI dct:creator URI reference testing
awardNumber textual content foaf:Project dct:identifier xsd:string | xsd:anyURI testing
@awardURI @rdf:about URI reference testing
@relatedItemType rdfs:Resource rdf:type, dct:type The same set of types of element ResourceType testing

Added in [[DataCite-20210330]]

@relationType

This attribute denotes the relationship between the Resource and the RelatedItem.

The set of relationships is the same as the one used for element RelatedIdentifier.

testing

Added in [[DataCite-20210330]]

* @relatedMetadataScheme dct:Standard dct:title rdf:PlainLiteral testing

Only when @relationType (RelatedItem) is HasMetadata.

The domain is dct:Standard

Added in [[DataCite-20210330]]

* @schemeURI dct:Standard @rdf:about URI reference testing

Only when @relationType (RelatedItem) is HasMetadata.

The domain is dct:Standard

Added in [[DataCite-20210330]]

number default rdfs:Resource bibo:number rdfs:Literal testing Added in [[DataCite-20210330]]
@numberType = Article ??:?? rdfs:Literal unstable

TBD

Added in [[DataCite-20210330]]

@numberType = Chapter bibo:chapter rdfs:Literal testing Added in [[DataCite-20210330]]
@numberType = Report ??:?? rdfs:Literal unstable

TBD

Added in [[DataCite-20210330]]

@numberType = Other bibo:number rdfs:Literal testing Added in [[DataCite-20210330]]
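
Several rows of the table above select a class or property from an attribute value. The following Python fragment is a minimal sketch of two such rules, @nameType and @numberType; the function names are hypothetical. It assumes that foaf:Agent (the domain used for Creator and Contributor in the previous table) applies when @nameType is absent or unrecognised, and it returns `None` for the @numberType values whose mapping is still marked TBD.

```python
# @nameType values and the corresponding class for creatorName / contributorName.
NAME_TYPE_CLASSES = {
    "Organizational": "foaf:Organization",
    "Personal": "foaf:Person",
}

# @numberType values and the corresponding property for element number.
# None marks the values whose mapping is still TBD in the table.
NUMBER_TYPE_PROPERTIES = {
    "Article": None,
    "Chapter": "bibo:chapter",
    "Report": None,
    "Other": "bibo:number",
}


def agent_class(name_type=None):
    """Return the class for a creator or contributor.

    Assumption: foaf:Agent is used when @nameType is missing or unknown.
    """
    return NAME_TYPE_CLASSES.get(name_type, "foaf:Agent")


def number_property(number_type=None):
    """Return the property for element number, or None if not yet defined.

    A missing or unrecognised @numberType uses the default, bibo:number
    (treating unrecognised values this way is an assumption).
    """
    return NUMBER_TYPE_PROPERTIES.get(number_type, "bibo:number")
```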

Identifiers

DataCite supports the use of persistent identifiers to denote:

In DataCite, such identifiers are specified as follows:

In CiteDCAT-AP, all these identifiers are mapped to URIs by concatenating the identifier in the DataCite record with a URI prefix defined for each identifier type / scheme. Dereferenceable HTTP URIs/URLs are used whenever possible; otherwise, URNs are used.

Notably, DataCite provides code lists for the types / schemes of identifiers used to denote resources and funders, but defines no code list for the types / schemes of identifiers used to denote resource creators / contributors (the specification uses "ORCID" and "ISNI" as examples).

However, DataCite does not specify a code list for scheme URIs. The mapping between identifier types / schemes and URI prefixes implemented in CiteDCAT-AP is therefore based on the relevant registries and on the examples in the DataCite metadata schema specification. No URI prefix is used if the identifier is already a URI (as is the case for URLs and URNs).

The following table shows, for each identifier type / scheme, the URI prefix used in CiteDCAT-AP, along with examples of the results of such mappings. As mentioned above, all the identifier types / schemes in the table are defined in a code list in the DataCite metadata schema, with the exception of ORCID and ISNI (ISNI is, however, defined in the code list for funder identifier types).

Identifier type / scheme Element(s) URI prefix used in CiteDCAT-AP Example Mapping status Comments
Original Transformed
ORCID nameIdentifier https://orcid.org/ 0000-0002-7285-027X https://orcid.org/0000-0002-7285-027X testing
ISNI nameIdentifier https://www.isni.org/ 0000000121032683 https://www.isni.org/0000000121032683 testing
affiliationIdentifier
funderIdentifier
GRID affiliationIdentifier https://www.grid.ac/institutes/ grid.270680.b https://www.grid.ac/institutes/grid.270680.b testing
funderIdentifier
CrossRef Funder ID affiliationIdentifier https://doi.org/ 10.13039/501100000900 https://doi.org/10.13039/501100000900 testing
funderIdentifier
ROR affiliationIdentifier https://ror.org/ 04j5wtv36 https://ror.org/04j5wtv36 testing
funderIdentifier
DOI Identifier https://doi.org/ 10.1016/j.epsl.2011.11.037 https://doi.org/10.1016/j.epsl.2011.11.037 testing
AlternateIdentifier
RelatedIdentifier
ARK AlternateIdentifier http://n2t.net/ ark:/67531/metapth346793/ http://n2t.net/ark:/67531/metapth346793/ testing
RelatedIdentifier
arΧiv AlternateIdentifier http://arxiv.org/abs/ arXiv:0706.0001 http://arxiv.org/abs/0706.0001 testing The URI prefix replaces the namespace prefix arXiv: in the original identifier
RelatedIdentifier
bibcode AlternateIdentifier http://adsabs.harvard.edu/abs/ 2014Wthr...69...72C http://adsabs.harvard.edu/abs/2014Wthr...69...72C testing
RelatedIdentifier
EAN13 AlternateIdentifier urn:ean-13: 9783468111242 urn:ean-13:9783468111242 unstable
RelatedIdentifier
e-ISSN AlternateIdentifier http://issn.org/resource/ISSN/ 1562-6865 http://issn.org/resource/ISSN/1562-6865 unstable
RelatedIdentifier
Handle AlternateIdentifier http://hdl.handle.net/ 10013/epic.10033 http://hdl.handle.net/10013/epic.10033 testing
RelatedIdentifier
IGSN AlternateIdentifier http://hdl.handle.net/10273/ (https://doi.org/10273/) SSH000SUA http://hdl.handle.net/10273/SSH000SUA (https://doi.org/10273/SSH000SUA) stable Added in [[?DataCite-20160916]].
RelatedIdentifier
ISBN AlternateIdentifier urn:isbn: 978-3-905673-82-1 urn:isbn:978-3-905673-82-1 unstable
RelatedIdentifier
ISSN AlternateIdentifier http://issn.org/resource/ISSN/ 0077-5606 http://issn.org/resource/ISSN/0077-5606 unstable
RelatedIdentifier
ISTC AlternateIdentifier http://istc-search-beta.peppertag.com/ptproc/IstcSearch?tFrame=IstcListing&esfIstc= A12-2014-00013328-5 http://istc-search-beta.peppertag.com/ptproc/IstcSearch?tFrame=IstcListing&tForceNewQuery=Yes&esfIstc=A12-2014-00013328-5 testing
RelatedIdentifier
ISSN-L AlternateIdentifier http://issn.org/resource/ISSN-L/ 1188-1534 http://issn.org/resource/ISSN-L/1188-1534 unstable
RelatedIdentifier
LSID AlternateIdentifier urn:lsid:ubio.org:namebank:11815 urn:lsid:ubio.org:namebank:11815 testing

LSIDs are implemented as URNs, following the pattern urn:lsid:authority:namespace:identifier:revision

URNs are URIs - no need for a URI prefix.

RelatedIdentifier
PMID AlternateIdentifier http://www.ncbi.nlm.nih.gov/pubmed/ 12082125 http://www.ncbi.nlm.nih.gov/pubmed/12082125 testing
RelatedIdentifier
PURL AlternateIdentifier http://purl.org/dc/terms/ http://purl.org/dc/terms/ testing PURLs are HTTP URIs - no need for a URI prefix.
RelatedIdentifier
UPC AlternateIdentifier urn:upc: 123456789999 urn:upc:123456789999 unstable
RelatedIdentifier
URL AlternateIdentifier http://www.heatflow.und.edu/index2.html http://www.heatflow.und.edu/index2.html testing URLs are URIs - no need for a URI prefix.
RelatedIdentifier
URN AlternateIdentifier urn:nbn:de:101:1-201102033592 urn:nbn:de:101:1-201102033592 testing URNs are URIs - no need for a URI prefix.
RelatedIdentifier
w3id AlternateIdentifier https://w3id.org/games/spec/coil#Coil_Bomb_Die_Of_Age https://w3id.org/games/spec/coil#Coil_Bomb_Die_Of_Age testing

w3id identifiers are HTTP URIs - no need for a URI prefix.

Added in [[?DataCite-20190320]].

RelatedIdentifier
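
The prefix concatenation summarised in the table above can be sketched as follows. This Python fragment is an illustration only, not the XSLT implementation: it includes a subset of the prefixes from the table, the names `URI_PREFIXES` and `identifier_to_uri` are hypothetical, and returning `None` for a scheme without a known prefix is an assumption.

```python
# Subset of the URI prefixes listed in the table above.
URI_PREFIXES = {
    "ORCID": "https://orcid.org/",
    "ISNI": "https://www.isni.org/",
    "ROR": "https://ror.org/",
    "DOI": "https://doi.org/",
    "ARK": "http://n2t.net/",
    "arXiv": "http://arxiv.org/abs/",
    "Handle": "http://hdl.handle.net/",
    "ISBN": "urn:isbn:",
    "PMID": "http://www.ncbi.nlm.nih.gov/pubmed/",
}

# Schemes whose identifiers are already URIs and need no prefix.
ALREADY_URIS = {"LSID", "PURL", "URL", "URN", "w3id"}


def identifier_to_uri(identifier, scheme):
    """Map a DataCite identifier to a URI, or None if the scheme is unknown."""
    value = identifier.strip()
    # No prefix is used if the identifier is already a URI.
    if scheme in ALREADY_URIS or value.lower().startswith(
        ("http://", "https://", "urn:")
    ):
        return value
    # For arXiv, the URI prefix replaces the "arXiv:" namespace prefix.
    if scheme == "arXiv" and value.startswith("arXiv:"):
        value = value[len("arXiv:"):]
    prefix = URI_PREFIXES.get(scheme)
    if prefix is None:
        return None  # assumption: unknown schemes are left unmapped
    return prefix + value
```

With the examples from the table, `identifier_to_uri("0000-0002-7285-027X", "ORCID")` gives `https://orcid.org/0000-0002-7285-027X`, and `identifier_to_uri("arXiv:0706.0001", "arXiv")` gives `http://arxiv.org/abs/0706.0001`.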

RDF vocabulary

The following sections summarise the classes and properties defined in the CiteDCAT-AP RDF vocabulary.

The vocabulary is maintained in the CiteDCAT-AP GitHub repository, and available in RDF/XML, Turtle, and JSON-LD.

The CiteDCAT-AP namespace URI is https://w3id.org/citedcat-ap/

The preferred namespace prefix is citedcat
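
As a small illustration of how the namespace URI and the preferred prefix fit together, the following Python sketch expands a prefixed name such as citedcat:isFundedBy into its full URI and prints the matching Turtle prefix declaration. The helper names `expand` and `turtle_prefix` are hypothetical and not part of the vocabulary.

```python
CITEDCAT_NS = "https://w3id.org/citedcat-ap/"
PREFIXES = {"citedcat": CITEDCAT_NS}


def expand(curie):
    """Expand a prefixed name (prefix:localName) into a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return PREFIXES[prefix] + local


def turtle_prefix(prefix):
    """Return the Turtle declaration for a known prefix."""
    return f"@prefix {prefix}: <{PREFIXES[prefix]}> ."


print(expand("citedcat:isFundedBy"))
print(turtle_prefix("citedcat"))
```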

Classes

Object properties

RDF source



XSLT implementation

This section shows the XSLT implementation of the mappings defined in CiteDCAT-AP.

The source XSLT is maintained in the dedicated GitHub repository.