The same basic modelling is used for all classifications. In essence, this relies on SKOS to model items, their properties and the hierarchical structure of the classification, and on XKOS to model pieces of information that are specific to statistical classifications.
The SKOS backbone
Here is a summary table of the SKOS elements used, together with the information they model:
|Classification's elements|SKOS elements|
|---|---|
|Hierarchy: item A is more general than item B|URI_B skos:broader URI_A|
|Hierarchy: B is more specific than A|URI_A skos:narrower URI_B|
|1-1 correspondence between A, B|URI_A skos:exactMatch URI_B|
|1-n, n-1, n-n correspondences|URI_A skos:closeMatch URI_B|
In more detail:
A single classification (or a version of one):
- rendered as one or more skos:ConceptScheme. More details are given for each specific classification.
- its URI is formed by concatenating our domain name + the classification acronym + the classification's version (e.g. "domain name" / "cpc" / "cpc_v2.0")
Each item of the classification:
- rendered as a skos:Concept
- its URI is formed by concatenating our domain name + the item's code.
- Note: if the code contains dots (".") they are replaced with hyphens ("-").
- Each skos:Concept is given the following properties:
  - titles in English or other languages when available, expressed using the property skos:prefLabel, with the appropriate language tag
  - an explanatory note (skos:scopeNote)
  - codes, given as values of the property skos:notation
The hierarchy of the classification:
- rendered by means of skos:narrower and skos:broader.
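The URI and property conventions listed above can be sketched in a few lines of Python. This is only an illustration, not the actual Caliper tooling: the base URI `http://example.org`, the function names and the sample codes and labels are all hypothetical, and literals are kept as plain Python values rather than real RDF literals. The hierarchy links follow the SKOS convention that the subject of skos:broader is the more specific item.

```python
# Hypothetical base URI: the text only says "our domain name".
BASE = "http://example.org"
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def scheme_uri(acronym, version):
    """Domain name + classification acronym + classification version."""
    return f"{BASE}/{acronym}/{version}"


def item_uri(code):
    """Domain name + item code, with dots replaced by hyphens."""
    return f"{BASE}/{code.replace('.', '-')}"


def concept_triples(code, labels, parent_code=None, scope_note=None):
    """Return (subject, predicate, object) tuples describing one item.

    labels maps a language tag to a title, e.g. {"en": "Some title"}.
    Labels are stored as (title, language) pairs for simplicity.
    """
    subject = item_uri(code)
    triples = [
        (subject, RDF_TYPE, SKOS + "Concept"),
        (subject, SKOS + "notation", code),
    ]
    for lang, title in labels.items():
        triples.append((subject, SKOS + "prefLabel", (title, lang)))
    if scope_note is not None:
        triples.append((subject, SKOS + "scopeNote", scope_note))
    if parent_code is not None:
        parent = item_uri(parent_code)
        # The child points to its parent with skos:broader,
        # and the parent points back with skos:narrower.
        triples.append((subject, SKOS + "broader", parent))
        triples.append((parent, SKOS + "narrower", subject))
    return triples


if __name__ == "__main__":
    # Illustrative codes and labels only, not taken from a real classification.
    for triple in concept_triples("01.1", {"en": "Example item"}, parent_code="01"):
        print(triple)
```

Keeping the dot-to-hyphen replacement inside a single helper means the original code is still available unchanged as the value of skos:notation.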
The graphic below exemplifies this (for simplicity, the picture only shows skos:broader, and for each concept, it only shows one skos:prefLabel and its skos:notation):
Correspondences are rendered in two ways:
- with SKOS, using the properties skos:exactMatch (for 1-1 mappings) and skos:closeMatch (for all other mappings).
- with XKOS, using the specific constructs dedicated to rendering correspondences between statistical classifications.
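As a rough sketch of the SKOS rendering, the choice between the two matching properties can be derived from how often each item occurs in a correspondence table. The function name and the sample identifiers below are hypothetical, and the reading of "1-1" as "the source item and the target item each appear in exactly one pair" is an assumption of this example.

```python
from collections import Counter


def match_triples(pairs):
    """Turn (source_item, target_item) pairs into SKOS matching triples.

    A pair is treated as 1-1 when its source and its target each occur
    in exactly one pair; every other pair is part of a 1-n, n-1 or n-n
    mapping and gets skos:closeMatch.
    """
    pairs = list(dict.fromkeys(pairs))  # drop duplicates, keep order
    source_count = Counter(source for source, _ in pairs)
    target_count = Counter(target for _, target in pairs)
    triples = []
    for source, target in pairs:
        one_to_one = source_count[source] == 1 and target_count[target] == 1
        predicate = "skos:exactMatch" if one_to_one else "skos:closeMatch"
        triples.append((source, predicate, target))
    return triples


if __name__ == "__main__":
    # Illustrative identifiers: a:2 maps to two items, so it is a 1-n mapping.
    sample = [("a:1", "b:1"), ("a:2", "b:2"), ("a:2", "b:3")]
    for triple in match_triples(sample):
        print(triple)
```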
Note that while SKOS matching properties are stated only at the level of the corresponding items, XKOS makes explicit statements about the classifications involved in the mapping. Therefore, if only SKOS is used, the information that correspondences exist between classifications A and B must be inferred from the data (i.e., at least one triple exists with skos:exactMatch or skos:closeMatch as predicate, linking an item of A to an item of B). By contrast, if XKOS is used, that same piece of information is made explicit by an entity of rdf:type xkos:Correspondence. In short, XKOS is better suited to programmatic consumption by computer applications.
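The difference can be made concrete with a small sketch over triples held as plain Python tuples. Everything here is illustrative: the function names are hypothetical, and the `scheme_of` lookup (which classification an item belongs to) is assumed to be available from elsewhere in the data.

```python
MATCH_PREDICATES = {"skos:exactMatch", "skos:closeMatch"}


def inferred_pairs(triples, scheme_of):
    """SKOS only: scan every matching triple and look up the
    classification of both items to find which pairs are mapped."""
    found = set()
    for subject, predicate, obj in triples:
        if predicate in MATCH_PREDICATES:
            found.add(frozenset((scheme_of[subject], scheme_of[obj])))
    return found


def declared_pairs(triples):
    """XKOS: read the pairs directly from xkos:compares statements."""
    compared = {}
    for subject, predicate, obj in triples:
        if predicate == "xkos:compares":
            compared.setdefault(subject, set()).add(obj)
    return {frozenset(schemes) for schemes in compared.values()}


if __name__ == "__main__":
    skos_data = [("ex:Item_a", "skos:exactMatch", "ex:Item_b")]
    membership = {"ex:Item_a": "ex:ClassificationA",
                  "ex:Item_b": "ex:ClassificationB"}
    xkos_data = [
        ("ex:CorrA-B", "rdf:type", "xkos:Correspondence"),
        ("ex:CorrA-B", "xkos:compares", "ex:ClassificationA"),
        ("ex:CorrA-B", "xkos:compares", "ex:ClassificationB"),
    ]
    # Both routes identify the same pair of classifications.
    print(inferred_pairs(skos_data, membership) == declared_pairs(xkos_data))
```

The SKOS route needs the item-to-classification lookup and a pass over all matching triples, whereas the XKOS route reads the answer from a handful of statements about the correspondence itself.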
The contrast between item-level SKOS statements and classification-level XKOS statements is the reason for adopting both formalizations. Let's take a closer look at the constructs used, as defined in XKOS v1.0:
- Instances of the class xkos:Correspondence - representing a pair of classifications for which correspondences exist.
- Property xkos:compares - to link an instance of xkos:Correspondence to the two classifications (skos:ConceptScheme) involved in the mapping.
- Instances of the class xkos:ConceptAssociation - to represent the actual items involved in the correspondence.
- Property xkos:madeOf - to link an instance of xkos:Correspondence to each of the concept associations (xkos:ConceptAssociation) it consists of.
- Properties xkos:sourceConcept and xkos:targetConcept - to specify, for each concept association, its source and target items.
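Put together, these constructs could be generated along the following lines. This is a minimal sketch with hypothetical names: the `ex:` identifiers are placeholders, and numbering the associations by appending `_1`, `_2`, ... to the correspondence identifier is an assumption of this example. The source and target links use xkos:sourceConcept and xkos:targetConcept, the names under which the XKOS vocabulary publishes these properties.

```python
def xkos_correspondence(corr, scheme_a, scheme_b, pairs):
    """Build the triples for one xkos:Correspondence between two
    classifications, with one xkos:ConceptAssociation per item pair."""
    triples = [
        (corr, "rdf:type", "xkos:Correspondence"),
        (corr, "xkos:compares", scheme_a),
        (corr, "xkos:compares", scheme_b),
    ]
    for index, (source, target) in enumerate(pairs, start=1):
        association = f"{corr}_{index}"  # hypothetical naming scheme
        triples += [
            (corr, "xkos:madeOf", association),
            (association, "rdf:type", "xkos:ConceptAssociation"),
            (association, "xkos:sourceConcept", source),
            (association, "xkos:targetConcept", target),
        ]
    return triples


if __name__ == "__main__":
    for triple in xkos_correspondence(
        "ex:CorrA-B",
        "ex:ClassificationA",
        "ex:ClassificationB",
        [("ex:Item_a", "ex:Item_b")],
    ):
        print(triple)
```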
The picture below provides a simplified view of the XKOS modelling just described in words.
The picture above represents the mapping established between two classifications, "Classification A" and "Classification B", consisting of only one actual correspondence, "CorrA-B_1". The rightmost side of the graphic shows the two items involved in the mapping ("Item_a" and "Item_b", belonging to "Classification A" and "Classification B" respectively).
By metadata, we mean all pieces of information concerning the object at hand, such as title, author and abstract (aka descriptive metadata), versioning information (aka structural metadata), and date of creation or licence (aka administrative metadata). In the case of the classifications in Caliper, we also need to distinguish:
- the intellectual content of the classification, i.e., the actual classification (e.g., the hierarchy of products in CPC, together with their code and labels)
- the way that content is modelled in RDF (e.g., how the hierarchy is rendered in RDF, which RDF constructs are used for that)
- the specific representation (aka serialization) of the RDF (e.g., whether the classification is made available using the csv or json format)
Note that these three parts have different characteristics: for example, the author of a classification (its intellectual content) may be different from the author of its RDF rendering (the model). Therefore, different metadata is appropriate for each of them. The picture below provides a sketch of the way these different pieces of information are rendered in Caliper.
As highlighted in the picture, different metadata schemes are used for different purposes:
- DCAT is used to describe generic datasets
- VOiD is used to describe RDF datasets