
Documentation of RDF modelling adopted

This page briefly describes the models used to convert statistical classifications into RDF data.

Methodology

All classifications included in Caliper have been converted into RDF resources. The conversion relies on two standards:

  1. SKOS
  2. XKOS

Other public vocabularies/standards adopted are:

  1. OWL
  2. Dublin Core
  3. FOAF
  4. DCAT
  5. VOID

Classifications (hierarchy & content)

Most classifications in Caliper are hierarchical and are modelled using the SKOS structure of broader/narrower relationships. The figure below offers a simplified view of hierarchies in RDF/SKOS.  

Figure: Schematic view of a classification structure in SKOS.

The picture above shows a fragment of a hierarchy consisting of only three concepts (classification entries), each shown with its title (skos:prefLabel) and code (skos:notation).

In more detail:

A single (version of a) classification:

Each (version of a) classification is rendered as one or more skos:ConceptScheme.
  • Its URI is formed by concatenating our domain name + the classification acronym + the classification's version (e.g. "domain name" / "cpc" / "cpc_v2.0")

Classification items:

A classification item is a skos:Concept.

In general, the local part of the URI is formed by the code of the item.

  • Note: if the code contains "dots" (".") they are replaced with "hyphens" ("-").
  • Each skos:Concept is given the following properties:
    • titles in English or other languages when available, expressed using the property skos:prefLabel with the appropriate language tag
    • an explanatory note (skos:scopeNote)
    • a code, expressed as the value of the property skos:notation
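The URI rules above can be sketched in a few lines of Python. This is only an illustration: "http://example.org" stands in for the real domain name, and the assumption that an item URI is the scheme URI followed by "/" and the local part is ours, not something stated on this page.

```python
def scheme_uri(domain: str, acronym: str, version: str) -> str:
    """Concatenate domain name, classification acronym and version."""
    return "/".join([domain.rstrip("/"), acronym, version])


def item_local_part(code: str) -> str:
    """Local part of an item URI: the item code, with dots replaced by hyphens."""
    return code.replace(".", "-")


def item_uri(scheme: str, code: str) -> str:
    # Assumption: the item URI is the scheme URI plus "/" plus the local part.
    return f"{scheme}/{item_local_part(code)}"


# "http://example.org" is a placeholder for the real domain name.
cpc = scheme_uri("http://example.org", "cpc", "cpc_v2.0")
print(cpc)
print(item_uri(cpc, "01.1"))
```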

Hierarchies:

The hierarchical structure of the classification is rendered by means of the standard SKOS properties skos:narrower and skos:broader.
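As a rough sketch of how such a hierarchy looks once serialized, the Python snippet below prints Turtle for three concepts with their labels, codes and broader/narrower links. The namespace, codes and titles are invented for the example and do not come from any classification in Caliper.

```python
BASE = "http://example.org/demo/v1/"  # hypothetical namespace, not a Caliper URI

# code -> (English title, parent code or None); invented sample data
ITEMS = {
    "A": ("Top-level item", None),
    "A.1": ("Second-level item", "A"),
    "A.1.1": ("Third-level item", "A.1"),
}


def uri(code: str) -> str:
    """Item URI: dots in the code become hyphens in the local part."""
    return f"<{BASE}{code.replace('.', '-')}>"


def to_turtle(items: dict) -> str:
    """Serialize the items as Turtle (titles are assumed to contain no quotes)."""
    out = ["@prefix skos: <http://www.w3.org/2004/02/skos/core#> .", ""]
    for code, (title, parent) in items.items():
        props = [
            "a skos:Concept",
            f'skos:prefLabel "{title}"@en',
            f'skos:notation "{code}"',
        ]
        if parent is not None:
            props.append(f"skos:broader {uri(parent)}")
        # skos:narrower is stated explicitly as the inverse of skos:broader
        for child, (_, child_parent) in items.items():
            if child_parent == code:
                props.append(f"skos:narrower {uri(child)}")
        out.append(f"{uri(code)} " + " ;\n    ".join(props) + " .")
    return "\n".join(out)


print(to_turtle(ITEMS))
```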

 

Correspondences

Correspondences are rendered using both SKOS and XKOS properties.

Using the SKOS properties skos:exactMatch (for 1-1 correspondences) and skos:closeMatch (in all other cases).

Note that those SKOS properties hold only at the level of items.
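The choice between the two SKOS properties can be illustrated as follows. This sketch assumes that "1-1" means the source item and the target item each appear in exactly one pair of the correspondence table; the item names are invented.

```python
from collections import Counter


def match_properties(pairs):
    """Map each (source, target) pair to skos:exactMatch or skos:closeMatch."""
    sources = Counter(source for source, _ in pairs)
    targets = Counter(target for _, target in pairs)
    result = {}
    for source, target in pairs:
        # Assumption: a pair is 1-1 when both items occur in no other pair.
        one_to_one = sources[source] == 1 and targets[target] == 1
        result[(source, target)] = "skos:exactMatch" if one_to_one else "skos:closeMatch"
    return result


# Invented sample: a1 maps to b1 only, while a2 is split between b2 and b3.
pairs = [("a1", "b1"), ("a2", "b2"), ("a2", "b3")]
for (source, target), prop in match_properties(pairs).items():
    print(source, prop, target)
```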

Using the following constructs of XKOS:
- Class xkos:Correspondence. An instance of this class represents a pair of classifications for which correspondences exist.
- Property xkos:compares. It links an instance of xkos:Correspondence to the two classifications (skos:ConceptScheme) involved in the mapping.
- Class xkos:ConceptAssociation. An instance of this class represents the actual items involved in a correspondence.
- Property xkos:madeOf. It links a correspondence to all the concept associations established within it.
- Properties xkos:source and xkos:target. They specify, for each concept association, its source and target items.

The picture below provides a simplified view of the XKOS modelling just described in words.

Figure: Sketch of XKOS modelling of correspondences.

The picture above represents the mapping established between two classifications, "Classification A" and "Classification B", consisting of only one actual correspondence, "CorrA-B_1". The rightmost side of the figure shows the two items involved in the mapping ("Item_a" and "Item_b", belonging to "Classification A" and "Classification B" respectively).
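The same example can be written out as a list of triples. The sketch below uses the class and property names listed on this page; the "ex:" prefix and the identifier "ex:CorrA-B" for the xkos:Correspondence instance are hypothetical.

```python
def correspondence_triples(corr, scheme_a, scheme_b, associations):
    """Triples for one correspondence.

    associations: list of (association_id, source_item, target_item).
    """
    triples = [
        (corr, "rdf:type", "xkos:Correspondence"),
        (corr, "xkos:compares", scheme_a),
        (corr, "xkos:compares", scheme_b),
    ]
    for assoc, source, target in associations:
        triples += [
            (assoc, "rdf:type", "xkos:ConceptAssociation"),
            (corr, "xkos:madeOf", assoc),
            (assoc, "xkos:source", source),
            (assoc, "xkos:target", target),
        ]
    return triples


# Identifiers follow the figure; "ex:CorrA-B" is an invented name.
triples = correspondence_triples(
    "ex:CorrA-B",
    "ex:ClassificationA",
    "ex:ClassificationB",
    [("ex:CorrA-B_1", "ex:Item_a", "ex:Item_b")],
)
for s, p, o in triples:
    print(s, p, o, ".")
```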

Metadata

By metadata, we mean all pieces of information concerning the object at hand, such as title, author and abstract (aka descriptive metadata), versioning information (aka structural metadata), and date of creation or licence (aka administrative metadata). In the case of the classifications in Caliper, we also need to distinguish:

  1. the intellectual content of the classification, i.e., the actual classification (e.g., the hierarchy of products in CPC, together with their code and labels)
  2. the way that content is modelled in RDF (e.g., how the hierarchy is rendered in RDF, which RDF constructs are used for that)
  3. the specific representation (aka serialization) of the RDF (e.g., whether the classification is made available using the csv or json format)

Note that those three parts have different characteristics: for example, the author of a classification (its intellectual content) may be different from the author of the RDF rendering (the model). Therefore, different metadata is appropriate for each of them. The picture below provides a sketch of the way these different pieces of information are rendered in Caliper.

Figure: Caliper metadata model.

As highlighted in the picture, different metadata schemes are used for different purposes:

  • DCAT is used to describe generic datasets
  • VOiD is used to describe RDF datasets

 

Modelling of individual classifications

Central Product Classification v2.0 (CPC2.0)

Follows the general modelling described above. One concept scheme only (http://stats-class.fao.uniroma2.it/CPC/v2.1/core).

Central Product Classification v2.1 (CPC2.1)

Same structure as CPC v2.0, but organized into three concept schemes:

Scheme for CPC v2.1 "core"

This is the concept scheme for all items in CPC2.1 "core", corresponding to the "main" CPC 2.1 as published by UNSD.

Scheme for CPC v2.1 Expansion for agriculture statistics

This is the concept scheme for all items of interest to agriculture, as per the structure developed by FAO and published by UNSD in the CPC v2.1 official document (cf. p. 586 and following). This scheme includes all "expanded codes" (i.e., the 7-digit codes proposed by FAO), their parents up to the root of the classification (so as to enable proper aggregation of data), and other relevant items.

The figure below provides a schematic representation of the content of the two schemes, the core and the one for the agriculture expansion. 

Figure: A sketch of the content of the schemes for CPC v2.1 "core" (blue circle) and for the expansion for agricultural and rural statistics (red circle).

The red nodes of the tree represent the new items (7-digit codes), while the red circle shows what the scheme for the CPC v2.1 expansion includes: all the new codes, plus their parents up to the first level, and any other item deemed relevant to agriculture.
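One way to picture how the content of such a scheme is assembled is as an ancestor closure: start from the expanded codes, walk up the skos:broader links to the root, then add any further items selected by hand. The sketch below illustrates this under that reading; the codes are invented and the function is not the procedure actually used for Caliper.

```python
def scheme_members(selected, broader, extra=()):
    """Selected items, all their ancestors up to the root, plus extra items.

    broader maps each code to its parent code (None for top-level items).
    """
    members = set()
    for code in selected:
        current = code
        # Stop early when an ancestor was already reached from another item:
        # its own chain up to the root is then already in the set.
        while current is not None and current not in members:
            members.add(current)
            current = broader.get(current)
    return members | set(extra)


# Invented sample hierarchy; "0111001" and "0111002" play the 7-digit codes.
broader = {
    "0": None, "01": "0", "011": "01", "012": "01",
    "0111": "011", "01110": "0111",
    "0111001": "01110", "0111002": "01110",
}
print(sorted(scheme_members(["0111001", "0111002"], broader)))
```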

Scheme for CPC v2.1 fertilizers

This concept scheme includes the items specifically relevant to statistics on fertilizers according to FAO. Those include: a) items that are part of the CPC2.1 "core" (5-digit codes), b) items with 7-digit codes, and c) their parents up to the root of the classification (so as to enable proper aggregation of data).

Subsets

Three subsets of items are also defined, and modelled as skos:Collection.

Collection "expansion_only": it includes only the 7-digit items as proposed by FAO for the "Expansion for agriculture and rural statistics"

Collection "fisheries": only items relevant to fisheries and aquaculture (as used by the FAO Fisheries and Aquaculture Department)

Collection "fert_expansion_only": it only includes the 7-digit items that are fertilizers and that are not included in the official "Expansion". This collection was introduced for internal use, to have an RDF representation of a possible future proposal to UNSD regarding the field of fertilizers.

Creditor Reporting System (CRS), Sector and Purposes (2015-06)

Follows exactly the general modelling described above. One concept scheme only.

Creditor Reporting System (CRS), Sector and Purposes (2018-01)

Follows exactly the general modelling described above. One concept scheme only.

It includes labels in Spanish, resulting from translation work done internally at FAO.

Creditor Reporting System (CRS), Sector and Purposes (2020-01)

Follows exactly the general modelling described above. One concept scheme only.

It includes labels in Spanish and French, resulting from translation work done internally at FAO.

FAOSTAT Commodity List

Follows exactly the general modelling described above. One concept scheme only.

The original structure, a flat list of items organized into groups, is rendered as a two-level hierarchy (skos:broader/skos:narrower).
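A minimal sketch of this rendering, assuming the source data is available as a mapping from each group to its items (the group and item identifiers are invented):

```python
def two_level_hierarchy(groups):
    """groups: {group_id: [item_id, ...]} -> list of (subject, property, object)."""
    triples = []
    for group, items in groups.items():
        for item in items:
            # Each item points up to its group, and the group down to the item.
            triples.append((item, "skos:broader", group))
            triples.append((group, "skos:narrower", item))
    return triples


# Invented sample: two groups of a flat commodity list.
groups = {"ex:group_1": ["ex:item_11", "ex:item_12"], "ex:group_2": ["ex:item_21"]}
for s, p, o in two_level_hierarchy(groups):
    print(s, p, o, ".")
```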

FoodEx2

The RDF model adopted in this conversion was originally devised within a project run by the UK Food Safety Agency. In this model:

  1. Each main hierarchy of the classification is represented as a separate hierarchy, with specific narrower/broader relations (defined as subproperties of the standard skos:narrower and skos:broader), belonging to a separate concept scheme
  2. Each set of facets is also a separate hierarchy belonging to a separate concept scheme
  3. Items (skos:Concept) may be part of more than one concept scheme
  4. Items have a definition in English (skos:definition) and a scope note (skos:scopeNote)

Forest Products Classification and Definitions (2016)

Follows exactly the general modelling described above. One concept scheme only.

Harmonized System

This modelling differs from those adopted for all the other classifications, in that HS items from all versions (2007, 2012, 2017) belong to the same concept scheme, plus one or more of the three skos:Collection (one per HS version) in which they are valid.
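The idea of one shared scheme plus one collection per version can be sketched as follows. The use of skos:inScheme and skos:member to express membership is an assumption on our side, and the "ex:" identifiers and sample items are invented.

```python
def hs_triples(validity, scheme="ex:HS"):
    """validity: {item_id: [version, ...]} -> list of (subject, property, object)."""
    triples = []
    for item, versions in validity.items():
        # Every item belongs to the single, shared concept scheme ...
        triples.append((item, "skos:inScheme", scheme))
        # ... and to the collection of each HS version in which it is valid.
        for version in versions:
            triples.append((f"ex:HS{version}", "skos:member", item))
    return triples


# Invented sample: one item valid in all versions, one introduced in 2017.
validity = {"ex:item_x": [2007, 2012, 2017], "ex:item_y": [2017]}
for s, p, o in hs_triples(validity):
    print(s, p, o, ".")
```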

Indicative Crop Classification (ICC) v1.0

Follows exactly the general modelling described above. One concept scheme only.

Indicative Crop Classification (ICC) v1.1

Follows exactly the general modelling described above. One concept scheme only.

ISICRev4

Follows exactly the general modelling described above. One concept scheme only.

M49 2019

Currently, M49 is modelled using a few different concept schemes. Concepts may belong to one or more schemes, depending on the schemes in which they are valid:

  1. M49UNSD. Contains
  2. M49FAO
  3. M49FAO-2019-07
  4. M49FAO-2019-12
  5. M49-EX

A concept scheme may include both countries and regions.

Each geographical area is a skos:Concept, whose URI's local part is its M49 code.

  1. Each country is also typed according to the "geopolitical ontology" (rdf:type geopol:self_gov_territory)
  2. Countries have:
    1. (official) names in various languages (all skos:prefLabel)
    2. short names in various languages (properties from the ontology)
    3. dates of validity (properties from the ontology)
    4. different properties for codes in different coding systems (properties from the ontology)
  3. Membership of territories in larger areas is represented by means of skos:broader and skos:narrower.

Note:

  • territories no longer existing are also included
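As an illustration of the points above, the sketch below builds the triples for one country. The "m49:" prefix is hypothetical, and the properties taken from the geopolitical ontology (short names, dates of validity, other codes) are left out because their names are not given here.

```python
def m49_triples(code, names, parent_code=None, is_country=False):
    """Triples for one geographical area identified by its M49 code.

    names: {language_tag: official_name}
    """
    subject = f"m49:{code}"  # hypothetical prefix; the local part is the M49 code
    triples = [(subject, "rdf:type", "skos:Concept")]
    if is_country:
        triples.append((subject, "rdf:type", "geopol:self_gov_territory"))
    for lang, name in names.items():
        triples.append((subject, "skos:prefLabel", f'"{name}"@{lang}'))
    if parent_code is not None:
        # Membership in a larger area, stated in both directions.
        triples.append((subject, "skos:broader", f"m49:{parent_code}"))
        triples.append((f"m49:{parent_code}", "skos:narrower", subject))
    return triples


# Italy (M49 code 380) as part of Southern Europe (M49 code 039).
for s, p, o in m49_triples("380", {"en": "Italy", "fr": "Italie"}, "039", True):
    print(s, p, o, ".")
```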

World Census of Agriculture Crop List v1.0 (WCA)

Entries in the list are given two types (rdf:type): skos:Concept and dwc:Taxon. Each item has:

  1. a URI, whose local part is the English name of the item
  2. a scientific name and a common name, each rendered with properties taken from both SKOS and Darwin Core (skos:prefLabel and dwc:scientificName for the scientific name; skos:altLabel and dwc:vernacularName for the common name)
  3. a code (skos:notation)
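The double typing and the paired SKOS and Darwin Core properties can be sketched as below. The base namespace and the sample code value are invented, and percent-encoding the English name to obtain the local part is an assumption, since the page does not say how spaces or special characters are handled.

```python
from urllib.parse import quote


def wca_entry(base, english_name, scientific_name, common_name, code):
    """Triples for one crop list entry, typed both as SKOS concept and DwC taxon."""
    # Assumption: the English name is percent-encoded to form the local part.
    subject = f"<{base}{quote(english_name)}>"
    return [
        (subject, "rdf:type", "skos:Concept"),
        (subject, "rdf:type", "dwc:Taxon"),
        (subject, "skos:prefLabel", scientific_name),
        (subject, "dwc:scientificName", scientific_name),
        (subject, "skos:altLabel", common_name),
        (subject, "dwc:vernacularName", common_name),
        (subject, "skos:notation", code),
    ]


# "http://example.org/wca/" and the code "X.1" are placeholders.
for s, p, o in wca_entry("http://example.org/wca/", "Wheat", "Triticum aestivum", "Wheat", "X.1"):
    print(s, p, o, ".")
```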

World Reference Base (WRB) 1998

The 1998 version of WRB distinguishes 30 soil groups (each is a skos:Concept and a top concept of the scheme). Each soil group has:

  1. a URI, whose local part is a sequential number (representing the "order" of selection of the soil group)
  2. an English name (skos:prefLabel)
  3. a definition in English (skos:definition)
  4. a two-character acronym (skos:altLabel)
  5. an order in the sequence of assignment (wrb:hasOrder)
  6. a given number of qualifiers (each of them a skos:Concept), represented as children in the hierarchy (skos:narrower)

Each qualifier has:

  1. a URI, whose local part is a two-character acronym
  2. an English name (skos:prefLabel)
  3. a definition in English (skos:definition)
  4. a property with a numeric value (wrb:hasOrder) representing the order in the sequence of assignment to the soil group

Note:

  • Not all qualifiers are actually used to qualify a soil group.
  • A qualifier may qualify more than one soil group.
  • Planned improvement: distinguish the order of assignment of qualifiers for each respective soil group.

World Reference Base (WRB) 2014 (Update 2015)

Soil groups and qualifiers are both skos:Concept, and their English name is taken to form the local part of their URIs. Each skos:Concept is provided with an English name (skos:prefLabel), a definition (skos:definition), and an acronym (skos:notation). Soil groups and qualifiers are organized into two separate schemes, and connected by dedicated properties defined in the WRB ontology. Soil groups are the domain of the properties wrb:hasPrincipalQualifier and wrb:hasSupplementaryQualifier, the range of which are qualifiers (types are also defined in the WRB ontology). The corresponding inverse properties link qualifiers back to soil groups. Where appropriate, pairs of qualifiers are linked with a hierarchical relation of broader/narrower (e.g., Abruptic, Geoabruptic).