Frequently Asked Questions
- What is Caliper?
- Why Caliper?
- Is Caliper an official FAO website?
- I am a statistician, what can Caliper do for me?
- I am a developer. What can Caliper do for me?
- Who works on Caliper?
- Who are the users of Caliper?
- Is Caliper the author of the classifications exposed?
- Can I use the classifications published on Caliper?
- I would like classification XYZ to be included in Caliper
- I am interested in correspondences between classifications
- BROWSING section. What is the meaning of tab "Groups"?
- Caliper hosts statistical classifications in RDF. What modelling is adopted?
- Wasn't SDMX enough?
- Does Caliper use XKOS?
- I want to know more - is there any documentation, papers or presentations available?
- Terminology - SKOS, SPARQL
- What does it mean - Vocabulary
- What is the license of Caliper?
- Does FAO have a policy on open data / data sharing?
- BROWSING - Why do you have three browsing tools?
- I have downloaded a csv from Caliper, but codes are missing leading zeroes!
- In what format should source data be, in order to be included in Caliper?
- Geographical grouping also depends on years (ex. EU members). What do you do about this?
- How can I access Caliper? I was asked for a login.
- Are you planning to include different versions of HS?
- Are there or will there be commitment to persistent URIs?
- What are the skills required to maintain Caliper?
- Maintenance of statistical classifications in Caliper. How far do the available tools allow editors to ignore the underlying RDF technology?
- Do you implement Linked Data Platform Collections (LDPC) W3C recommendation (read/write) for Registers API?
- Is VocBench already fully integrated in term of content maintenance and publication to Caliper?
- Caliper includes WRB 1998, are there plans to have later versions, and include correspondences?
Caliper is a web platform to test new ways to work with statistical classifications relevant to agriculture. We are interested in testing the use of open and standard formats along the entire life cycle of statistical classifications. Caliper hosts:
- a browsing/searching interface (SKOSMOS, an open source project developed by the National Library of Finland)
- a web-based editing tool (VocBench, an open source project developed by the University of Tor Vergata, Italy)
- a download area
- a SPARQL query area
If you are interested in other functionalities/tools, please do not hesitate to contact us.
Despite their importance, statistical classifications have not received much attention in efforts to modernize official statistics. Caliper tries to address that gap. We aim to make statistical classifications available in formats that are fully machine readable, and easily accessible by humans for consultation and reuse. This work contributes to making statistical data more interoperable.
NO. This is experimental work. All data is in draft state and not to be used as reference.
Caliper can help you in the following tasks:
- Look up classification items. You can do this in three different ways:
- Look up correspondences between classifications.
- Download classifications or mappings in your format of choice, from section Download.
The whole list of classifications in Caliper is in section Classifications. For each classification, you will find all associated services.
As a developer, you will be interested in ways to automatically access the resources in Caliper. Currently, the following are implemented:
- Access the SPARQL endpoint and test what you can retrieve. A number of sample queries are given for you to start testing.
- Access classifications through the content negotiation mechanism
For documentation concerning the data model, see page Documentation.
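The two access routes listed above can be sketched in Python with the standard library only. This is a minimal sketch, not a description of the actual Caliper services: the endpoint address and the concept URI are hypothetical placeholders, and the sample query is our own illustration. The code only builds the requests; sending them is left as a comment.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical addresses: replace with the real endpoint and a real concept URI.
SPARQL_ENDPOINT = "https://example.org/caliper/sparql"
CONCEPT_URI = "https://example.org/caliper/some-classification/0111"

# Illustrative query: ten concepts with their English preferred labels.
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?label WHERE {
  ?concept a skos:Concept ;
           skos:prefLabel ?label .
  FILTER (lang(?label) = "en")
}
LIMIT 10
"""


def sparql_request(endpoint: str, query: str) -> Request:
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def negotiated_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Build a request for a resource URI in a chosen serialization."""
    return Request(uri, headers={"Accept": media_type})


for req in (sparql_request(SPARQL_ENDPOINT, QUERY), negotiated_request(CONCEPT_URI)):
    print(req.get_method(), req.get_header("Accept"))
# To actually send a request: urllib.request.urlopen(req).read()
```

With content negotiation, the same URI can return different serializations depending on the `Accept` header; which media types a given server supports has to be checked against that server.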
The platform Caliper is part of a project carried out at the Statistics Division of the Food and Agriculture Organization of the UN and funded by the Bill and Melinda Gates Foundation. The University of Tor Vergata provides technical and scientific support.
Caliper is an experimental project, not yet part of the official services supported by FAO and therefore not yet used in FAO information systems. However, tests are ongoing to use it in conjunction with applications manipulating statistical data within FAO.
NO. The classifications published in Caliper are maintained and published by dedicated institutions (sometimes in collaborations with FAO but not necessarily). No classification or correspondence was developed within this project. Occasionally, we may have multilingual contents added to the original classifications for testing purposes (e.g., the Spanish translation for CRS Purpose Codes).
You're welcome to test our work. But please remember that the classifications presented on this web site do not replace in any way the versions distributed by their original maintainers.
If XYZ is an international classification relevant to agriculture, maybe we are already working on it or have a plan to include it - or maybe not :) Either way, please do not hesitate to contact us!
Correspondences are expressed in two ways:
- using properties from SKOS. These are: skos:exactMatch for 1-1 correspondences, skos:closeMatch for partial correspondences.
- using properties and entities defined in XKOS.
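As a rough illustration of the SKOS-based option, the sketch below stores a few mapping statements as plain Python tuples and looks up the correspondences of one item. The `ex:` identifiers are placeholders, not real Caliper URIs, and the function name is our own.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Illustrative triples (subject, predicate, object) with placeholder identifiers.
TRIPLES = [
    ("ex:schemeA/01", SKOS + "exactMatch", "ex:schemeB/A1"),
    ("ex:schemeA/02", SKOS + "closeMatch", "ex:schemeB/A2"),
    ("ex:schemeA/02", SKOS + "closeMatch", "ex:schemeB/A3"),
    ("ex:schemeA/02", SKOS + "prefLabel", "Some label"),
]

# The two SKOS mapping properties named above.
MAPPING_KINDS = {SKOS + "exactMatch": "exact", SKOS + "closeMatch": "close"}


def correspondences(triples, source):
    """Return (kind, target) pairs for one source item, skipping non-mapping triples."""
    return [
        (MAPPING_KINDS[p], o)
        for s, p, o in triples
        if s == source and p in MAPPING_KINDS
    ]


print(correspondences(TRIPLES, "ex:schemeA/01"))
print(correspondences(TRIPLES, "ex:schemeA/02"))
```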
The tab "Groups" displays concepts belonging to the SKOS structure "skos:Collection". We are using this feature of SKOS to experiment with the possibility of marking fragments or subsets of classifications that are specific to some needs. See for example the group of concepts relevant to the FAO Fisheries and Aquaculture Dept, in CPC v2.1:
NOTE: Currently, concepts in a skos:Collection are visualized in SKOSMOS as a flat list, although the hierarchical information remains available (see main panel, to the right).
Fundamentally, the statistical classifications available in Caliper are rendered as SKOS Concept Schemes. We use SKOS to express: hierarchies, classification entries, labels, explanatory notes, definitions, and correspondences. In summary:
- items are skos:Concept, endowed with labels in different languages, definitions, notation, documentation notes (change, editorial, history...)
- the hierarchical structure of a classification is expressed by means of the standard SKOS properties skos:narrower, skos:broader.
- mappings between items in different classifications are expressed using the SKOS standard properties for semantic relations (skos:closeMatch and skos:exactMatch)
- subsets of a classification are defined by using SKOS collections.
XKOS is used in addition to SKOS to express correspondences between classifications, and specific sets of items in them.
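To make the hierarchy part of this summary concrete, here is a small sketch in plain Python. It assumes each item records only its broader item, derives skos:narrower as the inverse relation, and walks skos:broader links up to the top. The codes and labels are invented for illustration.

```python
from collections import defaultdict

# notation -> (label, notation of the broader item, or None at the top level).
# Invented values, used only to show the shape of the data.
ITEMS = {
    "01": ("Top-level item", None),
    "011": ("Middle item", "01"),
    "0111": ("Leaf item", "011"),
    "0112": ("Another leaf item", "011"),
}


def narrower_index(items):
    """Derive skos:narrower as the inverse of skos:broader."""
    index = defaultdict(list)
    for code, (_label, broader) in items.items():
        if broader is not None:
            index[broader].append(code)
    return dict(index)


def ancestors(items, code):
    """Follow skos:broader links from an item up to the top of the hierarchy."""
    path = []
    broader = items[code][1]
    while broader is not None:
        path.append(broader)
        broader = items[broader][1]
    return path


print(narrower_index(ITEMS))      # {'01': ['011'], '011': ['0111', '0112']}
print(ancestors(ITEMS, "0111"))   # ['011', '01']
```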
All resources you can find in Caliper are endowed with metadata, which may refer to:
- a classification as a standard (say, its intellectual content), or to
- the specific format in which it is encoded (say, CSV or RDF)
To see the difference, compare: "UNSD is the author of CPC2.1" with "Caliper is the author of the RDF version of CPC2.1".
In order to be able to express these two types of statements, we use elements from DCAT - dcat:Dataset (for the CPC authored by UNSD), and dcat:Distribution (for the CPC rendered as RDF by Caliper). See the graphics below:
Other vocabularies used: Dublin Core and the Web Ontology Language (OWL) to express certain pieces of metadata (i.e., title, creator, publisher, description, date of creation, date of last update, version, history notes...).
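The dataset/distribution distinction described above can be sketched as a handful of statements. This is only an illustration: the `ex:` identifiers are placeholders, and the use of the Dublin Core property `dct:creator` for authorship is our assumption rather than a documented Caliper choice.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"

# Placeholder identifiers; dct:creator is an assumed choice of property.
METADATA = [
    ("ex:cpc21", RDF_TYPE, DCAT + "Dataset"),
    ("ex:cpc21", DCT + "creator", "UNSD"),
    ("ex:cpc21", DCAT + "distribution", "ex:cpc21-rdf"),
    ("ex:cpc21-rdf", RDF_TYPE, DCAT + "Distribution"),
    ("ex:cpc21-rdf", DCT + "creator", "Caliper"),
]


def describe(triples, resource):
    """Return (kind, creator) for a resource; kind is 'Dataset' or 'Distribution'."""
    kind = creator = None
    for s, p, o in triples:
        if s != resource:
            continue
        if p == RDF_TYPE and o.startswith(DCAT):
            kind = o[len(DCAT):]
        elif p == DCT + "creator":
            creator = o
    return kind, creator


for resource in ("ex:cpc21", "ex:cpc21-rdf"):
    print(resource, describe(METADATA, resource))
```

The classification as intellectual content and its RDF rendering are two resources, so each can carry its own author without contradiction.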
SDMX is an XML-based standard for disseminating statistical data. The equivalent of SDMX in RDF is the RDF Data Cube Vocabulary (W3C Recommendation, 2014), which implements the SDMX cube model as Linked Data.
XKOS is the RDF vocabulary specific for statistical classifications, endorsed by DDI. No XML-based equivalent of SDMX for statistical classification exists.
Not yet. The RDF datasets in Caliper use "plain" SKOS, because XKOS ("An SKOS extension for representing statistical classifications") was officially released by DDI in May 2019, well after the start of the Caliper project. We plan to implement XKOS very soon.
- on the modelling of SKOS schemes adopted in Caliper
- on VocBench, the editing software we use:
- Installation: http://vocbench.uniroma2.it/documentation/
- VocBench User Management. http://vocbench.uniroma2.it/doc/user/users.jsf
- Sheet2RDF. http://art.uniroma2.it/sheet2rdf/documentation/
- SKOS implementation: http://vocbench.uniroma2.it/doc/skos.jsf
- Loddy, for content negotiation: https://bitbucket.org/art-uniroma2/loddy/downloads/Loddy_DEMO_2016-03-01.zip
- GraphDB, the triple store behind VocBench: http://graphdb.ontotext.com/documentation/standard/
- On Fuseki, the SPARQL Endpoint available as back-end: https://jena.apache.org/documentation/fuseki2/
- on SKOSMOS, the browsing tool: https://github.com/NatLibFi/Skosmos/wiki
- on Drupal, the content management system powering this website and used to display correspondences between classifications. https://www.drupal.org/
- Caliper was presented at the International Conference on Agricultural Statistics (ICAS VIII), New Delhi (India). Paper: "Open Classifications for Open Statistical Data"; Caterina Caracciolo, Valeria Pesce, Mukesh Srivastava, Carola Fabi. Abstract (pdf). Presentation (link to GoogleDrive presentation).
- Presentation given in FAO, Statistics Division, December 2018.
SKOS stands for Simple Knowledge Organization System. It is "a W3C recommendation designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is part of the Semantic Web family of standards built upon RDF and RDFS, and its main objective is to enable easy publication and use of such vocabularies as linked data" (from Wikipedia).
SKOS defines classes and properties for representing:
- a "concept scheme" (a set of terms: a classification, a code list, a list of subject headings...)
- its terms / concepts (labels in different languages, definition, notation, editorial notes...),
- the relationships between concepts (generic or hierarchical)
- subsets of concepts (collections)
It is therefore suitable for representing classifications in a semantic, machine readable way.
SPARQL is the query language for RDF.
In everyday language, a vocabulary is a set of words, possibly used by a group, individual, or work, or in a field of knowledge (see the definitions given by the Merriam-Webster dictionary). Vocabularies are therefore fundamental in shaping people's universe of discourse, and have a special role in the field of information management, especially in the form of controlled vocabularies, i.e., selected lists of words used as "tags" or "classifiers" of information units (numeric or textual data). Because of their role in defining the entities to measure and in codifying data, statistical classifications can be considered special types of vocabularies.
Vocabularies also play a very important role in information management and in the semantic web. The World Wide Web Consortium (W3C) defines vocabularies in this broad sense: "On the Semantic Web, vocabularies define the concepts and relationships (also referred to as “terms”) used to describe and represent an area of concern. Vocabularies are used to classify the terms that can be used in a particular application, characterize possible relationships, and define possible constraints on using those terms. In practice, vocabularies can be very complex (with several thousands of terms) or very simple (describing one or two concepts only)."
Moreover, the W3C usefully distinguishes two types of vocabularies:
- value vocabularies or sets of controlled values used to categorize and classify things. These are also known as Knowledge Organization Systems (KOS) and include classifications, code lists, thesauri, even certain types of ISO standards that prescribe controlled lists of values;
- metadata element sets that prescribe what features or properties should be used to describe things. They are also called schemas, or description vocabularies. Examples include XML Schema and RDF Schema, formal languages for describing entities in XML and RDF respectively. Other examples include ontologies, application profiles, and UML models.
The statistical classifications that are the focus of Caliper fall under the first type. SKOS, the formal language we use to express statistical classifications in a machine-readable format, is an example of the second type. Specifically, SKOS is a vocabulary for RDF, tailored to express thesauri on the web.
Caliper is not an official FAO website and the data it contains should be considered as experimental. For this reason, no license is explicitly given.
Because each tool has different features and advantages. Briefly:
Skosmos: offers a neat hierarchy-like visualization, and search by code and labels in different languages. It is designed to display SKOS thesauri and does not support OWL or SKOS-XL. Correspondences are shown together with the classification entry they refer to.
PMKI: it is the read-only version of VocBench, so it allows users to see everything present in the editor environment, including OWL ontologies and different SKOS concept schemes.
Drupal: being the content management system used to build the website, it ensures a uniform look-and-feel across the entire Caliper space. It is highly customizable. Correspondences may be visualized and searched independently of the classifications they refer to.
You probably opened the csv from Excel. If you opened it with a text editor, you would see all leading zeroes correctly in place. To be able to see them in Excel too, follow the instructions given by the Office Support website.
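If you process the file with a script instead of a spreadsheet, the codes stay intact as long as they are read as text. The sketch below uses Python's standard `csv` module; the column names `code` and `label` and the sample rows are assumptions for illustration, not the actual layout of Caliper downloads.

```python
import csv
import io

# Stand-in for a downloaded file. A real file would be opened with
# open(path, newline="", encoding="utf-8") instead of io.StringIO.
SAMPLE = "code,label\n0111,First item\n0112,Second item\n"


def read_rows(handle):
    """Return rows as dictionaries; the csv module keeps every field as text."""
    return list(csv.DictReader(handle))


rows = read_rows(io.StringIO(SAMPLE))
print([row["code"] for row in rows])  # ['0111', '0112']
```

Leading zeroes are lost only when a tool converts the codes to numbers. With pandas, passing `dtype=str` to `read_csv` keeps them as text as well.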
Caliper works well with all formats commonly used to store or pass classifications around, such as CSV, XLS, DB dump, or JSON. However, you should also consider *how* the data is organized internally (for example, how the classification hierarchy is rendered, or in how many files the classification is split) because while all formats will ultimately be converted, some may require more effort than others (as a general rule, the more ad-hoc your structure is, the more effort will be needed for conversion).
True. International organizations based on membership do have a temporal dimension, as over time new members join and others leave. This is something we plan on including in Caliper. We expect to use the same mechanism already adopted for managing groups and alternative membership, such as in the case of the alternative membership of SDG country grouping and UNSD M49.
We also have another line of work related to the time dimension of geographical information, for "former countries" (countries that no longer exist as political entities because of changes in their name, territorial extent, etc.). We would like to include those countries in their temporal and geographical context, for example to be able to extract the composition of geographical groups at a given time.
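Extracting the composition of a group at a given time can be sketched as follows. This is not the mechanism planned for Caliper (which is expected to rely on groups, as explained above); it is a minimal illustration that assumes each membership carries a start year and an optional end year. Group and member codes are invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Membership:
    group: str
    member: str
    start: int                  # first year of membership
    end: Optional[int] = None   # last year of membership, None if still a member


def composition(memberships, group, year):
    """Return the sorted members of a group in a given year."""
    return sorted(
        m.member
        for m in memberships
        if m.group == group
        and m.start <= year
        and (m.end is None or year <= m.end)
    )


# Invented data: BBB leaves the group after 2010, CCC joins in 2005.
DATA = [
    Membership("G", "AAA", 1990),
    Membership("G", "BBB", 1995, 2010),
    Membership("G", "CCC", 2005),
]

print(composition(DATA, "G", 2000))  # ['AAA', 'BBB']
print(composition(DATA, "G", 2015))  # ['AAA', 'CCC']
```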
You are right. Data in Caliper is password-protected because it is not yet an official service of FAO. However, this does not mean that you are not welcome to access Caliper, on the contrary! When you are prompted for login and password, use "caliper" and "caliper". When you are prompted for only one password, use "caliper". Easy :)
We have included some parts of HS in Caliper, drawn from all three versions: 2007, 2012 and 2017. You see only one entry point for HS in Caliper because it is managed internally as a single skos:ConceptScheme, organized into subsets (skos:Collection). Note that the three versions are not included in full, but only the parts that are "targets" of mappings given for other classifications. For example, given the mapping table CPC2.1 -> HS 2017, we extracted all items in HS corresponding to some items in CPC. This was done mostly for our convenience. HS is actually not freely available as a whole.
Persistent identifiers will be needed once Caliper is an official service of FAO. At the moment, since it is an experimental project, we do not guarantee the persistence of Caliper URIs.
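The extraction of mapping "targets" into one subset per version can be pictured with the sketch below. It is an illustration of the idea only, not the procedure actually used: the row layout, the version names used as keys and all codes are placeholders.

```python
from collections import defaultdict

# Placeholder rows: (source item, target HS version, target HS code).
MAPPING_ROWS = [
    ("cpc:a", "HS2017", "t1"),
    ("cpc:a", "HS2017", "t2"),
    ("cpc:b", "HS2017", "t1"),
    ("cpc:c", "HS2012", "t3"),
]


def target_subsets(rows):
    """Group the distinct target codes by version, one subset per version."""
    subsets = defaultdict(set)
    for _source, version, code in rows:
        subsets[version].add(code)
    return {version: sorted(codes) for version, codes in subsets.items()}


print(target_subsets(MAPPING_ROWS))  # {'HS2017': ['t1', 't2'], 'HS2012': ['t3']}
```

Each resulting subset would correspond to one skos:Collection inside the single HS concept scheme.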
Since Caliper provides different functionalities, different skills are involved, but an understanding of the RDF technology stack is essential to all. In some more detail:
- For the first conversion of classifications/mappings into an RDF Linked Data vocabulary, a deep knowledge of those technologies and modelling practices is a must, together with a deep understanding of the features and requirements of statistical classifications. This is also needed to plan improvements to available features or new extensions. The actual conversion from any file into RDF for Caliper is achieved programmatically, either via Python scripts or using the Sheet2RDF tool available inside VocBench.
- Once converted into RDF datasets, most maintenance will be done by content experts, those who normally maintain classifications, or perhaps translators if translations are needed.
- Once converted into RDF datasets, the conversion into alternative formats (other than RDF serializations) for download is done programmatically.
- The site is maintained in Drupal, a content management system that implements many functionalities to interact with RDF data. For example, the lookup on mappings is implemented directly in Drupal.
- The maintenance of the IT infrastructure requires good sys admin skills together with a solid understanding of the RDF technology stack.
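To give an idea of what a script-based conversion into RDF involves, here is a minimal sketch that turns CSV rows into SKOS statements in N-Triples. It is not the script used for Caliper: the namespace, the column names (`code`, `label_en`, `parent`) and the sample rows are assumptions, and literal escaping is simplified to backslashes and double quotes.

```python
import csv
import io

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "https://example.org/classification/"  # hypothetical namespace

# Stand-in for a source file with assumed column names.
SAMPLE = "code,label_en,parent\n01,Group one,\n011,Subgroup one,01\n"


def literal(text, lang=None):
    """Format a string as an N-Triples literal (simplified escaping)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"' + (f"@{lang}" if lang else "")


def rows_to_ntriples(handle):
    """Emit one skos:Concept per row, with notation, label and broader link."""
    lines = []
    for row in csv.DictReader(handle):
        subject = f"<{BASE}{row['code']}>"
        lines.append(f"{subject} <{RDF_TYPE}> <{SKOS}Concept> .")
        lines.append(f"{subject} <{SKOS}notation> {literal(row['code'])} .")
        lines.append(f"{subject} <{SKOS}prefLabel> {literal(row['label_en'], 'en')} .")
        if row["parent"]:  # top-level items have no broader concept
            lines.append(f"{subject} <{SKOS}broader> <{BASE}{row['parent']}> .")
    return lines


for line in rows_to_ntriples(io.StringIO(SAMPLE)):
    print(line)
```

The more ad hoc the source layout is (several files, hierarchy implied by indentation, merged cells), the more of this kind of code is needed before the data reaches a regular shape.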
Maintenance of statistical classifications in Caliper. How far do the available tools allow editors to ignore the underlying RDF technology?
The editing tool used in Caliper is VocBench (version 3). It is a powerful tool, able to support the editing of both classifications (as RDF vocabularies) and the OWL models (ontologies) they use, where applicable. VocBench also fully supports editorial workflows, so that some users will only be able to add translations, for example, while others are allowed to approve changes and perform more complicated operations. Therefore, the level of knowledge of RDF required for the maintenance of classifications in VocBench very much depends on your role in the project. We have developed guidelines for editors, currently under testing within the FAO Statistics Division.
Do you implement Linked Data Platform Collections (LDPC) W3C recommendation (read/write) for Registers API?
Yes. The first step for the inclusion of classifications into Caliper is a conversion from format x into RDF. Immediately afterwards, the RDF is included in a VocBench project, where it may be further enriched (here, the Sheet2RDF tool integrated in VocBench is very useful) or checked for integrity etc. All projects are then finalized in VocBench and from there passed on to all other services included in Caliper.
Yes, there is a plan to include WRB 2006, considered the most used in the area of soil science. As for correspondences, we rely on input from soil scientists!