Describing a Schema for RVL

This post is about the schema of RVL and discusses alternatives for defining this schema. It is not about the features of RVL itself. For space reasons, this discussion could not be published in the RVL paper. As there are now efforts from the Dublin Core Metadata Initiative (RDF Application-Profiles Group) as well as from the W3C (Data Shapes Working Group) towards a standard for defining prescriptive constraints for the RDF technical space, this blog post describes RVL as a case study. A summary of the requirements derivable from the RVL use cases is given at the end of this post.

In order to clarify the distinction between mappings defined with RVL, the RVL schema, and languages used to define this schema, the following table gives an overview of the different “language levels” in the context of RVL. We will look at the first row, i.e., languages that can be used to define schemata:

Language level | Name | Example (natural language)
Schema languages | RDF(S) / OWL / SPIN | An rvl:PropertyMapping is an owl:Class. An rvl:PropertyMapping has exactly 1 rvl:targetGraphicRelation.
Schema | RVL | rvl:PropertyMapping
Model | Concrete RVL mappings | ex:Cost2LightnessMapping

For defining the schema of RVL, multiple options existed. We discuss them in the following, based on three additional requirements that we place on the schema. Finally, we briefly describe our approach for defining the RVL schema using a combination of OWL axioms and SPIN constraints.

  • LR-15 The schema of the language must be restrictive and expressive enough to derive tooling from it.
  • LR-16 The schema language must be aware of ontology semantics, not only URIs.
  • LR-17 Constraints in the mapping language’s schema and constraints defined in VISO/facts should be handled consistently.

First, in order to allow for the generation of mapping editors from the language description, the RVL schema should define, in a tool-usable way, what a valid mapping in RVL is (LR-15). Allowing for the derivation of an editor from a language’s schema contributes to the extensibility of the language, since it reduces redundancy and allows the editor to change automatically along with the language definition. By using constraints directly from the RVL schema, we avoid having to hard-code this knowledge again in the source code when implementing a guided editor for the mapping language.

Second, since we want to easily define constraints on our mapping types, the schema language should be aware of ontology semantics (LR-16), and not only aware of URIs. It will frequently occur that constraints of the mapping language have to reference VISO/graphic terms such as in the following constraint (here defined in natural language):

“An rvl:PropertyMapping
always maps an rdf:Property
to a viso-graphic:GraphicRelation.”

Third, an additional requirement exists because we want to use RVL in a semi-automatic visualization system: External rules on graphic syntax and human perception that are based on facts from the VISO/facts module will have to be accessed for constructing editors. Therefore, we require that both the constraints from the mapping language’s schema and the constraints defined in VISO/facts can be handled consistently (LR-17).

One option was to stay completely within the RDF-based ontology technical space. This is suggested by the fact that both the source data and the graphic elements we are mapping onto are RDF-based. Defining our mapping language RVL with RDF-based technologies as well could therefore help to avoid technological breaks. Restrictions that use terms from the source ontologies and VISO could easily be defined, e.g., via OWL class restrictions, and the domain and range of properties could also easily be stated with OWL. An additional benefit is that mapping definitions, instantiating an RDF-based vocabulary, could conveniently be shipped along with the data they are visualizing. Furthermore, since each mapping is an RDF resource, globally uniquely identified via a URI, linked data principles apply to it. This contributes to the requirement of shareable mappings, since other users can dereference mappings and reuse them in their own visualisations. The authors of Fresnel chose this approach and defined the vocabulary using OWL (cf. right column of Table 2).

The problem with this approach is that class restrictions and domain–range settings defined in OWL are not meant to prescribe valid user input, but to derive new knowledge under the open world assumption. For this reason, we do not consider OWL (alone) appropriate for defining the RVL language, since the regular OWL semantics and the corresponding tools are not applicable. While OWL may also be interpreted with different (closed world) semantics and specific tooling could be built, OWL also lacks constructs such as defaults and attributes for conveniently defining a rich prescriptive schema.
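The difference between deriving knowledge and prescribing input can be made concrete with a small sketch. The following Python snippet is an illustration only and not part of RVL or of any RDF library: triples are plain tuples, the `ex:` resources are hypothetical, and the only schema statement is the domain of rvl:sourceValue mentioned in the requirements summary of this post. Under the open-world reading, the domain statement yields a new type assertion; under a closed-world reading, the same input is reported as a violation.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"

# Schema statement taken from the inference example in this post:
# everything that has an rvl:sourceValue is an rvl:ValueMapping.
schema = {("rvl:sourceValue", RDFS_DOMAIN, "rvl:ValueMapping")}

# Hypothetical user input: the subject has no explicit rdf:type.
data = {("ex:someMapping", "rvl:sourceValue", "ex:someValue")}


def infer_domain_types(schema, data):
    """Open-world reading: rdfs:domain entails new rdf:type triples."""
    domains = {(s, o) for s, p, o in schema if p == RDFS_DOMAIN}
    return {
        (s, RDF_TYPE, cls)
        for s, p, _ in data
        for prop, cls in domains
        if p == prop
    }


def check_domain_closed_world(schema, data):
    """Closed-world reading: a missing rdf:type is reported as a violation."""
    domains = {(s, o) for s, p, o in schema if p == RDFS_DOMAIN}
    return sorted(
        (s, p, cls)
        for s, p, _ in data
        for prop, cls in domains
        if p == prop and (s, RDF_TYPE, cls) not in data
    )


inferred = infer_domain_types(schema, data)
violations = check_domain_closed_world(schema, data)
```

Here `inferred` contains the derived type triple, while `violations` lists the same subject as an error. An editor needs the second behaviour, whereas a standard reasoner provides the first.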

Another option was to write the RVL schema in a different technical space, such as the meta-modeling technical space or the grammar technical space, and only reference the ontology resources via their URIs (left column of Table 2). If the mapping language were defined by grammar rules or meta-model constraints under a closed world assumption, tooling for constraint-based guidance (editors, warning messages, auto-suggest functions) could conveniently be generated from these constraints with existing technologies. Around ECore, a popular base for meta-modeling, the Eclipse Modeling Framework and many frameworks on top of it support building textual or graphical editors for ECore-based languages. On the downside, if ontologies need to be transformed (e.g., to ECore), it will be difficult with this approach to adapt dynamically to extensions of the ontological models. Re-modeling ontologies in ECore is a drawback when we want to access changing external knowledge bases and consider facts stored in these knowledge bases in our constraints.

We chose the first option and decided to stay within the ontology technical space. However, we use RDFS/OWL only for modeling the abstract syntax of RVL, and use SPIN for defining the constraints of the RVL schema (center column of Table 2). With TopBraid Composer, a modeling environment is available that can be used to build an editor that supports syntactic guidance for creating valid RVL mappings and, at the same time, allows for conveniently accessing VISO/graphic resources as well as visualization rules from the VISO/facts knowledge base. Table 2 summarises our comparison of the three options described above with respect to expressiveness, use of standards, support for shareability of mappings, availability of tooling, and support for guidance based on both schema knowledge and external facts from existing knowledge bases. In the left column, we use the concrete solution of OWLtext as an example of the second approach.


Table 2 - Comparison of three options for specifying the RVL schema

Table 2 – Comparison of three techniques to define schemata and derive tooling from these schemata: The right column represents the approach of staying completely within the ontology technical space, exemplified by the solution chosen to define Fresnel (FSL stands for the Fresnel Selector Language). The left column represents the approach of bridging the ontology and meta-modeling technical spaces, exemplified by OWLtext. The center column shows our choice of using OWL in combination with SPIN for the definition of constraints, combining open and closed world reasoning.


Concrete Examples of the RVL Schema Defined with RDFS/OWL and SPIN

In the following, we provide a set of concrete examples to illustrate which parts of RVL are defined with RDFS/OWL and which parts with SPIN and how a SPIN constraint can be defined. Types and relations of RVL are defined with RDFS and OWL:

rvl:PropertyMapping rdfs:subClassOf rvl:Mapping .

rvl:sourceProperty a rdf:Property .

rvl:Clamp a rvl:OutOfBoundHandlingType ;
   rdfs:label "clamp"^^xsd:string ;
   dct:description "Values outside the defined interval are set to
      the boundaries of the interval."^^xsd:string .

In order to prescribe how a mapping type must be used, SPIN is used as a constraint language. First of all, SPIN allows for storing SPARQL queries as RDF. In addition, properties and classes such as spin:constraint and spl:Attribute enable the definition of prescriptions such as attributes, which constrain the usage of certain properties in the context of a specific class. In the following listing, we show how SPIN can be used to express the example constraint that we already (partially) introduced above in natural language. It states that an rvl:PropertyMapping always maps exactly one rdf:Property to exactly one viso-graphic:GraphicRelation:

rvl:PropertyMapping
   spin:constraint
      [ a spl:Attribute ;
      rdfs:comment "There has to be exactly one target graphic relation." ;
      spl:maxCount 1 ;
      spl:minCount 1 ;
      spl:predicate rvl:targetGraphicRelation ;
      spl:valueType viso-graphic:GraphicRelation
   ]
# ... analogous constraint for rvl:sourceProperty
.
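To illustrate what such an attribute prescribes, the following Python sketch checks minimum count, maximum count and value type for one subject under a closed-world reading. It is a simplified assumption of how an attribute could be evaluated, not the SPIN implementation: triples are plain tuples, `check_attribute` is a hypothetical helper, `ex:lightness` and `ex:IncompleteMapping` are made-up resources, and subclass reasoning on the value type is ignored.

```python
RDF_TYPE = "rdf:type"


def check_attribute(triples, subject, predicate, min_count, max_count, value_type):
    """Return violation messages for one subject; an empty list means valid."""
    values = sorted(o for s, p, o in triples if s == subject and p == predicate)
    problems = []
    if len(values) < min_count:
        problems.append(
            f"{predicate}: expected at least {min_count} value(s), found {len(values)}"
        )
    if len(values) > max_count:
        problems.append(
            f"{predicate}: expected at most {max_count} value(s), found {len(values)}"
        )
    for value in values:
        # Simplification: only direct rdf:type statements are considered.
        if (value, RDF_TYPE, value_type) not in triples:
            problems.append(f"{predicate}: {value} is not typed as {value_type}")
    return problems


# Hypothetical model: one complete mapping and one without a target.
triples = {
    ("ex:Cost2LightnessMapping", RDF_TYPE, "rvl:PropertyMapping"),
    ("ex:Cost2LightnessMapping", "rvl:targetGraphicRelation", "ex:lightness"),
    ("ex:lightness", RDF_TYPE, "viso-graphic:GraphicRelation"),
    ("ex:IncompleteMapping", RDF_TYPE, "rvl:PropertyMapping"),
}

complete = check_attribute(
    triples, "ex:Cost2LightnessMapping", "rvl:targetGraphicRelation",
    1, 1, "viso-graphic:GraphicRelation",
)
incomplete = check_attribute(
    triples, "ex:IncompleteMapping", "rvl:targetGraphicRelation",
    1, 1, "viso-graphic:GraphicRelation",
)
```

The first call returns no problems, while the second reports the missing target graphic relation. Messages of this kind are what a guided editor could show to the user.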

Attributes encapsulate a SPARQL query which can be evaluated to decide whether some property is used as required. The simplest kind of query is a SPARQL ASK query: when it returns true (“yes”), the constraint is violated; when it returns false (“no”), the RVL model meets the constraint. The listing below shows such a constraint that is simply stored as a SPARQL ASK query. For better readability, we show the SPARQL query in the usual syntax and not as SPIN, i.e., not stored as RDF triples.

rvl:PropertyMapping
   rvl:constraintBasedOnVisoFacts
   [ a sp:Ask ;
     rdfs:comment "Expressiveness - The chosen visual means cannot express the given source property (based on its defined scale of measurement)";
     sp:where (
       ASK WHERE {
       ?this rvl:sourceProperty ?sp .
       ?this rvl:targetGraphicRelation ?tvm .
       ?sp viso-data:has_scale_of_measurement/(rdfs:subClassOf)* ?spSoM .
       ?spSoM (rdfs:subClassOf)+ viso-data:Scale_of_Measurement .
       ?data_kind rdfs:subClassOf ?restriction .
       ?restriction owl:onProperty viso-data:has_scale_of_measurement .
       ?restriction owl:allValuesFrom ?spSoM .
       ?tvm viso-facts:not_expresses ?data_kind .
       FILTER (false) .
       }
     )
   ].

An important thing to note is that the last example connects to an external knowledge base, using the viso-facts ontology. This goes beyond the constraints defined above, which only reference RVL concepts.
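The idea of evaluating a constraint over the mapping model together with an external knowledge base can be sketched as follows. This Python snippet is a strongly simplified assumption, not a translation of the query above: the two graphs are sets of tuples that are merged before the check, the facts relate a graphic relation directly to a scale of measurement (the query above goes through OWL restrictions on data kinds), and all `ex:` resources are hypothetical. As with ASK constraints, a result of `True` means that the constraint is violated.

```python
def ask_cannot_express(graph, mapping):
    """ASK-style check: True means the constraint is violated."""
    source_props = {
        o for s, p, o in graph if s == mapping and p == "rvl:sourceProperty"
    }
    targets = {
        o for s, p, o in graph if s == mapping and p == "rvl:targetGraphicRelation"
    }
    for sp in source_props:
        scales = {
            o for s, p, o in graph
            if s == sp and p == "viso-data:has_scale_of_measurement"
        }
        for tvm in targets:
            for scale in scales:
                if (tvm, "viso-facts:not_expresses", scale) in graph:
                    return True
    return False


# Hypothetical mapping model.
mappings = {
    ("ex:Cost2ShapeMapping", "rvl:sourceProperty", "ex:cost"),
    ("ex:Cost2ShapeMapping", "rvl:targetGraphicRelation", "ex:shape"),
    ("ex:cost", "viso-data:has_scale_of_measurement", "ex:Quantitative"),
}

# Hypothetical external facts, kept in a separate graph.
facts = {("ex:shape", "viso-facts:not_expresses", "ex:Quantitative")}

violated_with_facts = ask_cannot_express(mappings | facts, "ex:Cost2ShapeMapping")
violated_without_facts = ask_cannot_express(mappings, "ex:Cost2ShapeMapping")
```

Without the facts graph the mapping passes, which shows that the outcome depends on knowledge that lives outside the RVL schema.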

Summary of Requirements for a Standard Schema Language Derivable from RVL Use Cases

In order to replace the current OWL+SPIN schema of RVL by a (future) standard “schema language”, this language needs to fulfill the following requirements. In brackets, we list requirements and use cases from the rdf-validation requirements database that match exactly or seem related (~) to the given items (WIP):

  • Class-based restriction of value type and cardinality for properties [R-76, R-75, (opt: R-74), R-17]
  • “Convention over Configuration” –> Definition of default values [R-31, R-38]
  • Constraints may be based on knowledge from various graphs
    • optional: Import of graphs only for the purpose of being used for constraints (complementing owl:imports) [UC-Editor-2]
  • Differentiate between constraint violation levels (e.g., error / warning / ...?) [~UC-3, R-205, ~UC-7]
  • Support of Guidance / UI Construction [R-195, R-125]
    • could be an additional (low) constraint violation level: “recommended”. Only values matching on the “recommended” constraint level are suggested to users via UIs. [UC-3, ~R-205, R-72]
    • opt: extensible constraint levels (for example “not effective”,  “not expressive” in the visualization context)
    • opt: Define quick fixes along with the constraints (to avoid hardcoding them in a platform dependent way) [?, R-192]
  • Combinable with open world reasoning –> interpreting OWL (completely) as closed world is not an option. For example, we still want to use standard reasoners to conclude … [R-173]
    • Every rvl:PropertyMapping is an rvl:Mapping
    • Everything having an rvl:sourceValue is an rvl:ValueMapping
  • Concise, compact definition of frequently used constraints [R-184]
    (in the example above we use spl:Attribute)
    • –> extensibility and reusability, e.g. by a templating mechanism
  • Document the constraints in natural language [R-192]
  • SPARQL queries can be used to formulate complex constraints [R-188, R-186]
    • optional: Path selector language for convenient, compact selector expressions in simpler constraints [~R-103]
  • Recommend properties from third-party vocabularies to be used, even when not constraining their usage [~UC-Editor-4]
  • optional (may also be an extra language like Forms, or Fresnel could be reused): Suggest visibility and order of widgets for each class [~UC-OER-5]

The above-mentioned requirements can be connected to two main use cases, and most of them relate to both:

  1. Support of Guidance / UI Construction
  2. Validation
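
How one set of constraints could serve both use cases is sketched below. The Python snippet is only an assumption of how violation levels might be used, not an existing API: the `Level` names follow the levels discussed in the list above, and the candidate graphic relations as well as the three constraints are hypothetical. Validation only rejects candidates that violate a constraint at the error level, whereas guidance suggests only candidates that pass every level, including “recommended”.

```python
from enum import IntEnum


class Level(IntEnum):
    RECOMMENDED = 1
    WARNING = 2
    ERROR = 3


# Hypothetical set of graphic relations known to the system.
KNOWN_RELATIONS = {"ex:position", "ex:lightness", "ex:shape"}

# Each constraint: (level, message, function returning True when violated).
CONSTRAINTS = [
    (Level.ERROR, "unknown graphic relation", lambda c: c not in KNOWN_RELATIONS),
    (Level.WARNING, "not expressive for this data", lambda c: c == "ex:shape"),
    (Level.RECOMMENDED, "a more effective choice exists", lambda c: c == "ex:lightness"),
]


def violations(candidate):
    """Collect (level, message) pairs for all violated constraints."""
    return [(level, msg) for level, msg, violated in CONSTRAINTS if violated(candidate)]


def is_valid(candidate):
    """Validation use case: only error-level violations make input invalid."""
    return all(level < Level.ERROR for level, _ in violations(candidate))


def suggest(candidates):
    """Guidance use case: suggest only candidates without any violation."""
    return [c for c in candidates if not violations(c)]


suggested = suggest(["ex:position", "ex:lightness", "ex:shape", "ex:typo"])
```

With these example constraints, only `ex:position` is suggested, although `ex:lightness` and `ex:shape` would still be accepted as valid input.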