Brief analysis of Ontolex

These days I’m trying to face one of my unresolved matters: having a fine-grained look at semantic linguistic models. Of course it starts with the W3C Community Group Report Ontolex, and as I go step by step, I focus on the basic module Ontology-lexicon interface (ontolex). All quotes in this text are taken from https://www.w3.org/2016/05/ontolex.

After reading the detailed documentation provided by the community group, including descriptions, examples and diagrams, I usually try to understand the model in my own way by looking at the ontology code. To do so, I reproduce the ontology code retrieved from the URI http://www.w3.org/ns/lemon/ontolex# on the 25th of March, 2018, by creating a diagram following an adapted UML_Ont profile for ontologies. In this post, I’ll try to show the benefits of combining both approaches, that is, understanding the model from the documentation and from the code, double-checking definitions against implementation. It should be mentioned that this analysis is by no means intended to be exhaustive.

For domains and ranges that are not defined in the ontology, I display owl:Thing, so that an instance of any class could be placed there. During this analysis I found out that this is not always appropriate, as in this particular model some properties are meant to link to classes instead of instances. In any case, this decision turned out to be useful, since it exposed some differences in how the properties are defined.
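As a minimal sketch of that drawing convention, the Python snippet below falls back to owl:Thing whenever a domain or range is not declared. The property records and the ex: names are hypothetical and only serve as illustration; they are not taken from the ontolex code.

```python
OWL_THING = "owl:Thing"


def diagram_ends(prop):
    """Return the (domain, range) pair to draw, using owl:Thing for undeclared ends."""
    return (prop.get("domain") or OWL_THING, prop.get("range") or OWL_THING)


# Hypothetical property records, not taken from the ontolex code.
properties = {
    "ex:declaredBoth": {"domain": "ex:A", "range": "ex:B"},
    "ex:rangeOnly": {"range": "ex:B"},
    "ex:nothingDeclared": {},
}

ends = {name: diagram_ends(record) for name, record in properties.items()}
```

Keeping the fallback in one place makes it easy to spot afterwards which ends of a property were drawn by convention rather than declared.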

The resulting diagram is:

ontolex

While creating the diagram I observed that the inverse properties ontolex:isLexicalizedSenseOf and ontolex:lexicalizedSense are actually defined with the same domain and range. It seems that ontolex:isLexicalizedSenseOf domain and range should be interchanged.
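This kind of check can be sketched in a few lines of Python: for each pair of inverse properties, the domain of one should be the range of the other. The concrete class names below are an assumption made for illustration (I assume both properties were declared with the same pair of classes); the snippet does not read the real ontology file and ignores subclass reasoning.

```python
def inverse_mismatches(declarations, inverse_pairs):
    """Return the inverse pairs whose (domain, range) are not mirrored."""
    problems = []
    for p, q in inverse_pairs:
        p_domain, p_range = declarations[p]
        q_domain, q_range = declarations[q]
        # For inverse properties we expect domain(p) == range(q) and range(p) == domain(q).
        if (p_domain, p_range) != (q_range, q_domain):
            problems.append((p, q))
    return problems


# Illustrative values (assumption): both properties share the same domain and range.
declarations = {
    "ontolex:lexicalizedSense": ("ontolex:LexicalConcept", "ontolex:LexicalSense"),
    "ontolex:isLexicalizedSenseOf": ("ontolex:LexicalConcept", "ontolex:LexicalSense"),
}
pairs = [("ontolex:lexicalizedSense", "ontolex:isLexicalizedSenseOf")]

mismatches = inverse_mismatches(declarations, pairs)

# Swapping domain and range of the second property makes the pair consistent.
fixed = dict(declarations)
fixed["ontolex:isLexicalizedSenseOf"] = ("ontolex:LexicalSense", "ontolex:LexicalConcept")
mismatches_after_fix = inverse_mismatches(fixed, pairs)
```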

In the case of the property ontolex:isDenotedBy, since I display undeclared domains as owl:Thing, its domain in the diagram does not match the range of its inverse, which is declared as rdfs:Resource. Even though this is not a critical issue and there are usage examples in the documentation, it might be a good idea to make it explicit in the code as well. According to the documentation provided for ontolex, the expected domain would be rdfs:Resource, given the following explanation:

Note that the target of a denotation does not need to be an individual in the ontology but may also refer to a class, property or datatype property defined by the ontology.

However, one should keep in mind that using, for example, ontology classes as objects of a materialized property could push the model into OWL Full, as a URI would act as an individual and a class at the same time.

Another issue with the domains appears in the property chain ontolex:sense o ontolex:reference -> ontolex:denotes, where one would expect the range of the first property to be compatible with the domain of the second one in the antecedent. Such a chain is supposed to be used as documented in section 3.4. However, looking at the diagram extracted from the ontology, the domain of ontolex:reference does not quite match what the examples and the core lemon ontolex diagram suggest. According to the OWL code, the domain of ontolex:reference is the union of ontolex:LexicalEntry and synsem:OntoMap. In this case, ontolex:LexicalEntry should be replaced in the domain by ontolex:LexicalSense, according to the HTML documentation:

Reference (Object Property)

The reference property relates a lexical sense to an ontological predicate that represents the denotation of the corresponding lexical entry.

Domain: LexicalSense or synsem:OntoMap
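The compatibility check on the chain can be sketched as follows. The two domains of ontolex:reference come from the text above; the range of ontolex:sense is an assumption on my side (ontolex:LexicalSense, as the documentation suggests). The check is deliberately naive: it only tests membership in the union and ignores any subclass reasoning.

```python
def chain_is_compatible(first_range, second_domain_union):
    """Naive check: the range of the first property is one of the classes
    in the (union) domain of the second property."""
    return first_range in second_domain_union


# Assumption: ontolex:sense has range ontolex:LexicalSense.
sense_range = "ontolex:LexicalSense"

# Domain of ontolex:reference as found in the OWL code versus the HTML documentation.
reference_domain_in_code = {"ontolex:LexicalEntry", "synsem:OntoMap"}
reference_domain_in_docs = {"ontolex:LexicalSense", "synsem:OntoMap"}

compatible_in_code = chain_is_compatible(sense_range, reference_domain_in_code)
compatible_in_docs = chain_is_compatible(sense_range, reference_domain_in_docs)
```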

Having seen these issues with the domains and ranges, I realized I hadn’t checked what OOPS! could spot (how odd...). Apart from OOPS!’s usual complaints, there is one interesting issue: “P40. Namespace hijacking” (powered by Triple-Checker). In this case the elements in which the pitfall is detected are:

The third case seems to be a false positive: even though TripleChecker reports a difference of one character, I can’t actually find it.

Regarding this issue, it is worth noting the following line of the RDF/XML code:

  • Line 620: <owl:Class rdf:about="&rdf;Resource"/>

The element “Resource” here is defined in the rdf namespace instead of rdfs, where it is originally defined as “rdfs:Resource a rdfs:Class”.
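To make the pitfall concrete, here is a small Python sketch of the idea behind a namespace hijacking check: a URI is suspicious when it lives in a well-known namespace but its local name is not one of the terms that namespace defines. The term lists are a small hand-written subset (an assumption for the example, so real terms missing from it would be flagged wrongly), and this is not how OOPS! or Triple-Checker are implemented.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Partial, hand-written subsets of the terms defined in each namespace.
KNOWN_TERMS = {
    RDF: {"type", "Property", "Statement", "List", "first", "rest", "nil", "langString"},
    RDFS: {"Resource", "Class", "Literal", "subClassOf", "domain", "range", "label", "comment"},
}


def is_hijacked(uri):
    """True when the URI uses a known namespace with a local name it does not define."""
    for namespace, terms in KNOWN_TERMS.items():
        if uri.startswith(namespace):
            return uri[len(namespace):] not in terms
    # URIs outside the known namespaces are not judged by this sketch.
    return False


# "&rdf;Resource" expands to the rdf namespace followed by "Resource".
line_620_uri = RDF + "Resource"
hijacked = is_hijacked(line_620_uri)
intended_ok = not is_hijacked(RDFS + "Resource")
```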

Finally, from a user’s point of view, I usually find it helpful when the elements from other vocabularies that are intended to be used in the model, for example from skos, are also included in the code and the documentation in a consistent way. For instance, skos:definition appears in the HTML documentation but not in the OWL code, whereas skos:Concept and skos:ConceptScheme do appear in the code, since subclasses of them are defined. The property skos:definition could also be included, perhaps with a local restriction on the class expected to have such an attribute, ontolex:LexicalConcept, according to the documentation:

“A definition can be added to a lexical concept as a gloss by using the skos:definition property.”

Acknowledgments: I’d like to thank Julia Bosque Gil for all the help with linguistic models and for the comments on this post.


Some ideas for ontology graphical representations

After a long time of inactivity, I’ve decided to write this blog post to share some practices I usually follow when generating diagrams to represent ontologies. During the last months I’ve been sharing some materials with different people and I thought it would be nice to have one place to gather them, and maybe provide some explanations where needed. This post by no means aims to be a good-practices guide or reference. It is not even complete: issues such as cardinalities and complex axioms are out of scope.

First of all, I must acknowledge the UML_Ont profile developed within the NeOn project, as it has so far been my reference notation for representing ontologies. However, as I’ll explain later, sometimes I adapt the profile where I find it more convenient.

It should be mentioned that the original UML_Ont profile uses custom stereotypes and dependencies to cover OWL 1 constructs. In this post (yes, some of this content comes from my thesis), we align the stereotypes used in the profile to OWL and RDF(S) constructs as follows:

UML_Ont profile               OWL primitives adaptation
ObjectAllValuesFrom           owl:allValuesFrom
ObjectSomeValuesFrom          owl:someValuesFrom
ObjectIntersectionOf          owl:intersectionOf
ObjectUnionOf                 owl:unionOf
SubClassOf                    rdfs:subClassOf
EquivalentClasses             owl:equivalentClass
DisjointClasses               owl:disjointWith
ObjectPropertyDomain          rdfs:domain
DataPropertyDomain            rdfs:domain
ObjectPropertyRange           rdfs:range
DataPropertyRange             rdfs:range
EquivalentObjectProperties    owl:equivalentProperty
EquivalentDataProperties      owl:equivalentProperty
InverseObjectProperties       owl:inverseOf
Transitive                    owl:TransitiveProperty
Symmetric                     owl:SymmetricProperty
ClassType                     rdf:type
not                           owl:complementOf

Alternatives and additional notations for properties, axioms and individuals are defined in UML_Ont profile.
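If the diagrams are generated or checked with a script, the alignment above can be kept as a simple lookup. The Python sketch below copies the pairs from the list above; the function names are my own and hypothetical, and unknown stereotypes are simply returned unchanged.

```python
# Pairs copied from the alignment above: UML_Ont stereotype -> OWL/RDF(S) primitive.
STEREOTYPE_TO_PRIMITIVE = {
    "ObjectAllValuesFrom": "owl:allValuesFrom",
    "ObjectSomeValuesFrom": "owl:someValuesFrom",
    "ObjectIntersectionOf": "owl:intersectionOf",
    "ObjectUnionOf": "owl:unionOf",
    "SubClassOf": "rdfs:subClassOf",
    "EquivalentClasses": "owl:equivalentClass",
    "DisjointClasses": "owl:disjointWith",
    "ObjectPropertyDomain": "rdfs:domain",
    "DataPropertyDomain": "rdfs:domain",
    "ObjectPropertyRange": "rdfs:range",
    "DataPropertyRange": "rdfs:range",
    "EquivalentObjectProperties": "owl:equivalentProperty",
    "EquivalentDataProperties": "owl:equivalentProperty",
    "InverseObjectProperties": "owl:inverseOf",
    "Transitive": "owl:TransitiveProperty",
    "Symmetric": "owl:SymmetricProperty",
    "ClassType": "rdf:type",
    "not": "owl:complementOf",
}


def adapt_stereotype(name):
    """Return the adapted primitive, or the name itself when it is not in the table."""
    return STEREOTYPE_TO_PRIMITIVE.get(name, name)


def stereotype_label(name):
    """Format the adapted primitive as it would appear on a diagram."""
    return "«" + adapt_stereotype(name) + "»"
```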

Based on this UML_Ont profile, here are some ways of representing classes, object properties, datatype properties and individuals, as well as some of their axioms or characteristics. For the sake of readability, specific element notations in the following figures are labelled to match the enumeration items listed below.

Notation for classes, class restrictions and class axioms

  • 1) Classes: the graphical representations for classes, class restrictions and class axioms are depicted in figure 1. The constructs included in that figure are:
    • 1.a) Named classes are represented by labelled boxes.
    • 1.b) Class restrictions or anonymous classes are represented by empty boxes.
    • 1.c) Universal restrictions are represented by means of the «owl:allValuesFrom» stereotype together with the property on which the restriction is applied.
    • 1.d) Existential restrictions are represented by means of the «owl:someValuesFrom» stereotype together with the property on which the restriction is applied.
    • 1.e) Intersection class descriptions could be represented by means of:
      • 1.e.i) Empty circle together with the «owl:intersectionOf» stereotype.
      • 1.e.ii) Icon including the symbol “⊓”.
    • 1.f) Union class descriptions could be represented by means of:
      • 1.f.i) Empty circle together with the «owl:unionOf» stereotype.
      • 1.f.ii) Icon including the symbol “⊔”.
    • 1.g) Subclass of axioms could be represented by means of:
      • 1.g.i) Generalization arrow.
      • 1.g.ii) UML dependency arrow with the «rdfs:subClassOf» stereotype.
    • 1.h) Equivalent class axioms could be represented by means of:
      • 1.h.i) Double-sided UML dependency with the «owl:equivalentClass» stereotype.
      • 1.h.ii) Circle including the symbol “≣”.
    • 1.i) Disjoint class axioms could be represented by means of:
      • 1.i.i) Double-sided UML dependency with the «owl:disjointWith» stereotype.
      • 1.i.ii) Circle including the symbol “⊥”.
notationClass

Figure 1. Notation for classes, class restrictions and class axioms

Notation for properties, relations between properties and property characteristics

  • 2) Properties: the graphical representations for properties, relations between properties and property characteristics are depicted in figure 2. The constructs included in that figure are:
    • 2.a) Object properties (relationships): object properties are represented by:
      • 2.a.i) Labelled arrows: in this case the domain and range of the property are indicated by the origin and target of the arrow respectively. The name of the object property is represented by a label close to it.
      • 2.a.ii) Labelled diamonds: in this case the domain and range of the property are indicated by dotted arrows labelled with the «rdfs:domain» and «rdfs:range» stereotypes respectively. The name of the object property is represented by a label within the diamond.
    • 2.b) Datatype properties (attributes): datatype properties are represented by:
      • 2.b.i) Labelled boxes: datatype properties can be represented as labelled boxes attached to the boxes representing classes. The range might be included after the datatype property label, following the character “:”.
      • 2.b.ii) Labelled diamonds: in this case the domain and range of the property are indicated by dotted arrows labelled with the «rdfs:domain» and «rdfs:range» stereotypes respectively. The name of the datatype property is represented by a label within the diamond.
    • 2.c) Equivalent object properties could be represented by means of:
      • 2.c.i) Double-sided UML dependency with the «owl:equivalentProperty» stereotype linking the arrows that represent the involved object properties.
      • 2.c.ii) Double-sided UML dependency with the «owl:equivalentProperty» stereotype linking the diamonds that represent the involved object properties.
    • 2.d) Inverse object properties could be represented by means of:
      • 2.d.i) Double-sided UML dependency with the «owl:inverseOf» stereotype linking the arrows that represent the involved object properties.
      • 2.d.ii) Double-sided UML dependency with the «owl:inverseOf» stereotype linking the diamonds that represent the involved object properties.
    • 2.e) Equivalent datatype properties are represented by a double-sided dependency with the «owl:equivalentProperty» stereotype linking the diamonds that represent the datatype properties.
    • 2.f) Transitive properties are represented by a labelled diamond, which represents the property itself, including the «owl:TransitiveProperty» stereotype.
    • 2.g) Symmetric properties are represented by a labelled diamond, which represents the property itself, including the «owl:SymmetricProperty» stereotype.
notationProp

Figure 2. Notation for properties, relations between properties and property characteristics

Notation for individuals and class membership

  • 3) Individuals: the graphical representations for individuals and class assertions are depicted in figure 3.
    • 3.a) Individuals are represented by labelled boxes with underlined names.
    • 3.b) Class membership:
      • 3.b.i) Labelled box with the individual name followed by the character “:” and the class name, all underlined.
      • 3.b.ii) UML dependency arrow with the stereotype «rdf:type».
notationInd

Figure 3. Notation for individuals and class membership

Figure 4 shows an example diagram for the VICINITY core ontology. This example includes some of the elements shown above and some variations detailed below.

CoreOntology

Figure 4. Example diagram for the VICINITY core ontology

Problems, modifications and other ideas…

Even though the UML_Ont profile is quite complete, for OWL 1, some doubts about how to represent specific situations might arise. In other cases some simplifications can be done to have a cleaner diagram. Let’s see some examples:

  • Variation for properties without domain and range defined when using the arrow notation (that is, 2.a.i and 2.b.i for object properties and datatype properties respectively). Using an arrow with the name of the object property (or a box in the case of datatype properties) usually means that the classes or datatypes attached to the property are stated as its domain or range. However, sometimes that axiom is not explicit in the ontology and we want to represent that a given property is expected to be used between two particular classes in our model or in our dataset. In this case, I replace the arrow (or the box, for datatype properties) representing the property with a dashed one. A dashed arrow with the name of an object property, or a dashed box for a datatype property, means that the property can be stated for individuals of the attached class. See figure 4 for examples of this variation. This variation has been handy for me so far; however, I haven’t come up with a variation for the cases in which only the domain or only the range is stated for an object property.
  • Simplify «rdf:type» statements: the stereotype can be removed, keeping the dependency arrow without a label to represent «rdf:type» statements between individuals and classes, in order to have a cleaner diagram. See figure 5 for examples of this variation.
    • Sometimes it is also advisable to remove the arrow between the box representing the instance and the class. In this case, when there are too many individuals in a diagram, I place a box representing the class to which the instance belongs on top of the box of the instance. However, I only do this when the instance belongs to a single class, to avoid misunderstandings with a stack of boxes. See figure 5 for examples of this variation.
CERTH_thermometer_02

Figure 5. Example of instances of VICINITY core and WoT ontologies

  • Sometimes it helps to organize and group ontology elements within the same domain or topic. See figure 6 for an example of an ontology divided by areas. This ontology was built within the Smart Developer Hub project to model the Issue Trackers domain.
ITv10

Figure 6. Issue Tracker ontology.

  • Ontologies and prefixes. I find it helpful to use ontology prefixes in each ontology element included in the diagram, so that I can see at a glance in which ontology (the one being documented or a reused one) each element is defined. I sometimes also use different colours for classes defined in different ontologies; however, the colour does not replace the use of a prefix. I also include the ontologies being referenced or imported and their prefixes in the legend. See figure 4 for an example.
  • Finally, these days we came up with a “map” of an ontology network in which only the main classes in each ontology are represented, linked by lines meaning that they are related somehow. Main hierarchical relations are also included. These relationships are detailed in each ontology diagram. See figure 7 for the example of the “map” of the VICINITY ontologies. This map was first meant only for internal management, but it turned out to be so useful for getting a quick overview of the network that it is now part of the network documentation.
OntologyNetwork

Figure 7. VICINITY ontology network map

Conclusions

In summary, what I find useful for others who need to understand our models, and what I try to follow, is:

  1. Be consistent with the selected notation (of course within a document but also across projects when possible)
  2. Include a legend, including all symbols used
  3. Do not use the same symbol (without distinctions) for more than one meaning
  4. (Try to) keep it simple

I hope you find something useful in this post. Any suggestions, comments or contributions are more than welcome.