---
jupytext:
  formats: md:myst
  text_representation:
    extension: .md
    format_name: myst
    format_version: 0.13
    jupytext_version: 1.16.4
kernelspec:
  display_name: Python 3 (ipykernel)
  language: python
  name: python3
---

# DCMI OpenWEMI

This text: 2024-12-18
* first version: 2024-06-25
* a little update 2024-07-02 at the very end of the text

Wonderful news (2023-11-29): *DCMI goes WEMI!*

* <https://www.dublincore.org/news/2023/11-29_openwemi-community-review/>
* Coyle, Karen. Works, Expressions, Manifestations, Items: An Ontology. Code4lib Journal, Issue 53, 2022-05-09. <https://journal.code4lib.org/articles/16491> {cite}`coyle2022-wemiOntology`


```{figure} ../../images/github.com_dcmi_openwemi_2024-06-21.png
Source: <https://github.com/dcmi/openwemi/tree/main>
```

+++

## Discussion

### Multiple rdfs:range e.g. for openwemi:instantiates?

<!-- Die Semantik eines einzelnen offenen Pfeiles wird in <https://github.com/dcmi/openwemi/> mit *domain* und *range* angegeben, und ind der Turtle-Datei mit `rdfs:domain` und `rdfs:range` konkretisiert. Das ist kein Problem, wenn pro Property nicht mehr als ein Domain oder Range angegeben wird. 

Frage: Ist es weise, für eine Property wie z.B. `openwemi:instantiates` in der informellen Abbildung *mehrere* Pfeile anzugeben? Was ist die konkrete Semantik von OR in der normalsprachlichen Erklärung, und wie wird das in RDF(S) umgesetzt?

Wir schauen im Turtle-Code nach.
-->

The semantics of a single open arrow is described in <https://github.com/dcmi/openwemi/> in natural language in terms of *domain* and *range*, and made concrete in the Turtle file with `rdfs:domain` and `rdfs:range`. This is unproblematic as long as no more than one domain or range is specified per property.

Question: Is it wise to draw *several* arrows for a property such as `openwemi:instantiates` in the informal diagram? What is the concrete semantics of OR in the natural-language explanation, and how is it implemented in RDF(S)?

Let's take a look at the Turtle code.

* Backlink: <https://github.com/dcmi/openwemi/blob/main/docs/ns/openWEMI.ttl>
* File: [openWEMI.ttl](https://raw.githubusercontent.com/dcmi/openwemi/main/docs/ns/openWEMI.ttl)

+++

```
openwemi:instantiates
  a rdf:Property ;
  rdfs:label "instantiates"@en ;
  rdfs:comment "An Endeavor that instantiates a Manifestation, an Expression or a Work."@en ;
  rdfs:isDefinedBy openwemi: ;
  rdfs:subPropertyOf dct:relation ;
  rdfs:domain openwemi:Item ;
  dct:description "A relationship asserted from an Item to a Manifestation, an Expression, or a Work."@en ;
  rdfs:range [
    a owl:Class ;
    owl:unionOf (
      openwemi:Work
      openwemi:Expression
      openwemi:Manifestation
    )
  ] .
```

+++

The `rdfs:comment` for the property `openwemi:instantiates` explains that an `openwemi:Item` openwemi-instantiates "a Manifestation, an Expression or a Work". (A layperson might read XOR here, but as logicians we of course read a non-exclusive OR.) Read together with the intended meaning of `rdfs:range`, the `rdfs:comment` suggests that *one or more classes from* `openwemi:Work`, `openwemi:Expression` or `openwemi:Manifestation` are openwemi-instantiated. More precisely, "one or more classes from ..." here means "one or more elements of the class whose members are the classes ...".

However, the ttl file actually specifies something else: the `rdfs:range` of `openwemi:instantiates` is an anonymous class *consisting of the union of* `openwemi:Work`, `openwemi:Expression` and `openwemi:Manifestation`. More precisely, "consisting of the union of ..." here means "all elements which are contained in at least one of the classes ...".

+++

The consequence: `rdfs:range` can *no longer* be used to derive a specific class from `openwemi:Work`, `openwemi:Expression` and `openwemi:Manifestation`.

Say we have the triple `:myItem_123 openwemi:instantiates :myManifestation_123`:

* Using the above *domain* information, it is in fact possible to infer that `:myItem_123 a openwemi:Item`;
* however, using the above *range* information, it is *not* possible to infer that `:myManifestation_123 a openwemi:Manifestation`; all that follows is that `:myManifestation_123` belongs to the anonymous union class.
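
One way to see why a union range blocks the second inference is to enumerate, in plain Python, every typing of the object that is compatible with the range statement. This is only a toy illustration and not an OWL reasoner; the names `MEMBERS`, `possible_typings` and `entailed_member_classes` are made up for this sketch.

```python
from itertools import combinations

# Member classes of the anonymous union class used as rdfs:range.
MEMBERS = ("Work", "Expression", "Manifestation")


def possible_typings(members):
    """All non-empty subsets of the member classes.

    Each subset is one way the object could be typed while still being
    an element of the union.
    """
    return [
        frozenset(subset)
        for size in range(1, len(members) + 1)
        for subset in combinations(members, size)
    ]


def entailed_member_classes(members):
    """Member classes the object belongs to in *every* possible typing."""
    return frozenset.intersection(*possible_typings(members))


typings = possible_typings(MEMBERS)
entailed = entailed_member_classes(MEMBERS)

print(len(typings))      # number of typings compatible with the range
print(sorted(entailed))  # member classes that hold in all of them
```

Seven typings are compatible with the range statement, and no single member class is shared by all of them. An entailment has to hold in every case, so only membership in the union itself can be concluded.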

+++

## NEW 2024-12-18

The above explanation is not clear enough, so we try again with a minimalistic sandbox example.

See {doc}`rdfs-range-bunny-carrot`.

+++

### Possible solutions

(1) good solution, mentioned in <https://github.com/dcmi/openwemi/issues/43>: Do it like FaBiO ... and discuss the term "disadvantage":

> The obvious disadvantages are: \[...\] people need to think about when creating their metadata ("is this an item of a manifestation or an item of an expression?" (<https://github.com/dcmi/openwemi/issues/43>)

IMHO this is an advantage.

(Philosophical aside: IMHO the WEMI classes are strongly disjoint from an ontological point of view. Thus an item may only be connected to a manifestation, not to an expression, let alone a work. *item instantiates work* fits neither the original FRBR WEMI nor RDA nor CIDOC. But that is not the point of discussion here.)

+++

(2)

> what if the object is a combination work/expression à la Bibframe?

If we move to WEMI, we need to split work/expression combinations into two different nodes anyway. Long discussion: <https://www.jbusse.de/lovs/semantische-dekomposition-konglomerat.html#anwendung-buch-nach-wemi-dcat-nach-wemi>, <https://www.jbusse.de/logd/dcat2frbr> (in DE, but machine translation may help?).

+++

(3)

Another solution: Create OpenWEMI as a radical prune of CIDOC. RDA was merged with CIDOC, see <https://www.cidoc-crm.org/frbroo/fm_releases> > <https://www.cidoc-crm.org/frbroo/sites/default/files/LRMoo_V1.0.pdf>; there see Figure: [4.2. Overview of the Model: Illustration 1, page 6](../../images/LRMoo_V1.0_illustration1.png) in [LRMoo_V1.0.pdf](https://www.cidoc-crm.org/frbroo/sites/default/files/LRMoo_V1.0.pdf)

<!--
```{figure} ../../images/LRMoo_V1.0_illustration1.png
Source: <https://www.cidoc-crm.org/frbroo/sites/default/files/LRMoo_V1.0.pdf>, p.6
```
-->

<!-- Wäre es nicht schön, OpenWEMI mit CIDOC LRMoo wenigstens grundsätzlich zu alignen? Jedenfalls sollte man vermeiden Axiome einzuführen, die  nicht mit CIDOC kompatibel sind. Manche halten CIDOC und LRMoo für "zu fett": Wenn es Anliegen von OpenWEMI ist, eine leichtgewichtige Alternative bereitzustellen, dann könnte man OpenWEMI als eine minimalste Auswahl von CIDOC anlegen? -->

Wouldn't it be nice to align OpenWEMI with CIDOC LRMoo, at least in principle? In any case, one should avoid introducing axioms that are not compatible with CIDOC. Some (including me) consider CIDOC and LRMoo to be "rather fat": if OpenWEMI is intended to provide a lightweight alternative, could it then be set up as a minimal selection of the respective CIDOC classes?

+++

(4)

IMHO the best solution: do not model domain and range at all, at least not with `rdfs:domain` and `rdfs:range`.

This is because these language elements from RDFS (and OWL) have semantics that many people misunderstand. NEW 2024-07-02: for some explanation see below, {ref}`rdfs-entailment-not-validation-or-constraint`.

+++

## Experiment with rdflib and owlrl

We demonstrate domain and range inferencing with a minimalistic example, based on the well-known Python libraries:
* <https://rdflib.readthedocs.io/en/stable/>
* <https://pypi.org/project/owlrl/>

```{code-cell} ipython3
import rdflib
import owlrl
```

Example from above (reduced to domain and range), plus two example instances connected by one `openwemi:instantiates` triple:

```{code-cell} ipython3
wemi_ttl = """
@prefix : <http://example.org/ns#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix openwemi: <https://dcmi.github.io/openwemi/ns#> .

openwemi:instantiates
  a rdf:Property ;
  rdfs:domain openwemi:Item ;
  rdfs:range [
    a owl:Class ;
    owl:unionOf (
      openwemi:Work
      openwemi:Expression
      openwemi:Manifestation
    )
  ] .

:myItem_123
   a <urn:someSubjectClass> .
<urn:someSubjectClass> a owl:Class .

:myManifestation_123 
    a <urn:someObjectClass> .
<urn:someObjectClass> a owl:Class .

:myItem_123
    openwemi:instantiates :myManifestation_123 .
"""
```

Graph `g1` is the graph before inferencing:

```{code-cell} ipython3
g1 = rdflib.Graph().parse(data=wemi_ttl, format="turtle")
print(f"Initially g1 has {len(g1)} triples")
```

```{code-cell} ipython3
print(g1.serialize())
```

We also allocate `g2`. It will be modified by `owlrl.DeductiveClosure()`.

```{code-cell} ipython3
g2 = rdflib.Graph().parse(data=wemi_ttl, format="turtle")
print(f"Initially g2 has {len(g2)} triples")
```

```{code-cell} ipython3
# Compute the OWL 2 RL closure; expand() adds the inferred triples to g2 in place.
owlrl.DeductiveClosure(owlrl.OWLRL_Semantics,
    axiomatic_triples=False).expand(g2)
print(f"After inferencing g2 has {len(g2)} triples")
```

```{code-cell} ipython3
g2_ttl = g2.serialize(format="ttl")
# print(g2_ttl)
```

### What's the domain of our exemplar?

```{code-cell} ipython3
q_domain = """
PREFIX : <http://example.org/ns#> 
PREFIX openwemi: <https://dcmi.github.io/openwemi/ns#> 
PREFIX owl: <http://www.w3.org/2002/07/owl#> 
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> 

SELECT ?S ?SClass
WHERE {
  ?S a ?SClass .
  ?S openwemi:instantiates ?O .
  }
"""
```

Graph `g1` reflects the situation before inferencing:

```{code-cell} ipython3
for row in g1.query(q_domain):
    print(row)
```

Graph `g2` reflects the situation *after* inferencing:

```{code-cell} ipython3
for row in g2.query(q_domain):
    print(row)
```

As we can see, our example item `:myItem_123` is now an instance of `openwemi:Item`.

+++

### What's the range of our exemplar?

```{code-cell} ipython3
q_range = """
PREFIX : <http://example.org/ns#> 
PREFIX openwemi: <https://dcmi.github.io/openwemi/ns#> 
PREFIX owl: <http://www.w3.org/2002/07/owl#> 
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> 
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> 
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#> 

SELECT ?O ?OClass
WHERE {
  ?O a ?OClass .
  ?S openwemi:instantiates ?O .
  }
"""
```

Graph `g1` reflects the situation before inferencing:

```{code-cell} ipython3
for row in g1.query(q_range):
    print(row)
```

Graph `g2` reflects the situation *after* inferencing:

```{code-cell} ipython3
for row in g2.query(q_range):
    print(row)
```

As we can see, our example manifestation `:myManifestation_123` is NOT inferred to be an instance of `openwemi:Manifestation`. Instead it is an instance of a blank node with an internal ID, namely the anonymous union class.

The full set of triples of `g2` after inferencing can be seen here:

```{code-cell} ipython3
print(g2.serialize())
```

(rdfs-entailment-not-validation-or-constraint)=
## Semantics of rdfs: Entailment, not validation or constraint

update 2024-07-02

<https://github.com/dcmi/openwemi/issues/94> states:

> We want to make that a validation point ...

> it should be possible to detect that as an inconsistency ...

(1)
There is a misunderstanding about RDF(S) that is as common as it is profound: `rdfs:domain` and `rdfs:range` definitely do *not* help you to validate an RDF graph or to detect inconsistencies in it.

<https://lists.w3.org/Archives/Public/semantic-web//2006May/0118.html> cites a DCMI text:

> > 6. Using domains and ranges: RDF supports using "domain" and "range" constraints on RDF properties, for limiting the kinds of resources that a property apply to, ... 

> I would think that the above paragraph reveals a deep  misunderstanding about the nature of rdfs:range and rdfs:domain ... correct?

Correct! The semantics of `rdfs:domain` (and `rdfs:range`) are given in <https://www.w3.org/TR/rdf11-mt/#rdfs-entailment>, rules rdfs2 and rdfs3. They state that from a triple `s p o` you can *entail* a new `rdf:type` triple for `s` (via the domain of `p`) or for `o` (via the range of `p`), if it is not already present. Nothing is ever rejected. (For a more detailed explanation cf. <https://lists.w3.org/Archives/Public/semantic-web//2006May/0121.html>.)
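
To make the entailment reading tangible, here is a minimal hand-rolled sketch of rdfs2 and rdfs3 over plain Python tuples. It deliberately avoids any RDF library; the function name `rdfs_domain_range_closure` and the `ex:` identifiers are invented for this illustration, and all other RDFS rules are ignored.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"


def rdfs_domain_range_closure(triples):
    """Return the input triples plus everything entailed by rdfs2 and rdfs3."""
    graph = set(triples)
    while True:
        domains = {(p, c) for p, pred, c in graph if pred == RDFS_DOMAIN}
        ranges = {(p, c) for p, pred, c in graph if pred == RDFS_RANGE}
        inferred = set()
        for s, p, o in graph:
            for prop, cls in domains:
                if p == prop:
                    inferred.add((s, RDF_TYPE, cls))  # rdfs2: type the subject
            for prop, cls in ranges:
                if p == prop:
                    inferred.add((o, RDF_TYPE, cls))  # rdfs3: type the object
        if inferred <= graph:  # fixpoint reached, nothing new
            return graph
        graph |= inferred


# Hypothetical data: the last triple is "obviously wrong" to a human reader.
data = {
    ("ex:author", RDFS_DOMAIN, "ex:Book"),
    ("ex:author", RDFS_RANGE, "ex:Person"),
    ("ex:carrot_1", "ex:author", "ex:bunny_1"),
}
closure = rdfs_domain_range_closure(data)

for triple in sorted(closure - data):
    print(triple)
```

The odd statement about the carrot is not flagged. Instead, the carrot is typed as `ex:Book` and the bunny as `ex:Person`, which is exactly the behaviour that surprises people who expect a constraint check.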

(2)
However, there seems to be a backdoor for using `rdfs:domain` and `rdfs:range` as constraints, see <https://www.w3.org/TR/rdf-schema/#ch_domainrange>:

> RDF Schema provides a mechanism for describing this information, but does not say whether or how an application should use it. For example, while an RDF vocabulary can assert that an author property is used to indicate resources that are instances of the class Person, it does not say whether or how an application should act in processing that range information. Different applications will use this information in different ways. For example, data checking tools might use this to help discover errors in some data set, an interactive editor might suggest appropriate values, and a reasoning application might use it to infer additional information from instance data.

I don't think that a ttl model should use this backdoor. If you want to give a validator information about permitted domains and ranges without invoking RDFS semantics, <https://schema.org/rangeIncludes> and <https://schema.org/domainIncludes> would be more appropriate. Or one could use SHACL, but that is also a rather complex technology.
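
As a rough sketch of what such a validator could look like, the following plain-Python check treats `schema:rangeIncludes` as a hint for validation rather than as an inference rule. It is an assumption-laden toy: OpenWEMI does not declare `schema:rangeIncludes`, the function `range_violations` and the `ex:` data are hypothetical, and the check is closed-world, so an object without any declared type is reported as well.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
RDF_TYPE = "rdf:type"
RANGE_INCLUDES = "schema:rangeIncludes"


def range_violations(triples):
    """Triples whose object has no declared type among the property's
    rangeIncludes classes. Nothing is inferred, only reported."""
    graph = set(triples)

    allowed = {}  # property -> set of permitted object classes
    for prop, pred, cls in graph:
        if pred == RANGE_INCLUDES:
            allowed.setdefault(prop, set()).add(cls)

    types = {}  # node -> set of explicitly declared classes
    for node, pred, cls in graph:
        if pred == RDF_TYPE:
            types.setdefault(node, set()).add(cls)

    return sorted(
        (s, p, o)
        for s, p, o in graph
        if p in allowed and not (types.get(o, set()) & allowed[p])
    )


# Hypothetical vocabulary and instance data.
data = {
    ("openwemi:instantiates", RANGE_INCLUDES, "openwemi:Work"),
    ("openwemi:instantiates", RANGE_INCLUDES, "openwemi:Expression"),
    ("openwemi:instantiates", RANGE_INCLUDES, "openwemi:Manifestation"),
    ("ex:item_1", "openwemi:instantiates", "ex:manifestation_1"),
    ("ex:manifestation_1", RDF_TYPE, "openwemi:Manifestation"),
    ("ex:item_2", "openwemi:instantiates", "ex:person_1"),
    ("ex:person_1", RDF_TYPE, "ex:Person"),
}
violations = range_violations(data)

for triple in violations:
    print("range violation:", triple)
```

Only the triple pointing at `ex:person_1` is reported, and no new type is added to the graph. In contrast to `rdfs:range`, the listed classes are checked individually, so the explicit type `openwemi:Manifestation` is what makes the first triple pass.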