We have just released a new version of UMBEL (v 071). It is based on a new version of OpenCyc that has been updated with the latest knowledge base, version 5014. This is the latest OpenCyc version, released after we met with Cycorp and the Cyc Foundation a couple of weeks ago in Austin. In the meantime, we also fixed some issues and enhanced the UMBEL concept structure.
Here is the list of changes and fixes:
- The UMBEL subject and abstract concept structure is based on OpenCyc kb5014
- The UMBEL namespaces changed
- UMBEL subject concepts now link to OpenCyc classes and individuals
- The UMBEL generation scripts now use the OpenCyc external IDs
- Duplicated lines in the file umbel_cytoscape_vXYZ.csv have been removed
- The linkage of BIBO to UMBEL has been completed
- The linkage of FOAF and SIOC to UMBEL has been revised
- The encoding of the character “%” in the named entities dictionaries N3 files has been fixed
- The UMBEL technical documentation has been updated to reflect these changes
Now let's talk about some of the major changes in this new release.
New UMBEL namespaces
We changed the UMBEL namespace URIs to be more consistent moving forward. Here is the fuller rationale:
“Here are the URIs of the namespaces used to describe the UMBEL Ontology, the subject concepts structure, the named entities defined in UMBEL and the semsets for both the subject concept classes and named entities.
The folder structure of these classes of URIs has been generalized to meet the design goals of using UMBEL with domain extensions. The portion “/umbel/” in the URIs is a placeholder for the name of these extensions. Each extension, including UMBEL itself, will share the same identification structure. An example for a ‘Foo’ domain ontology at an alternative example.com domain using the “/foo/” folder extension is shown in the table below.
The UMBEL Ontology vocabulary URI uses a “hash URI” for convenience purposes. This facilitates the retrieval of the document of the descriptions of the vocabulary for tools that consume such documents. However considering the size of the subject and abstract concepts descriptions files, the named entities and semset files, we choose to use “slash URIs” so that consumer tools do not have to download the description of all subject and abstract concepts, named entities and semsets descriptions when they request the description of one of these resources.”
The new namespaces are defined as:
| Name | Abbreviation | URI |
| --- | --- | --- |
| UMBEL Ontology | umbel: | http://umbel.org/umbel# |
| Subject Concepts | sc: | http://umbel.org/umbel/sc/ |
| Abstract Concepts | ac: | http://umbel.org/umbel/ac/ |
| Named Entities | ne: | http://umbel.org/umbel/ne/ |
| Semsets | semset-xyz | http://umbel.org/umbel/semset/xyz/ |
| Example, English semset | semset-en | http://umbel.org/umbel/semset/en/ |
| FOO Ontology (a domain ontology based on UMBEL) | foo: | http://example.com/foo# |
We now consider these new URIs "frozen", so please update your applications to use them.
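As a rough illustration of how an application might adopt the new namespaces, here is a minimal Python sketch. The prefix table is copied from the namespace list above; the helper names `expand` and `document_url`, and the local names `SubjectConcept` and `Project` used in the examples, are assumptions made for illustration only. The second helper shows the practical difference between the "hash URI" used for the vocabulary and the "slash URIs" used for the concepts: a client strips the fragment before requesting a hash URI, so it fetches the whole vocabulary document, whereas a slash URI is requested as-is.

```python
from urllib.parse import urldefrag

# Prefix-to-URI table copied from the namespace list above.
NAMESPACES = {
    "umbel": "http://umbel.org/umbel#",
    "sc": "http://umbel.org/umbel/sc/",
    "ac": "http://umbel.org/umbel/ac/",
    "ne": "http://umbel.org/umbel/ne/",
    "semset-en": "http://umbel.org/umbel/semset/en/",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'sc:Project' into a full URI.

    Raises KeyError if the prefix is not in NAMESPACES.
    """
    prefix, _, local_name = curie.partition(":")
    return NAMESPACES[prefix] + local_name


def document_url(uri: str) -> str:
    """Return the URL a client would actually request for a given URI.

    The fragment (the part after '#') is never sent to the server, so a
    hash URI resolves to the single document describing the vocabulary.
    """
    return urldefrag(uri).url


# 'SubjectConcept' and 'Project' are hypothetical local names.
print(expand("sc:Project"))                          # http://umbel.org/umbel/sc/Project
print(document_url(expand("umbel:SubjectConcept")))  # http://umbel.org/umbel
print(document_url(expand("sc:Project")))            # http://umbel.org/umbel/sc/Project
```

Keeping the prefix table in one place like this means that only the table has to be edited when namespaces change.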
UMBEL subject concepts that link to classes and individuals
In some edge cases, UMBEL considers an OpenCyc individual to be a subject concept or an abstract concept. This means that not only OpenCyc classes but also OpenCyc individuals can be selected as UMBEL subject concepts. The definitions of UMBEL subject concepts, abstract concepts and named entities guide how the corresponding OpenCyc collection ("class") or individual is treated. If an UMBEL subject concept is related to an OpenCyc collection ("class"), the two resources are linked with the property owl:equivalentClass. If an UMBEL subject concept is related to an OpenCyc individual, the two resources are linked with the property owl:sameAs. See Volume 2 for what we consider to be subject concepts, abstract concepts and named entities.
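The linking rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the actual UMBEL generation script: the function names, the `"collection"` / `"individual"` labels and the `example.org` OpenCyc URI are placeholders chosen for this example. The output is a single N-Triples statement.

```python
OWL = "http://www.w3.org/2002/07/owl#"


def link_property(opencyc_kind: str) -> str:
    """Pick the OWL property used to link an UMBEL subject concept.

    'collection' (an OpenCyc class) -> owl:equivalentClass
    'individual'                    -> owl:sameAs
    """
    if opencyc_kind == "collection":
        return OWL + "equivalentClass"
    if opencyc_kind == "individual":
        return OWL + "sameAs"
    raise ValueError(f"unknown OpenCyc kind: {opencyc_kind!r}")


def link_triple(umbel_uri: str, opencyc_uri: str, opencyc_kind: str) -> str:
    """Serialize the link as one N-Triples line."""
    return f"<{umbel_uri}> <{link_property(opencyc_kind)}> <{opencyc_uri}> ."


# Placeholder URIs; the real OpenCyc URIs are not shown in this post.
print(link_triple(
    "http://umbel.org/umbel/sc/Project",
    "http://example.org/opencyc/Project",
    "collection",
))
```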
Use of OpenCyc classes’ external IDs
The names of UMBEL subject and abstract concepts are used for convenience only. When a new version of UMBEL is created, the "external IDs" of the OpenCyc classes are used to link these classes to UMBEL subject and abstract concepts. That way, if OpenCyc's naming conventions change between version A and version B, we can still update the proper UMBEL concepts according to their new OpenCyc definitions. Note that the OpenCyc external IDs are only used when we create a new version of UMBEL. Otherwise, the URIs of the UMBEL subject and abstract concepts use the "human readable" labels to refer to the concepts.
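To make the benefit of matching on external IDs concrete, here is a small Python sketch. It is an assumption about how such a lookup could work, not a description of the real generation scripts: the function `relink`, the ID strings (`"ext-001"`, `"ext-002"`) and the class names are all invented for the example. The point is that a class renamed between two OpenCyc versions is still found, because the lookup key is the stable external ID rather than the name.

```python
from typing import Dict


def relink(umbel_by_external_id: Dict[str, str],
           opencyc_release: Dict[str, str]) -> Dict[str, str]:
    """Map each UMBEL concept name to its current OpenCyc class name.

    umbel_by_external_id: external ID -> UMBEL concept name
    opencyc_release:      external ID -> class name in the new OpenCyc version
    Concepts whose external ID is missing from the release are skipped.
    """
    return {
        umbel_name: opencyc_release[external_id]
        for external_id, umbel_name in umbel_by_external_id.items()
        if external_id in opencyc_release
    }


# Hypothetical data: 'ext-001' was renamed between OpenCyc version A and B.
umbel_concepts = {"ext-001": "Project", "ext-002": "Organization"}
opencyc_version_b = {"ext-001": "Project-Renamed", "ext-002": "Organization"}

print(relink(umbel_concepts, opencyc_version_b))
# {'Project': 'Project-Renamed', 'Organization': 'Organization'}
```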
Linkage between OpenCyc and UMBEL
Note that OpenCyc has also added links from the OpenCyc classes to the UMBEL subject concept classes. This means that anyone who dereferences an OpenCyc class URI will get a reference to the corresponding UMBEL subject concept class via the property owl:sameAs.
Still to come
While much progress has been made in this new version 071, there are some pending issues and tasks not in the current release:
- Complete Web service and endpoints release (forthcoming in a few days)
- Re-inclusion of company provinces, states and territories
- Automatic instance checks to ensure better coverage of more specific concepts in the ontology.
We are continuing to work out test and automation procedures with Cycorp and will incorporate these improvements as well in subsequent releases.
Conclusion
This new release is one more step in the right direction. UMBEL is becoming more stable, its relation to OpenCyc is getting stronger, and its linkage to external ontologies keeps growing. Please report any issues, comments or suggestions on the mailing list.