Archive for the 'UMBEL' Category

Volkswagen’s Use of structWSF in their Semantic Web Platform

TribalDDB London, Volkswagen UK’s partner, mentioned earlier this week that Volkswagen are using some parts of the Open Semantic Framework to develop the next generation of their online platform.

This story was published by Jennifer Zaino in her article: Volkswagen: Das Auto Company is Das Semantic Web Company!

I can now talk about this project that uses some pieces of the framework that we have been developing for more than 3 years now.

The Objective

Volkswagen’s main objective for the next version of their Web platform started with improving their online search engine, but as William Greenly mentioned, it quickly became a strategic decision:

"So the objectives were about site search and improving it, but in the long-run it was always the idea to contextualize content, to facet content, to promote it in different contexts."

The objective is to create a platform that gives them the flexibility to leverage all the data assets they own. This flexibility will help them not only to improve their search engine, but also to contextualize that data in different parts of their websites and their partners’ websites, and to promote and publish the same information on different communication channels or devices.

The Flexibility


What is a flexible platform in that context? A flexible platform is one that can integrate any kind of information source. In the context of Volkswagen, such information sources can be a series of relational dataset schemas spread around the World, Excel spreadsheets, CSV files, old plain text technical documents about past car models, semi-structured documents such as webpages, etc.

A flexible platform is also one that minimally impacts (if at all) the data consumers when the data structure changes in the system. This is really important because the World we live in constantly changes, and we have to reflect these changes in the data we own and maintain. Since data structure changes will happen all the time, we want to minimize their impact.

Having the flexibility to constantly adapt your data, while minimally impacting the data consumers of the system, enables you to make quick decisions to adapt your strategy in a highly competitive World. This flexibility gives you a clear business advantage.

A flexible platform is also one that lets you publish your data the way you want, in the format that is needed. Such a platform has to provide an interface that gives you access to all of its functionalities without having to care about what happens under the hood.

A flexible system is one that can communicate your information over any kind of communication channel, and to any device that has access to the Web.

Under the Hood

The next generation platform that Volkswagen is currently developing is partly based on a few of the main pieces of the Open Semantic Framework. These pieces help them reach their goal by giving their platform the flexibility it needs.

The first step they went through was to create their Volkswagen Vehicles Ontology, which is used to describe all the entities they want to index into their platform. The Web Ontology Language (OWL), along with the Resource Description Framework (RDF), is what gives them complete flexibility in how they integrate all the pieces of information they want, in a canonical format.

Then they chose to use structWSF (the structured data web services framework). This piece gives them a series of web interfaces (web service endpoints) to create, update, manage and query their data. This web service layer enables them to do anything they want with their data, from anywhere on the Web. This is possible because all the functionalities of the framework are exposed as web service endpoints. StructWSF also gives them the possibility to communicate their data in multiple formats. This makes it the perfect flexible system to feed their information into different contexts, communication channels or devices.

At Volkswagen, structWSF is used to populate, and keep in sync, their Solr and Triple Store instances. It frees them to focus on the more important aspects of their platform rather than on how the data should be synced between the various specialized data management systems.

By using structWSF to manage their data, they are able to reach some objectives to make their platform as flexible as possible:

  • To be able to minimize the impact of data changes to the data consumers
    • Because structWSF uses OWL & RDF to describe all the data it indexes
  • To be able to manipulate their data from anywhere
    • Because all the functionalities of structWSF are exposed as web service endpoints
  • To be able to communicate the information in different contexts, communication channels and devices
    • Because structWSF is designed, at its core, to transform all the data it indexes into any other kind of format

The Next Step

One of their longer-term goals is to analyze their unstructured and semi-structured textual documents to extract some structure out of them, and to index them into their semantic platform. To do this, they are looking at using Scones, which is the structWSF semantic tagger web service endpoint. Scones will use subject reference structures such as UMBEL to semantically tag the textual documents. Once a document has been processed by Scones, and indexed in structWSF, it can be re-published in different contexts based on the reference concepts that have been tagged to it. This gives them the flexibility to leverage non-structured sources of data and to re-purpose them by publishing them in different contexts and in different systems.

This second system will enable them to leverage the investment they made in the past in writing all these textual documents, and to re-purpose and re-contextualize them in all kinds of different contexts.

Conclusion

I think that TribalDDB and Volkswagen made the right decision for their future. Taking the business decision to develop and maintain a completely new kind of information system is not an easy decision to take. I am not just saying that they made the right choice by using our pieces of the stack. The decision goes far beyond this. Such a Semantic Platform challenges everything in an organization: the people that take the decisions, the people that create and manage the data, the people that develop the system, the people that maintain that system, the consumers of the system, the customers, the partners, etc. This is a big decision, whatever the technology stack you plan to use. I congratulate them for the decision they took.

I strongly believe that this was the right decision to take considering the future opportunities they are creating for themselves.

 

 

UMBEL Blooms with New Colors

We are happy to announce the new, intermediary, UMBEL version 0.80. This is a major upgrade of the UMBEL ontology: both its vocabulary and its reference structure have been greatly enhanced, an upper structure called the SuperTypes has been added and everything got updated to OWL 2. You can read more about the overall changes on Mike’s blog post.

In this blog post I will focus on two topics: using some existing tools and frameworks to view and manage the reference concepts structure, and how one can use and leverage the coherency of the reference structure.

Navigating and Updating the Reference Structure

One thing that was lacking with the previous version of UMBEL was to have access to a user interface tool that would let you navigate and update the reference structure as you want. Because of the way the conceptual structure was created, it was hard for tools such as Protégé to load it because of all the individuals that were created (such as the SemSet individuals, etc.).

As stated in Mike’s blog post, we made significant changes to the UMBEL vocabulary, and how we instantiate the reference structure. Along with the OWL 2 upgrade, we made sure that the Protégé version 4.1 and the latest version of the OWLAPI could easily load both the UMBEL vocabulary and the reference structure.

Reasoning

One of the major additions to UMBEL v080 is the SuperTypes upper structure, an organizational layer above the UMBEL reference structure. We created these SuperTypes because we found that we could effectively cluster most UMBEL reference concepts into a small set of mostly distinct upper concepts (33 in fact, 29 of which are designed as disjoint).

This new SuperTypes structure helps us mine external sources of information by leveraging related concepts in the reference structure. Moreover, SuperTypes also help us perform easier, simpler, better and faster reasoning over the entire 21 K reference concepts structure.

Thus, SuperTypes provide a new tool to help determine if the UMBEL reference structure is consistent and coherent within itself. This is important, of course, to ensure that linkages between UMBEL and external ontologies are consistent and coherent as well.

So far, the entire reference concepts structure has been tested for its coherency according to the restrictions we defined at the level of the SuperTypes upper structure. Using different reasoners such as Pellet, FaCT++ and HermiT (available by default with Protégé 4.1), we made sure that all the statements made between all the RefConcept classes and individuals, and all the statements made between these and the SuperTypes upper structure, are consistent within themselves. This method enabled us to find and fix some early assignment issues.

This new upper structure, along with its now consistent reference structure, helps provide confidence that statements based on UMBEL reference concepts are also consistent. And, all of this is made more testable by virtue of being able to use the OWL API and Protégé with its embedded reasoners.


How is Coherency Tested?

This is the core question. In fact, the more informative answer to this question will be part of a forthcoming blog post. But let’s start here.

The current way to check if the structure is coherent is by making sure that we don’t have an individual that belongs to two different SuperTypes that are stated to be disjoint. What we did with the SuperType upper structure is really simple: we categorized each and every RefConcept (using rdfs:subClassOf) under a SuperType. Most of the SuperTypes are disjoint: this means that if an individual is of rdf:type for two SuperTypes that are stated to be disjoint, then you will end up with an incoherent structure because you are making a statement that is not permitted by the reference structure.

So, the way to check if your statements are coherent according to this structure, is to make your statements (right now, in terms of individual instantiation), and then to check using a reasoner such as Pellet. There is now a general testing structure to see if any ontology is coherent with respect to the UMBEL reference structure.
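The disjointness check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Pellet or any other OWL reasoner works internally: the class names, the tiny subclass map (one parent per class) and the disjoint pair are all made up for the example, and a real test would load the UMBEL ontology and run a reasoner over it.

```python
from itertools import combinations

# Hypothetical data, for illustration only (not taken from UMBEL).
# Each class points to its parent, standing in for rdfs:subClassOf.
SUBCLASS_OF = {
    "Poodle": "Dog",
    "Dog": "Animals",
    "Oak": "Plants",
}

# Pairs of SuperTypes that are stated to be disjoint.
DISJOINT = {frozenset({"Animals", "Plants"})}


def ancestors(cls):
    """Return cls plus every class reachable through the subclass map."""
    seen = set()
    while cls is not None and cls not in seen:
        seen.add(cls)
        cls = SUBCLASS_OF.get(cls)
    return seen


def incoherent_pairs(asserted_types):
    """Return the disjoint pairs that an individual's asserted types fall under."""
    inferred = set()
    for cls in asserted_types:
        inferred |= ancestors(cls)
    return [
        (a, b)
        for a, b in combinations(sorted(inferred), 2)
        if frozenset({a, b}) in DISJOINT
    ]


print(incoherent_pairs(["Poodle"]))         # coherent: no disjoint pair
print(incoherent_pairs(["Poodle", "Oak"]))  # falls under two disjoint SuperTypes
```

An individual typed only as a Poodle produces no conflict, while one typed as both a Poodle and an Oak ends up under two SuperTypes declared disjoint, which is the kind of incoherence a reasoner would report.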

In the next blog post in this series, I will tell you how to use exactly the same method for coherency testing, but now for testing if linkages between external ontologies and the UMBEL reference structure are consistent. In that case, you will make the class-to-class assertions you want, and then you will instantiate individuals of these classes, then run the reasoner. Then, the reasoner will tell you if your ontology is still consistent according to the structure and the new statements you created.

Next Step

In parallel with these tutorials, we are also working hard on the next version of UMBEL. As outlined in the Next Changes section of the new UMBEL website, the next step is to release UMBEL v1.0, with a set of new features, before Christmas.

A New Home for UMBEL Web Services

Eight months ago we announced the dissolution of Zitgist LLC. This event led to the creation of a sandbox to keep alive all the online assets of the company. Since this sandbox server was not owned by Structured Dynamics, it was becoming hard for us to update UMBEL and its online services. This is why we took the time to move the services back on to our new servers.

A New Home

Structured Dynamics LLC now hosts a new version of the UMBEL Web services. From the main menu at the SD Web site you can access these services under the “umbel ws” menu option (you can also bookmark the Web services site at umbel.structureddynamics.com or ws.umbel.org).

This move of UMBEL’s Web services to a new home will make the future upgrade of UMBEL easier, and this will make the maintenance of the Web services endpoints easier as well. With this move, I am pleased to announce the release of five initial Web services and one visualization tool:

Lookup Web Services:

Inference Engine Web Services:

SPARQL endpoint Web Service:

Visual Tool:

Note that the visual tool is using Moritz Stefaner’s Relation Browser.


Ping the Semantic Web

Additionally, the Ping the Semantic Web RDF pinging service is now the property of OpenLink Software Inc. OpenLink is now hosting, maintaining and developing the service.

New release of UMBEL: v072

I am pleased to announce that we resumed our work with UMBEL. We just released version v0.72, which is based on the OpenCyc version 2009-01-31. This new version is intermediary and has been created mostly to check the evolution of OpenCyc vis-à-vis UMBEL. Within the next month or so, we will release a new version (v.080), which will introduce a major new concept that should help systems and users manipulate the entire UMBEL Subject Concepts structure.

For those who want to know what changed between versions v071 and v072, here is a CSV file that lists all the changes between the versions. There are four columns: (1) source node, (2) attribute, (3) target node and (4) version number. This file lists all triples that are present in one version, but not in the other. So, you have all changes (nodes & arcs) between the two versions. Almost all the changes come from internal changes to OpenCyc. We did fix a couple of things such as removing cycles in the graph, etc. But 99% of the changes come from changes within OpenCyc.
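For readers who want to produce a similar file for their own data, here is a minimal sketch of the idea in Python. It assumes each version is available as a set of (source, attribute, target) triples, and that the fourth column records the version in which the triple is present; that last point is my assumption about the layout, and the sample triples are invented.

```python
import csv
import io


def diff_rows(old, new, old_label="v071", new_label="v072"):
    """Rows for triples that are present in one version but not in the other."""
    rows = [(s, p, o, old_label) for (s, p, o) in sorted(old - new)]
    rows += [(s, p, o, new_label) for (s, p, o) in sorted(new - old)]
    return rows


# Invented sample triples, for illustration only.
v071 = {("A", "subClassOf", "B"), ("B", "subClassOf", "C")}
v072 = {("A", "subClassOf", "B"), ("B", "subClassOf", "D")}

# Write the four-column rows as CSV text.
buffer = io.StringIO()
csv.writer(buffer).writerows(diff_rows(v071, v072))
print(buffer.getvalue())
```

Triples shared by both versions never appear in the output, so the file only carries the nodes and arcs that actually changed.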

Finally, note that the web services endpoints will be updated with this new version of UMBEL subject concepts in the coming week, along with the dereferencing of their URIs. Stay tuned!

UMBEL Web Services Endpoints Released

After some delay, we are pleased to finally release the UMBEL Web services endpoints to the public. We have re-organized the Web services we introduced three months ago to add coherency and flexibility to the model.

The goal remains the same, but with a different flavor: these tools let ontologists and Web developers search, discover and use the UMBEL subject concept and named entity structures. The added flavor is that these Web services now fully embrace the HTTP 1.1 protocol and are provided via a series of well established data and serialization formats.

We now have RESTful Web services to add to our RESTful linked data. Pretty cool combination!

We are introducing two kinds of Web services: (1) atomic Web services and (2) compound Web services. An atomic Web service only performs one action: It takes some inputs and then outputs a resultset of the action. A compound Web service takes multiple atomic Web services, plugs them together in a pipeline model, and then takes some inputs and outputs a resultset arising from the compound action.

The communication between each of these Web service instances and the external World is the same: communication is governed by the HTTP 1.1 protocol. HTTP is generally RESTful and used to establish the communication, to determine mime type and serialization, to get inputs, to return status of the communication and possible errors, and to send back the resultset of the computation of the Web service.

That way, we can easily, within hours, programmatically pipeline these atomic Web services together to create new Web services. We can integrate external Web services endpoints into the same pipeline without modifying anything in the architecture. Status, errors and resultsets are propagated along the line, directly to the data consumer. This is the flexibility part of the story.

Now, how cool is that?
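The pipeline model described above can be sketched with plain Python functions. This is a toy illustration under my own assumptions, not the actual implementation: the real services talk to each other over HTTP 1.1, whereas here `Result`, `compound`, `lookup` and `describe` are hypothetical names, and the status codes only mimic HTTP ones.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Result:
    status: int                  # HTTP-like status code
    payload: Any = None          # resultset on success
    error: Optional[str] = None  # error message on failure


Service = Callable[[Any], Result]


def compound(*services: Service) -> Service:
    """Plug atomic services together: each one consumes the previous resultset."""
    def run(data: Any) -> Result:
        result = Result(200, data)
        for service in services:
            result = service(result.payload)
            if result.status != 200:
                return result  # propagate status and error to the consumer
        return result
    return run


# Two hypothetical atomic services: each performs a single action.
def lookup(label: str) -> Result:
    known = {"plato": "ne:Plato"}
    uri = known.get(label.lower())
    if uri is None:
        return Result(404, error=f"no entity for {label!r}")
    return Result(200, uri)


def describe(uri: str) -> Result:
    return Result(200, {"uri": uri, "types": ["sc:Person"]})


# A compound service built from the two atomic ones.
report = compound(lookup, describe)
print(report("Plato"))
print(report("Nobody"))
```

Because every service shares the same input and output contract, adding a third step to the pipeline is a matter of passing one more function to `compound`.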

Overview of the UMBEL Web Services Endpoints

We are today releasing a couple of these atomic and compound Web service endpoints to the public, but others will follow in the coming weeks and months. Four families of Web services have been released that total seven Web service endpoints:

If you don’t know what UMBEL is, I would suggest you read a background information page that talks about the project.

The most important reading related to this blog post is the API philosophy documentation page that talks about the details of the design of this Web services architecture.

For Web developers that want to integrate these Web services endpoints within their application, an API documentation page explains how to communicate with these endpoints for each of the services.

Example of an Atomic Web Service

The Inference: Lister Web service is a good example of an atomic Web service. It takes a subject concept URI as the input and outputs a series of super-class-of, sub-class-of or equivalent-class-of classes for that concept. As an atomic service it does one thing and one thing only: inferring relationships of a given subject concept URI.

Example of a Compound Web Service

The Reporter: Named Entity Web service is a good example of a compound Web service. This Web service displays full information about a UMBEL named entity URI. However, not all the information returned by this Web service is directly computed by it. In fact, the information about broader and equivalent classes and subject concepts comes from the Inference: Lister Web service. Results coming from that Web service are immediately integrated in the Reporter’s resultset. This is easily done considering that they share the same communication language (HTTP 1.1) and the same data and serialization formats (XML, RDF+XML and RDF+N3). This flexibility is priceless to quickly create resourceful compound Web services.

Conclusion

After some months to get the design right, we have finally released some of the UMBEL Web services to the public. These Web services can easily be integrated in current software architectures to leverage UMBEL’s vision of the World. The architecture underlying what we have released today will help to easily integrate UMBEL’s principles and concepts within new and existing projects. This will ultimately help people to quickly react to the changing World of needs and expectations of data users and consumers.

I hope you will enjoy using these new Web services, which Zitgist is freely hosting. The data you get from the Web service is open data and can be used freely with attribution.

Please do report any issues you may encounter. We also welcome any advice or suggestions that you would care to provide to enhance the overall system.

Exploding DBpedia’s Domain using UMBEL

A couple of challenges I have found with DBpedia are that it is hard for a system to interact with the dataset and it is hard to figure out how to interpret information instantiated in it. It is hard to know what properties are used to describe individuals, and hard to know what the classes refer to. It is also hard for standalone and agent software to understand the nature of the individuals that are instantiated by DBpedia because the classes they belong to are generally unknown or poorly defined.

In this blog post I suggest using a method known as “exploding the domain” to try to overcome these difficulties of using and understanding DBpedia. This adds still further usefulness to DBpedia’s considerable value. This demonstration is based on the UMBEL subject concept structure.

As I will demonstrate below, this method consists of contextualizing classes in a coherent framework to explode their domains. By exploding the domain of a class, we link it to other classes that are defined by external ontologies. By exploding the domain of a class by linking it to externally defined classes, we also help standalone and agent software to understand the meaning for that class (at least if they understand the meaning of the classes that have been linked to it). Note that we are able to explode the domains by linking classes using only three properties: rdfs:subClassOf, owl:equivalentClass and umbel:isAligned.

First of all, let me give some background information about how DBpedia individuals and UMBEL named entities have been created, and how both datasets have been linked together.

How DBpedia individuals are instantiated

DBpedia is a dataset that is based on the well known Wikipedia encyclopedia. Basically DBpedia creates one individual for each Wikipedia page. Most of the individuals that are instantiated in this way are what we call a “named entity” in UMBEL’s parlance.

But to be instantiated, an individual has to belong to a class. DBpedia chose to use Yago’s classification system (which is based on WordNet) to instantiate those DBpedia individuals. This means that all DBpedia individuals are, at least in theory, instances of at least one Yago class (and in some rarer cases, they are also instances of classes defined in external ontologies).

How UMBEL named entities have been created

For its part, UMBEL’s named entities dictionaries come from different data sources. Currently, most public UMBEL named entities also come from Yago (example: Aristotle), but many also come from the DBTune dataset (example: Pete Baron) or others. (UMBEL’s design allows more named entities to be plugged into the system as additional dictionaries at will.)

However, unlike DBpedia, we do not use Yago’s classification system to instantiate these named entities. And unlike Yago, we do not use the WordNet classes to instantiate the named entities either.

The current UMBEL subject concept structure is based on OpenCyc. This means that the relations between the classes that instantiate the UMBEL named entities come from the Cyc knowledge base.

So while we use Yago’s named entities (from Wikipedia) as a starting basis, we instantiate them using the UMBEL subject concept classes instead of the WordNet classes. So, basically, we have switched the WordNet conceptual framework for the UMBEL (or OpenCyc) one.

But, how did we create these UMBEL named entities, instantiated using UMBEL subject concept classes and based on Yago? Here is the linkage path:

Yago classes --> WordNet synsets <-- Cyc collections <-- OpenCyc classes <-- UMBEL subject concept classes

Et voilà !

How UMBEL named entities are linked to DBpedia individuals

OK, so now how do we link UMBEL named entities to DBpedia individuals? It is simple. Remember that DBpedia individuals have been created from Wikipedia pages. Also remember that Yago individuals come from the same Wikipedia pages. We can then make the link between the individuals from DBpedia and the individuals from Yago based on Wikipedia URLs.

Exactly the same logic applies for linking DBpedia individuals to UMBEL named entities.
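To make the linkage concrete, here is a small Python sketch of joining two datasets on the Wikipedia URL they were derived from. It is only an illustration under my own assumptions: the `ne:` prefix and the two mappings are invented for the example, and the real linkage works over the full datasets.

```python
def link_by_wikipedia_url(dbpedia, named_entities):
    """Pair up URIs from two datasets that were derived from the same Wikipedia page."""
    by_page = {page: uri for uri, page in named_entities.items()}
    return sorted(
        (db_uri, by_page[page])
        for db_uri, page in dbpedia.items()
        if page in by_page
    )


# Invented mappings from an individual's URI to its source Wikipedia URL.
dbpedia = {
    "dbpedia:Plato": "http://en.wikipedia.org/wiki/Plato",
    "dbpedia:Aristotle": "http://en.wikipedia.org/wiki/Aristotle",
}
named_entities = {
    "ne:Plato": "http://en.wikipedia.org/wiki/Plato",
}

links = link_by_wikipedia_url(dbpedia, named_entities)
print(links)
```

Each resulting pair is a candidate for an identity statement between the two individuals; entries whose Wikipedia page appears in only one dataset are simply left unlinked.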

The end result of this linkage is that we have UMBEL named entities that are the same as DBpedia individuals. The difference is that the UMBEL named entities are now instances of UMBEL subject concepts: a totally different conceptual structure.

Remember that these named entities are contextualized in a coherent conceptual framework. And this characteristic means a lot for what is yet to come.

Web services to search and visualize these named entities

We created two new web services on the UMBEL web services home page (the user interface to these web services; the endpoints will be released later) to help people interact with these named entities:

  1. The “Search Named Entities Dictionaries” web service
  2. The “Named Entity Detailed Report” web service

The first web service lets you search amongst all publicly available UMBEL named entities dictionaries.

The second web service lets you visualize detailed information about any named entity.

This information page shows you the full scope of information about a named entity: which classes it belongs to (subject concept classes as well as external classes); which other individuals, from other datasets, are identical to it; examples of web services that get queried with information about this named entity; etc.

Exploding the domain of Plato

Now that this background information has been established, let’s take a look at what is happening when we link DBpedia individuals to UMBEL named entities: how that actually works to explode the domain.

Let’s take the example of dbpedia:Plato. This individual is currently defined in DBpedia as:

  • yago:AncientGreekPhysicists
  • yago:PhilosophersOfLanguage
  • yago:PhilosophersOfLaw
  • yago:PoliticalPhilosophers
  • yago:AncientGreekVegetarians
  • yago:AcademicPhilosophers
  • yago:Philosopher110423589

Fine, but what does this mean? What if my system doesn’t know any of these classes? We, as humans, know that Plato is a person, a human being. But it is totally another story for a software agent.

What we want to do here is to explode Plato’s domain to try to find a meaning that my software system can understand.

In UMBEL, the “Plato” named entity is defined as an umbel:Person and an umbel:Intellectual. If you take a look at the detailed report for these two subject concepts, you will be able to see in the section “Broader Subject Concepts” the super-classes that Plato belongs to. So we know that Plato is a social being, a homo sapiens, etc. This is basically what happens with Yago too, except that the conceptual structure (the way to describe the entity) differs.

However one thing that is happening is that we exploded Plato’s domain with classes defined in external ontologies. As you can notice in the sections “Broader External Classes” and “Equivalent External Classes”, Plato is also a: foaf:Person, a foaf:Agent and a cyc:Person.

This means that if my software agent doesn’t know what a “yago:Person100007846” means, it may alternatively know what a foaf:Person or a foaf:Agent means. And if it knows what it means, then it will be able to properly manipulate it: to display it in a special way, to refer to it as a person, and generally to do whatever it can with information about a “person”.

Exploding the domain works because these external ontology classes have been referentially linked to a coherent conceptual structure.

The inference path

Let’s take a look at the fundamental reasons why the scenario above works.

First, you, and your system, have to trust the UMBEL named entities dictionaries and the UMBEL subject concept structure to perform the inference that I will explain below. If you and your system trust these linkage assertions, then you will be able to act according to the knowledge that has been inferred.

DBpedia individuals are linked to UMBEL named entities using the owl:sameAs property. This means that DBpedia individual A is identical to (has the same semantic meaning as) the UMBEL named entity B. They both refer to the same individual.

This means that if B is defined as being of rdf:type sc:Person (“sc” stands for Subject Concept), then we can infer that A is defined as being of rdf:type sc:Person too.

If sc:Person is owl:equivalentClass with foaf:Person, we can infer that umbel:B is a foaf:Person, so that dbpedia:A is a foaf:Person too!
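The inference path above can be written down as a short Python sketch. It is a simplified illustration, not a reasoner: it only follows owl:sameAs between individuals and owl:equivalentClass between classes (both treated as symmetric and transitive), and the function names and the three small inputs are my own, reusing the A, B and sc:Person placeholders from the text.

```python
def closure(start, pairs):
    """All nodes connected to start through symmetric, transitive pairs."""
    found = {start}
    changed = True
    while changed:
        changed = False
        for a, b in pairs:
            if a in found and b not in found:
                found.add(b)
                changed = True
            elif b in found and a not in found:
                found.add(a)
                changed = True
    return found


def infer_types(individual, same_as, rdf_type, equivalent_class):
    """Collect the types reachable through owl:sameAs and owl:equivalentClass."""
    inferred = set()
    for same in closure(individual, same_as):
        for cls in rdf_type.get(same, ()):
            inferred |= closure(cls, equivalent_class)
    return inferred


# The statements from the text, as plain Python data.
same_as = [("dbpedia:A", "umbel:B")]
rdf_type = {"umbel:B": ["sc:Person"]}
equivalent_class = [("sc:Person", "foaf:Person")]

print(sorted(infer_types("dbpedia:A", same_as, rdf_type, equivalent_class)))
```

Starting from dbpedia:A, which has no type of its own in this data, the sketch reaches both sc:Person and foaf:Person, which is exactly the chain of reasoning spelled out above.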

We can see similar examples for exploding the domains:

Exploring ConceptualWorks, PeriodicalSeries and NewspaperSeries

In my “UMBEL as a Coherent Framework to Support Ontology Development” blog post from last week, I showed how UMBEL subject concepts acted to create context for linked classes defined in external ontologies. Since DBpedia individuals are instances of classes, and since some of these classes are linked to UMBEL, these subject concept classes also give context to those individuals!

As some examples, go ahead and take a look at the “Named Entities for …” section of these detailed report pages:

The partial list of named entities that are returned by the detailed report viewer shows named entities that mainly come from Wikipedia (and so have links to DBpedia). These subject concepts give a coherent context to those DBpedia individuals.

You should quickly notice, for example, that dbpedia:Kansas_City_Times is not only a sc:NewspaperSeries, a sc:PeriodicalSeries and a sc:ConceptualWork. You also notice that it is a frbr:Work, a bibo:Periodical and a bibo:Newspaper.

The context created by these UMBEL subject concepts gives not only new power to linked external classes, but also to their instances, such as these DBpedia individuals!

Conclusion

Contexts created by UMBEL subject concepts emerge by the power of linkage that exists between all the subject concepts, and the linkage between those subject concepts classes with classes defined in external ontologies. These contexts are consistent because of the coherence of the structure that is powered by OpenCyc (Cyc).

So far, most Linked Data has been about the “things” or named entities of the world, organized according to either Wikipedia categories or WordNet. These structures may have some internal structural consistency, but were never designed to play the role as a coherent reference framework. The coherence of UMBEL (based on the coherence of Cyc) is a powerful contextual lever for bringing order to this chaos.

Once information gets linked to a coherent framework such as UMBEL, things start to happen; powerful things. And, with each new linkage and relation to additional external ontologies, that power increases exponentially.

I wrote this blog post to show again the power of exploding the domain using DBpedia as an example, and how UMBEL can help to use and to leverage such big datasets.

UMBEL as a Coherent Framework to Support Ontology Development

There are multiple ways to represent the World we live in. One person will think about something in one way, while someone next to them will think about the same thing in another way: different characteristics, different ways to interact with it, different ways to use it, different ways to think about its composition, its relations with other things, and so on.

What is nice is that probably all of these different ways to think about this thing are good: after all, there are many ways to think about the same thing. It is this characteristic of thinking about things in different ways that leads to innovation.

But innovation is also not a game where anything goes. Things that work in the real world and in real ways need to adhere to certain rules, concepts, principles and theories. Continued innovation requires working within these coherent frameworks of natural relationships and order.

So, while a beautiful thing is that we can create new frameworks to think about things differently, not all of those frameworks work as well as others or make sense.

While one could conceivably propose any new framework for thinking about things differently, frameworks that are actually useful should, among other things:

  1. Make sure the development of innovations within the framework is coherent
  2. Make sure the development of innovations within the framework is in context
  3. Help coordinate the development of projects and the cooperation of agents that work on these projects in order to achieve (1) and (2).

What seems clear to me is that the lack of any of (1), (2) or (3) makes innovations difficult and/or less powerful and less useful.

Why Would the Development Of Ontologies be Different?

The Semantic Web is often seen as a place where people describe things in multiple ways and where these things are more or less magically related together. For example, if you can't properly describe something, you only have to create a new ontology, or to extend an existing one, and to publish it, et voilà!

The more I work in this field, the less I believe in this.

Remember my first point? People tend to think about things in different ways. The same logic applies to the development of ontologies (particularly in the development of ontologies!). Two ontologies, intended to describe the same things, can describe them in totally different ways. So, while some of the magic is that both ontologies can perfectly describe these things but only in different ways, there are other aspects that are not magical at all.

The problem here is to have at least one framework that helps people to develop ontologies such that the:

  1. Developed ontologies remain coherent
  2. Developed ontologies are in context
  3. Coordination of the development of ontologies and the cooperation of the agents working on these ontology projects is effective to achieve goals (1) and (2).

This construct looks familiar, doesn’t it?

What I am proposing here is to use UMBEL as a coherent framework for ontology development. I am not saying that other frameworks cannot play a guiding role in ontology development. But I am saying two things. First, some form of reference framework is necessary. And, second, truly useful frameworks must also be consistent and coherent.

What I am stressing here is the importance of conceptual frameworks to develop ontologies that can be used by people, companies and systems to properly and efficiently exchange data; and at some level, to reason over this data, too.

I think that the only way to do this in an efficient way is by grounding ontologies in such conceptual frameworks.

The ultimate goal is to make data exchange and data reasoning effective to people, organizations and systems that consume this sea of data. And I believe that it is not possible to achieve without grounding these efforts in a coherent, conceptual framework.

An Example at Work

Nothing is better than an example to show the potential of UMBEL as a coherent framework to develop, and cross-link, ontologies.

Let’s take the Bibliographic Ontology as an example, which we just cross-linked to UMBEL in yesterday's version 071 release. (Among a dozen other key ontologies; the list is getting pretty cool!)

The goal is to link BIBO classes to UMBEL subject concepts. The linkage is done using three properties: owl:equivalentClass, rdfs:subClassOf and umbel:isAligned.

But firstly, what is the goal here? We try to do two things when linking such ontologies to the UMBEL framework:

  1. To make sure the ontology (BIBO) is coherent and consistent with other existing ontologies that are linked to the framework (other such ontologies could be FOAF, SIOC, etc.)
  2. To make sure that the design choices of the developed ontology are consistent with the design choices of the framework, and the other ontologies that are linked to that framework.

Both points try to help achieve a grander vision: trying to make the semantic Web a little bit more coherent and easy to use and understand.

The BIBO Linkage

This figure shows how BIBO classes have been linked to UMBEL subject concepts in a set-like schema:

This schema shows which set belongs to which other set. That way, we can quickly notice that bibo:Patent is equivalent to umbel:Patent. We can also see that both classes belong to (are sub-classes of) bibo:Document, umbel:PropositionalConceptualWork and umbel:ConceptualWork, etc.
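To make the set reading a little more concrete, here is a minimal Python sketch that follows these links mechanically. It encodes only the relations named in the paragraph above as plain tuples. The helper name superclasses and the tuple encoding are assumptions made for this illustration, and which links are stated directly (rather than inferred) is also assumed; none of this is part of UMBEL or BIBO themselves.

```python
# Links named in the text, written as (subject, predicate, object) tuples.
# Which links are direct and which are inferred is an assumption here.
LINKS = [
    ("bibo:Patent", "owl:equivalentClass", "umbel:Patent"),
    ("bibo:Patent", "rdfs:subClassOf", "bibo:Document"),
    ("umbel:Patent", "rdfs:subClassOf", "umbel:PropositionalConceptualWork"),
    ("umbel:PropositionalConceptualWork", "rdfs:subClassOf", "umbel:ConceptualWork"),
]


def superclasses(cls, links):
    """Return every class whose set contains `cls`.

    Sub-class links are followed upward; equivalence links are followed
    in both directions, so an equivalent class is part of the result.
    """
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for s, p, o in links:
            if p == "rdfs:subClassOf" and s == current:
                nxt = o
            elif p == "owl:equivalentClass" and current in (s, o):
                nxt = o if s == current else s
            else:
                continue
            if nxt != cls and nxt not in found:
                found.add(nxt)
                todo.append(nxt)
    return found


for name in sorted(superclasses("bibo:Patent", LINKS)):
    print(name)
```

Starting from umbel:Patent instead of bibo:Patent reaches the same broader classes, which is what the equivalence between the two is meant to convey.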

We have to keep one thing in mind that we made clear in the UMBEL technical documentation: UMBEL has its own view of the World. UMBEL’s subject concept structure is its view of the World. So these linkages are consistent within the UMBEL framework. Now, let’s continue.

The Context

Remember the three points above? What we have done here is to put BIBO in context. The context is created by the UMBEL conceptual framework. Once this is done, we can check for the coherence between BIBO, UMBEL and all the other ontologies that are linked to the framework.

The figure below shows the context created by UMBEL for BIBO, FOAF and SIOC:

Considering the current description of these three ontologies, we know that bibo:Document is equivalent to foaf:Document. But there exists no relationship between these two classes and sioc:Item and sioc:Post.

Intuitively we know that there are some relationships between all these classes (at least based on their labels). We also have to keep in mind that the absence of a description (in RDF) does not mean that the relationship doesn’t exist (this is the open world assumption).

That being said, the figure above shows how UMBEL can help us to find such “non-described” relationships between classes of different ontologies. By contextualizing these three ontologies we now find that all these classes are sub-classes of umbel:ConceptualWork. We also know that some sioc:Post belongs to umbel:PropositionalConceptualWork (things written), just like some bibo:Document and foaf:Document stuff.

This means that this linkage — this contextualization — of external ontologies now gives us a common ground to play with: umbel:ConceptualWork. By querying this subject concept we can come up with a full range of related things: BIBO, SIOC and FOAF stuff.
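As a rough sketch of what "querying this subject concept" could look like, the Python below groups the external classes linked to a UMBEL concept by ontology prefix. The function names and the dictionary layout are assumptions made for this example, and linking each class directly to umbel:ConceptualWork is a simplification of the figure; it is not how UMBEL itself is queried.

```python
# What the three ontologies say on their own (from the text):
# only bibo:Document and foaf:Document are declared equivalent.
ONTOLOGY_LINKS = {("bibo:Document", "foaf:Document")}

# Linkage to UMBEL: external class -> UMBEL subject concept it is a
# sub-class of. Direct links to umbel:ConceptualWork are a simplification.
UMBEL_LINKS = {
    "bibo:Document": "umbel:ConceptualWork",
    "foaf:Document": "umbel:ConceptualWork",
    "sioc:Item": "umbel:ConceptualWork",
    "sioc:Post": "umbel:ConceptualWork",
}


def narrower_external_classes(concept, umbel_links):
    """Group the external classes linked to `concept` by ontology prefix."""
    by_ontology = {}
    for cls, target in umbel_links.items():
        if target == concept:
            prefix = cls.split(":", 1)[0]
            by_ontology.setdefault(prefix, []).append(cls)
    return {prefix: sorted(classes) for prefix, classes in by_ontology.items()}


def directly_related(a, b, ontology_links):
    """True if the ontologies themselves relate `a` and `b`."""
    return (a, b) in ontology_links or (b, a) in ontology_links


common = narrower_external_classes("umbel:ConceptualWork", UMBEL_LINKS)
for prefix in sorted(common):
    print(prefix, common[prefix])

# No direct link exists between these two, yet both sit under the same concept.
print(directly_related("bibo:Document", "sioc:Post", ONTOLOGY_LINKS))
```

The point of the sketch is the contrast between the two lookups: the ontologies alone relate only bibo:Document and foaf:Document, while the UMBEL linkage puts all four classes under one shared concept.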

For example, take a look at the section “Narrower External Classes” of the umbel:ConceptualWork detailed report and extend the list of external classes (click on the All Classes . . . link). All these things are conceptual works. UMBEL makes this fact explicit even if these ontologies describe few or no relations to one another. Also take a look at the list for umbel:PropositionalConceptualWork.

This also shows the coherence of the design of each ontology.

The Coherence

So, once we have the context in place, we are on our way to achieve coherence. UMBEL is 100% based on OpenCyc and Cyc, which are internally consistent and coherent within themselves. We thus use these coherent frameworks to make the mappings to external ontologies coherent, too.

The equation is simple:

“a coherent framework” + “ontologies contextualized by this framework” = “more coherent ontologies”

This context and this coherence help us to develop ontologies in two ways:

  1. It helps us to make sure the design of an ontology is good
  2. It helps us to make sure the designed ontology is coherent with other existing external ontologies

For example, when I linked BIBO classes to UMBEL subject concept classes, I found that a bibo:Series was a sub-class of umbel:ConceptualWorkSeries. Then I found that bibo:Periodical was the same thing as a umbel:PeriodicalSeries. However I had an issue: a bibo:Series was a sub-class of bibo:Collection and bibo:Periodical was also a sub-class of bibo:Collection. Then I found that umbel:PeriodicalSeries was a sub-class of umbel:ConceptualWorkSeries. Then the question arose: why is bibo:Periodical not a sub-class of bibo:Series instead of bibo:Collection? This is what I will propose for the next iteration of BIBO.
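The reasoning in the paragraph above can be sketched as a small check. The Python below flags pairs of BIBO classes whose linked UMBEL concepts are in a sub-class relation that BIBO itself does not declare. The function name review_candidates and the data layout are assumptions for this illustration; the check only looks at direct links, and a flagged pair is a question for a human reviewer, not a proof that the ontology is wrong.

```python
# Sub-class links inside BIBO, as described in the text.
BIBO_SUBCLASS = {
    ("bibo:Series", "bibo:Collection"),
    ("bibo:Periodical", "bibo:Collection"),
}

# BIBO class -> UMBEL subject concept it was linked to.
BIBO_TO_UMBEL = {
    "bibo:Series": "umbel:ConceptualWorkSeries",  # linked as a sub-class
    "bibo:Periodical": "umbel:PeriodicalSeries",  # linked as "the same thing"
}

# Sub-class links inside UMBEL, as described in the text.
UMBEL_SUBCLASS = {
    ("umbel:PeriodicalSeries", "umbel:ConceptualWorkSeries"),
}


def review_candidates(bibo_subclass, bibo_to_umbel, umbel_subclass):
    """Return (child, parent) BIBO pairs that UMBEL suggests but BIBO lacks."""
    candidates = []
    for child, child_concept in bibo_to_umbel.items():
        for parent, parent_concept in bibo_to_umbel.items():
            if child == parent:
                continue
            suggested = (child_concept, parent_concept) in umbel_subclass
            declared = (child, parent) in bibo_subclass
            if suggested and not declared:
                candidates.append((child, parent))
    return sorted(candidates)


for child, parent in review_candidates(BIBO_SUBCLASS, BIBO_TO_UMBEL, UMBEL_SUBCLASS):
    print(f"Should {child} be a sub-class of {parent}?")
```

With the links listed here, the only pair raised for review is bibo:Periodical under bibo:Series, which is the question asked above.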

Now, what about this helping to increase the coherence between external ontologies?

One good example I have is related to SIOC and FOAF. When I linked SIOC to UMBEL, Kingsley asked me why I didn’t link sioc:Item. My answer was simple: I can’t do this, because making this linkage would disturb the coherence of UMBEL. The problem was that sioc:Item was a sub-class of foaf:Document. Considering sioc:Item’s definition, and foaf:Document’s definition and linkage to UMBEL, linking sioc:Item to UMBEL would create some incoherence in the framework because of its relationship with foaf:Document.

From this discussion with Kingsley, this thread appeared on the SIOC mailing list, and the link from sioc:Item to foaf:Document has been removed.

These are the two general cases where UMBEL, as a coherent framework, can help the development of ontologies.

So, by achieving points (1) and (2), we are on the way to achieving point (3): the coordination of the development of ontologies and the cooperation of the agents working on these ontology projects is effective to achieve goals (1) and (2).

The Final Mapped Relations

So, after application of this process and thinking, here are the UMBEL-BIBO mappings:

You can look at Appendix A to the UMBEL technical document (PDF or online); additionally you will see similar mappings for the existing dozen or so ontologies presently mapped to UMBEL. In combination, these give us the ability to Explode the Domain!

Descriptive Subject Concepts: Icing on the Cake

All of the description above relates to the mapping between the BIBO and UMBEL ontologies (and therefore other external ones). But, of course, we also now have the full scope of UMBEL subject concepts that we can also now apply to describe what the actual BIBO citations are about.

So, while we have structural ontology relationships that can be leveraged, we also now have a common vocabulary to describe the subject matter of what these citations are about. Use of these UMBEL subject concepts now allows us to cluster and retrieve citations by subject matter.

In this manner, UMBEL becomes a consistent tagging vocabulary for describing what citations and references are about. Want everything about weaving or galaxies or opera or anything, for example? Simply characterize your citations by appropriate UMBEL subjects and then use them as part of your retrieval filters.
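A minimal sketch of this kind of subject-based retrieval, in Python, might look like the following. Everything in it is illustrative: the citation titles are made up, the Citation record and the about helper are assumptions for this example, and the subject identifiers (umbel:Weaving, umbel:Galaxy, umbel:Opera) are hypothetical names standing in for whatever the actual UMBEL subject concepts are called.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    title: str
    bibo_type: str                               # what kind of thing it is
    subjects: set = field(default_factory=set)   # what it is about


# Made-up citations tagged with hypothetical UMBEL subject identifiers.
CITATIONS = [
    Citation("A made-up article on looms", "bibo:Article", {"umbel:Weaving"}),
    Citation("A made-up book on spiral galaxies", "bibo:Book", {"umbel:Galaxy"}),
    Citation("A made-up article on opera staging", "bibo:Article", {"umbel:Opera"}),
]


def about(citations, subject, bibo_type=None):
    """Return titles tagged with `subject`, optionally limited to one BIBO type."""
    return [
        c.title
        for c in citations
        if subject in c.subjects and (bibo_type is None or c.bibo_type == bibo_type)
    ]


print(about(CITATIONS, "umbel:Galaxy"))
print(about(CITATIONS, "umbel:Opera", bibo_type="bibo:Article"))
```

Keeping the BIBO type and the subject tags in separate fields mirrors the distinction between what a thing is and what it is about.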

This makes clear that UMBEL is some kind of Hydra: it can be used as a conceptual framework to help make ontologies (vocabularies) coherent and consistent, and at the same time, it can act as a conceptual description framework that describes the “matter” of things. This means that a subject concept can describe the “nature” of a thing and the “matter” of another thing at the same time.

Conclusion

UMBEL is becoming a wonderful tool that can be used in many ways. It is a vocabulary that is instantiated in a subject concept structure. It can be used not only to categorize things and to help find things, but also to define things, and to develop ontologies that define other things. We are on our way to achieve these three goals:

  1. Develop ontologies that are in context
  2. Develop ontologies that remain coherent
  3. Coordinate the development of ontologies and the cooperation of the agents working on these ontology projects sufficiently to achieve goals (1) and (2).

As usual, I’d like to thank my UMBEL co-editor and colleague, Mike Bergman, for his discussions and assistance on this material.



