Tag Archive for 'umbel-2'

New UMBEL 1.50 Ships With 20 Linked Ontologies

I am proud to announce the immediate release of UMBEL version 1.50. This is a major effort that took a year to release.

What is UMBEL?

Let’s start by explaining what UMBEL is, for those who have never encountered the project before. UMBEL stands for “Upper Mapping and Binding Exchange Layer”. It is a conceptual structure designed to help content interoperate between systems.

UMBEL is a coherent general structure of 34,000 reference concepts that provides a scaffolding to link and interoperate other datasets and domain vocabularies. The reference concepts are organized into 31 mostly disjoint SuperTypes.

UMBEL is written in OWL 2 and SKOS.

What are UMBEL’s Objectives?

UMBEL’s main goals are:

  • To create a scaffolding for defining knowledge graphs
  • To create rich semantics to identify and help disambiguate entities
  • To help expand queries to semantic search engines
  • To help interlink ontologies to create a coherent ontological environment
  • To help structure and federate information silos

What is new in UMBEL version 1.50?

Many things changed in UMBEL 1.50: the addition of new concepts, multiple structural fixes and improvements, etc. However, three major changes occurred in this release:

  1. Complete update and addition of linkages between UMBEL reference concepts and related classes existing in external ontologies
  2. Removal of all the named individuals from UMBEL. UMBEL is now composed only of reference concept classes
  3. Reshaping of the SuperType upper structure by adding new SuperTypes and removing some existing ones

For the complete list of UMBEL changes, I strongly suggest you read Mike’s blog post about this UMBEL release.

UMBEL Mapping to External Ontologies

One interesting use of UMBEL’s coherent structure is to federate information silos. We can do that by linking the ontologies and vocabularies used to describe the entities indexed in these silos directly into UMBEL.

But what does that mean? Let’s take a look at a portion of the UMBEL structure related to actors, authors and their relations to humans:

actors-authors-humans

Now let’s assume that we have two data sources:

  1. DBpedia from which we want to use its Journalist entities, and
  2. Musicbrainz from which we want to use its solo musical artist entities

The journalist entities of the DBpedia data source belong to the dbpedia:Journalist class of the DBpedia ontology. The Musicbrainz solo musical artists belong to the mo:SoloMusicArtist class of the Music Ontology. If you check each of these ontologies, you won’t find any connection between these two classes. They appear to be living in two different conceptual worlds.

However, what happens if these two classes get connected to some UMBEL reference concepts? Let’s take a look:

dbpedia-mo-connections

What we did here is connect the two classes to the UMBEL reference structure using the “equivalent to” property. These assertions state that the two classes are equivalent to the corresponding classes in UMBEL. This seems harmless, but once we start thinking about it, something special is happening.

The special thing is that we can now query the different datasets (Musicbrainz and DBpedia) on new ground. If I ask for the list of all humans, I will get all the soloists and all the journalists. If I ask the data store for all authors, I will get all the DBpedia journalists, and possibly authors from other datasets that are linked to the UMBEL reference structure.

This is an illustration of how UMBEL can be used to federate information silos.
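The federation idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `umbel:` concept names, the subclass edges between them, and the two sample entities are hypothetical stand-ins for the figures, not the actual UMBEL hierarchy or mapping files.

```python
# Hypothetical slice of the reference structure: concept -> direct parents.
SUBCLASS_OF = {
    "umbel:Journalist": {"umbel:Author"},
    "umbel:Author": {"umbel:Human"},
    "umbel:SoloMusicalArtist": {"umbel:Human"},
}

# External classes asserted equivalent to a reference concept (assumed links).
EQUIVALENT_TO = {
    "dbpedia:Journalist": "umbel:Journalist",
    "mo:SoloMusicArtist": "umbel:SoloMusicalArtist",
}

# Sample entities, each typed only with its own source ontology's class.
ENTITIES = {
    "dbpedia:Some_Journalist": "dbpedia:Journalist",
    "musicbrainz:some-soloist": "mo:SoloMusicArtist",
}


def ancestors(concept):
    """Return the concept plus every concept reachable through SUBCLASS_OF."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def instances_of(concept):
    """List entities whose external class maps to the concept or to one of its descendants."""
    found = []
    for entity, external_class in ENTITIES.items():
        reference_concept = EQUIVALENT_TO.get(external_class)
        if reference_concept is not None and concept in ancestors(reference_concept):
            found.append(entity)
    return sorted(found)
```

Under these assumptions, `instances_of("umbel:Human")` returns entities from both sources, while `instances_of("umbel:Author")` returns only the journalist, even though neither source ontology mentions the other.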

The good news is that the UMBEL reference structure is already linked to 20 ontologies used by different organizations to define their data sources:

  1. DBpedia Ontology – Links between the DBpedia Ontology classes and the UMBEL Reference Concepts. Half of them come from the linkage between Proton and UMBEL, and the other half come from hand mapping
  2. Geonames – Geonames
  3. Opencyc – OpenCyc Ontology
  4. Schema.org – Schema.org ontology defines entities known by Google and other search engines
  5. Wikipedia – Links between the Wikipedia pages and the UMBEL Reference Concepts
  6. DOAP – DOAP (Description of a Project) is a vocabulary for project description
  7. ORG – The ORG (Core Organization) Ontology is a vocabulary for describing organizational structures for a broad variety of types of organization
  8. OO – OO (Open Organizations) is a vocabulary providing supplementary terms for organizations that wish to publish open data about themselves
  9. TRANSIT – TRANSIT (Transit) is a vocabulary for describing transit systems and routes
  10. TIME – The TIME (Time Ontology) defines temporal entities
  11. BIBO – BIBO (Bibliographic Ontology)
  12. CC – CC (CreativeCommons Ontology)
  13. Event – Event Ontology
  14. FOAF – FOAF (Friend Of A Friend Ontology) used to describe people and organizations
  15. GEO – WGS84 Geographic Ontology
  16. MO – MO (Music Ontology)
  17. PO – PO (Programmes Ontology)
  18. RSS – RSS (Really Simple Syndication Ontology)
  19. SIOC – SIOC (Semantically-Interlinked Online Communities Ontology)
  20. FRBR – FRBR (Functional Requirements for Bibliographic Records)

According to the Linked Open Vocabularies (LOV) service, the UMBEL reference structure, together with its linkages to these 20 ontologies, lets you reach 504 datasets tracked by LOV.

Major UMBEL Release: 1.10

After more than 2 years, we are finally releasing a new version of the UMBEL ontology and reference concept structure. One might think that we haven’t worked on the project in all that time, but that is not strictly true.

We did improve the mapping to external vocabularies/ontologies, we worked much on linking Wikipedia pages to the UMBEL structure, but we haven’t had time to release a new version… until now!

For people new to the ontology, UMBEL is a general reference structure of about 28,000 reference concepts, which provides a scaffolding to link and interoperate other datasets and domain vocabularies. Its main purpose is to provide a coherent conceptual structure that we can use to link and interoperate unrelated data sources. But it can also be used, like any other ontology, as a conceptual structure to describe information.

What is new with the ontology?

The major change in UMBEL is not the structure itself, but the piece of software used to generate it. In fact, the previous system we developed for generating UMBEL was about 7 years old. It was a bit clunky and really not that easy to work with.

Based on our prior experience with UMBEL, we chose to dump it and to create a brand new UMBEL reference structure generator. This new generator has been developed in Clojure and uses the latest version of the OWL API. It makes the management of the structure much simpler, which should help us release new UMBEL versions more regularly. We also have a suite of tools to analyze the structure and to pinpoint possible issues.

Other than that, we updated the Schema.org, DBpedia Ontology and Geonames Ontology mappings to UMBEL. This is a major effort undertaken by Mike for this new version. The mappings are composed of:

  • 754 rdfs:subClassOf relationships between Schema.org classes and UMBEL reference concepts
  • 688 rdfs:subClassOf relationships between DBpedia Ontology classes and UMBEL reference concepts
  • 682 rdfs:subClassOf relationships between Geonames Ontology classes and UMBEL reference concepts

These new mappings will help manage data instances that use these external ontologies/schemas within a broader conceptual structure (which is UMBEL). This enables us to reason over this external data using the UMBEL conceptual structure even if these external data sources didn’t originally use UMBEL to describe their data. That is one of the main features of UMBEL.

We also managed to add a few hundred UMBEL reference concepts. Most of them were added to create these new linkages with the external ontologies. Others have been added because they were improving the overall structure.

A few weeks back, we found an issue with the umbel:superClassOf assignments, which has now also been resolved in version 1.10.

In the previous versions of UMBEL, the preferred labels were not unique: a few hundred concepts shared a preferred label with another concept. This was not an issue in itself, but it is not a best practice when creating an ontology. We removed all these non-distinct preferred labels so that every preferred label is now unique.
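A uniqueness check of this kind can be sketched as follows. This is a minimal illustration, not the tooling actually used for UMBEL: the function name is hypothetical, and treating labels as equal after trimming whitespace and lowercasing is an assumption made here for the example.

```python
from collections import defaultdict


def duplicate_pref_labels(pref_labels):
    """Map each normalized label shared by several concepts to those concepts.

    pref_labels maps a concept identifier to its preferred label.
    Normalization (strip + lowercase) is an assumption of this sketch.
    """
    by_label = defaultdict(list)
    for concept, label in pref_labels.items():
        by_label[label.strip().lower()].append(concept)
    # Keep only the labels used by more than one concept.
    return {label: sorted(concepts)
            for label, concepts in by_label.items()
            if len(concepts) > 1}
```

An empty result means every preferred label is distinct; otherwise each entry lists the concepts that need a more specific label.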

We added a few skos:broader and skos:narrower relationships between some of the reference concepts. In the previous versions, all the relationships were skos:broaderTransitive and skos:narrowerTransitive properties only.

Finally, we made sure that the entire UMBEL reference structure (Core + the Geo module) is free of inconsistencies and that it is satisfiable.

What is new with the portal and web services?

This new version of UMBEL also led us to add a few new features to the UMBEL website. The most apparent one is the new External Linkage section that may appear at the top of a reference concept page (it does not appear if there are no external links for a given reference concept). This section shows you the linkage between the UMBEL reference concept and external classes:

umbel_linkage_module

Another feature that you will notice on this screenshot is the Core blue tag at the right of the URI of the reference concept. This tag is used to tell you from where the reference concept is coming. Another tag that you may encounter is the green Geo tag, which tells you that the reference concept comes from the UMBEL Geo module. The same tags appear in the search resultsets:

search_modules

What is next?

Because UMBEL is an ontology, by nature it will always evolve over time. Things change, and the way we see the world can always improve.

For the next version of UMBEL, we will analyze the entire UMBEL reference concept structure using different algorithms, heuristics and other techniques to find conceptual gaps in it. The goal of this analysis is to tighten the structure, and to obtain a better and more fine-grained conceptual hierarchy.

Another thing we want to do in coming versions is to improve the Super Types structure of UMBEL. As you may know, many of the Super Types are not disjoint because some of the concepts belong to multiple Super Type classes. What we want to do here is to create new Super Type classes that are the intersection of two or more Super Types, and use them to categorize the concepts that belong to multiple Super Types. That way, we will end up with a better classification of the UMBEL reference concepts from a Super Types standpoint.
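The grouping step behind such intersection Super Types could look like the sketch below. It is only an illustration of the idea: the function name and the sample Super Type names used with it are hypothetical, and nothing here reflects how UMBEL will actually implement the change.

```python
from collections import defaultdict


def intersection_super_types(memberships):
    """Group concepts that belong to several Super Types by their exact combination.

    memberships maps a concept to the set of Super Types it belongs to.
    Each returned key is a frozenset of Super Types: a candidate intersection class.
    """
    groups = defaultdict(list)
    for concept, super_types in memberships.items():
        if len(super_types) > 1:
            groups[frozenset(super_types)].append(concept)
    return {key: sorted(concepts) for key, concepts in groups.items()}
```

Each key in the result names one candidate intersection class, and its value lists the concepts that would move into it; concepts with a single Super Type are left untouched.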

Another thing we want to do related to the UMBEL web services is to update them such that you can query the linkage to the external ontologies. For now, you can see the linkage when querying the sub-classes and super-classes of a reference concept. But you cannot query the web services this way: give me all the sub-classes-of the http://schema.org/FireStation class, for example.

As you can see, the UMBEL ontology and web services will continue to evolve over time to enable new ways to leverage the conceptual structure and external data sources.

UMBEL: New Shortest Path Web Service & Tag Web Documents

We just released a new UMBEL ontology graph analysis web service endpoint: the Shortest Path web service endpoint.

The Shortest Path Web service is used to get the shortest path between two UMBEL reference concepts by following the path of a transitive property. The concepts that belong to that path will be returned by the server.

This web service is similar to the degree web service endpoint, except that the actual path is returned. If you don’t need to know the actual concepts that participate in the shortest path between two concepts, you should use the degree web service endpoint instead.

The graph created by the UMBEL reference concepts ontology is mostly a directed acyclic graph (DAG). This means that a given pair of concepts is not necessarily linked via every property. In these cases, the shortest path web service returns an error message rather than the path concepts.
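To make the behavior concrete, here is a minimal sketch of a shortest path search over a directed graph, using breadth-first search. The algorithm the UMBEL endpoint actually uses is not stated here, so this is an assumption for illustration; the function name and the `edges` layout (one adjacency mapping per transitive property) are hypothetical.

```python
from collections import deque


def shortest_path(edges, start, goal):
    """Breadth-first search over directed edges.

    edges maps a concept to the concepts it links to through one property.
    Returns the ordered list of concepts from start to goal, or None when
    the two concepts are not linked through that property.
    """
    if start == goal:
        return [start]
    previous = {start: None}  # concept -> concept it was reached from
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in edges.get(current, ()):
            if neighbor in previous:
                continue
            previous[neighbor] = current
            if neighbor == goal:
                # Walk back to the start, then reverse into path order.
                path = [neighbor]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None
```

Because the edges are directed, a path may exist from one concept to another but not in the opposite direction; the `None` result corresponds to the case where the service reports an error.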

Intended Users

This new web service endpoint is intended for users that want to perform graph/network analysis tasks on the UMBEL web service endpoint.

The Web Service Endpoint

The web service endpoint is freely available. It can return its resultset in JSON or in EDN (Extensible Data Notation).

This endpoint returns a vector (so the order of the results is significant) of the concepts that participate in the shortest path. For each concept, its URI and preferred label are returned.

The Online Tool

We also provide an online shortest path tool that people can use to experience interacting with the web service.

The user first selects the two concepts between which to find the shortest path, then selects the transitive property to follow.

Once the user clicks the Get Shortest Path button, the tool displays the ordered list of concepts that compose the path.

If no path exists between the two concepts for the selected property, an error message is displayed to the user.


shortest-path-ui

Tagging Web Documents with the UMBEL Taggers

Another improvement included with this release is the enhancement of the UMBEL taggers. It is now possible to tag any document accessible on the Web. The only thing you have to do is provide a URL from which the tagger can download the document to tag.

The user interface for the taggers was also modified to expose this new functionality. You now have the choice of giving either text or a URL as input to the endpoints:


new-tagging-ui



