Starting to Play with the UMBEL Ontology

 

I am really proud to announce the first public release of the UMBEL Ontology and its subject structure after one year of hard work with Mike.

As UMBEL is introduced in the UMBEL Technical Documentation:

“UMBEL (Upper-level Mapping and Binding Exchange Layer) is a lightweight ontology for relating external ontologies and their classes to UMBEL subject concepts. UMBEL subject concepts are conceptually related together using the SKOS and the OWL-Full ontologies. They form a structural ‘backbone’ comprised of subject concepts and their semantic relationships. By linking external ontologies to this conceptual structure, we explode the domain of the linked classes by leveraging this conceptual structure.

UMBEL defines “subject concepts” as a distinct subset of the more broadly understood concept such as used in the SKOS/OWL-Full controlled vocabulary, conceptual graphs, formal concept analysis or the very general concepts common to many upper ontologies. We define subject concepts as a special kind of concept: namely, ones that are concrete, subject-related and non-abstract.

UMBEL contrasts subject concepts with abstract concepts and with named entities. Abstract concepts represent abstract or ephemeral notions such as truth, beauty, evil or justice, or are thought constructs useful to organizing or categorizing things but are not readily seen in the experiential world. Named entities are the real things or instances in the world that are themselves natural and notable class members of subject concepts. More detailed distinctions are provided under Terminology and Definitions below.”

Mike Bergman wrote a really good introduction blog post about UMBEL that lists all the supporting material and services that exists to get starting with UMBEL.

In this blog post I will write about one example that shows how to leverage UMBEL in two different ways: (1) how to use UMBEL to “explode the domain” of an existing ontology and (2) how to use UMBEL when an ontology doesn’t exist to describe a certain domain. I will also write other blog posts in the coming days to show more ways to leverage UMBEL in different settings and how to use it to solve other kind of real world problems.

Some of this new material will begin to hint at Zitgist’s own plans for using UMBEL.

Linking FOAF to UMBEL to explode its domain

How many times have people tried to use FOAF to describe organizational entities? In the end, everything ends up being assigned to foaf:Organization. A company, a NGO, or any other kind of organizations were all foaf:Organization(s) or foaf:Group(s). In most cases the result was unsatisfactory and everything ended up being the same “classification”.

But I don’t want to describe a business as an “Organization”, or a NGO as another “Organization”. They are two quite different concepts, even if the upper concept that links them is an “Organization”. However, there are no ontologies (that I know of) that describe businesses and NGOs; and FOAF is not expressive enough to do that distinction. However, is it FOAF’s goal to be that expressive? Possibly; but not in its current state. So what we want here is to extend it: to explode its domain!

And, it is what we will do with UMBEL.

The goal is to link FOAF classes to UMBEL subject concepts so that we can extend FOAF’s classes with more general and more specific concepts such as Business and NGO.

If you take a look at how the FOAF ontology has been linked to UMBEL, you will notice that a foaf:Organization is equivalent to an sc:Organization. Note: the linkage of external ontologies classes is consistent within UMBEL. It is UMBEL’s view of the World.

Let’s take an example to show what I mean. What I want is to describe the Zitgist LLC business; to describe it as a business, and not an organization. However I want to be able to re-use properties described in other ontologies to describe this business. So, here is an example of how I can describe this company using a UMBEL subject concept and external ontologies properties:

<http://zitgist.com/about/> a sc:Business ;

foaf:name “Zitgist LLC.” ;

foaf:birthday “2006-10-20” ;

foaf:logo <http://zitgist.com/imgs/zitgistlogo2_110_55.gif> ;

foaf:fundedBy <http://www.openlinksw.com> ;

bio:olb “””Zitgist provides quality Linked Data products and services. Linked Data is based on open standards to interconnect any form of relevant information on demand and in context. Zitgist’s capabilities range from the consumer Web plug-in zLinks to enterprise linked data transformation and deployment. Our expertise spans from data, standards and protocols to tools, user interface design, and scalable architectures. Zitgist innovation helps make the connections that matter. Let us show you how our approach to Linked Data can bring the power of the network effect to your data assets and global information.”””@en ;

foaf:based_near [ geo:Point [geo:lat “42.455”, geo:long “-71.218”] ] ;

foaf:homepage <http://zitgist.com> ;

foaf:made <http://umbel.org/about/> ;

foaf:made <http://browser.zitgist.com/about/> ;

foaf:made <http://pingthesemanticweb.com/about/> ;

foaf:made <http://musicontology.com/about/> ;

foaf:made <http://bibliontology.com/about/> ;

foaf:made <http://talkdigger.com/about/> .

As you can notice with this example, Zitgist is defined as a sc:Business. Well, you are probably wondering what is a sc:Business? Let’s take a look at the subject concept’s detailed report: sc:Business.

The next question is: why can I use all these properties to describe a sc:Business? The quick answer is because foaf:Organization is linked (equivalent to) sc:Organization and that sc:Business is a sub class of sc:Organization. You can read the proof here; and check the figure below that shows the inference path that leads us to this result.

(Note: this is what we refer to: exploding the domain of FOAF)

Analyzing a SC with the Detailed Report

The Detailed Report web service tool helps users to check which external class is linked to which subject concept and the nature of the linkage. Additionally it helps people to know what properties can be re-used to describe an individual of that class. Here is a quick overview of what information can be accessed when using this detailed report tool. Let’s take the sc:Business detailed report page:

Named Entities

The Named Entities section lists a couple of named entities that belong to this subject concept class. These are direct, or inferred, Yago named entities that belongs to this subject concept.

More General External Classes
The More General External Classes section lists the external super-classes linked to this subject concept. So we can quickly notice that a sc:Business is a foaf:Organization, a foaf:Group and a foaf:Agent.

In-domain-of and In-range-of
The in-domain-of and in-range-of sections list the properties, defined in some external ontologies, that can be used to describe that subject concept. So most of the properties that I used to describe the Zitgist business above should appear in this list (except if the ontology hasn’t yet been linked to UMBEL; but a dozen are already so as shown in Appendix A of the main technical document).

More General and Specific Subject Concepts

The More General Subject Concepts and the More Specific Subject Concepts sections list the super-concepts and the sub-concepts of the current subject concept (sc:Business in that case). So, we can use UMBEL to describe an even more specific kind of business, for example: an Airline Company. Or we can use UMBEL to describe a more general kind of business: a Commercial Organization.

Finally this Detailed Report Web Service helps people to put a given subject concept into context: how it is related to external ontologies classes; how we can use properties to describe individual of these concepts; how is it related to other subject concepts? It is the tool to answer these questions.

Conclusion

In this blog post we saw how external ontologies classes can be linked to UMBEL to explode their domain: so to enhance their expressiveness. Additionally we saw how to use UMBEL web services to analyze a subject concept and to see its relations with other subject concepts, external classes and properties.

However this is just the beginning of our exploration of UMBEL. Many things are waiting for us at the corner. I am starting to write a series of blog posts that will show you different uses and characteristics of UMBEL. All of them will be explained using real world use cases and challenges. We will see how named entities are related to UMBEL subject concepts. We will see how named entities data sources such as Yago and the John Peel Sessions have been linked to UMBEL. We will see how the UMBEL Vocabulary can help people to describe subject relationship between: a RDFS class that can be linked to a subject concept (using umbel:isAligned and owl:equivalentClass); a named entity to a subject concept (using umbel:isAbout and umbel:linksEntity); and a named entity to another named entity (using umbel:isLike and owl:sameAs).

As you can notice, this is just the beginning. In meantime you can read the technical documentation to have a better understanding of UMBEL. And additionally you can read all the volumes that have been written to explain UMBEL’s evolution and the steps that lead to the creation of this of this first public release of the ontology.

Finally, you can now start using UMBEL in your own applications. I would suggest you to revisit the UMBEL web services by reading my previous blog post: Exploding the Domain: UMBEL Web Services by Zitgist. Additionally I would suggest you to try to dereference subject concepts URIs such as: http://umbel.org/ns/sc/Project and http://umbel.org/ns/sc/Organization. All UMBEL Vocabulary’s classes and properties are dereferencable. All UMBEL named entities are also dereferencable along with all subject and abstract concepts.

Enjoy!

Exploding the Domain: UMBEL Web Services by Zitgist

I am pleased to announce the first phase of the public release of the UMBEL Web Services by Zitgist. This first release consists of a series of user interfaces in-front of several UMBEL web services.

This blog post shows and explains what these web services are about and how people will be able to use them to leverage UMBEL to create new ontologies, to instantiate new data sets and to interlink external ontologies to explode their domains.

Background

For the last four to six months we have been in the process of creating the UMBEL ontology. We have been doing research to find the best basis datasets; we have been cleaning these datasets for UMBEL’s purposes; and we have been developing the ontology and its principles. Starting today, we begin the release process for UMBEL:

  1. UMBEL web services’ user interfaces
  2. UMBEL ontology (OWL-Full)
  3. UMBEL ontology technical documentation
  4. UMBEL subject concepts’ structure (SKOS + OWL-Full) & named entities instantiation
  5. UMBEL web services endpoints.

UMBEL Ontology & Subject Concept Structure

Before starting to show and explain the UMBEL web services’ user interfaces’, I have to give some background information about the UMBEL ontology’s principles, and how the subject concept structure has been created. All this information will be discussed and explained at length in the UMBEL ontology technical documentation that is about to be published; but I have to give some technical background information in order to explain what these web services are about.

As described by Mike, UMBEL’s purposes are:

“[…] to provide a lightweight structure of subject concepts as a reference to what Web content or data “is about”, what is called a concept schema in SKOS […]

Think of the backbone as a set of roadsigns to help find related content. UMBEL is like a map of an interstate highway system, a way of getting from one big place to another. Once in the right vicinity, other maps (or ontologies), more akin to detailed street maps, are then necessary to get to specific locations or street addresses.

By definition, these more fine-grained maps are beyond UMBEL’s scope. But UMBEL can help provide the context for placing such detailed maps in relation to one another and in relation to the Big Picture of what related content is about.

These subject concepts also provide the mapping points for the many, many thousands (indeed, millions) of specific named entities that are the notable instances of these subject concepts. Examples might include the names of specific physicists, cities in a country, or a listing of financial stock exchanges. UMBEL mappings enable us to link a given named entity to the various subject classes of which it is a member.

And, because of relationships amongst subject concepts in the backbone, we can also relate that entity to other related entities and concepts. The UMBEL backbone traces the major pathways through the content graph of the Web. For some visualizations of this subject graph, see So, What Might The Web’s Subject Backbone Look Like?”

A four-article introduction to UMBEL can be read from Mike’s blog at:

UMBEL is a 21 000 subject concept structure that has been derived from the OpenCyc ontology. The structure is described in SKOS and OWL-Full. Each concept is an invididual of the skos:Concept class, which are themselves OWL classes. This dichotomy is the basis of UMBEL. Since the subject concepts are classes, this mean that we can relate these classes to external ontology classes using properties such as rdfs:subClassOf and owl:equivalentClass.

So what does all of this mean? It means that once the linkages between UMBEL subject concepts and external ontologies classes are made, the following becomes possible: 1) the UMBEL subject concept structure can be used to describe (instantiate) things using the UMBEL data structure; 2) external ontology properties can be re-used to describe these new instances since external ontologies classes are linked to UMBEL subject concept classes; and 3) in some cases, the properties defined in these ontologies can be used in relation with UMBEL subject concept classes. The forthcoming technical documentation about this stuff will provide more detailed explanation. For the moment, just accept these assertions as being true.

The UMBEL web services (user interfaces) have been created to help people to manage these relationships between UMBEL subject concepts classes and external ontology classes. People will use the services to infer facts from the structure of the subject concepts, to check if a class is a sub-class, a super-class or an equivalent class of another class. They will also use the services to see what properties, defined in external ontologies, can be re-used, and on which subject concept.

Let the show begin!

UMBEL Web Services Index Page

The entry page lists all the available web services. For each web service, you have a link to the web service user interface, a link to an about page explaining the basis of the web service, and a link to the technical documentation of the web service endpoint: how to communicate with the endpoint web server and how to interpret the answer sent by the web service.

Take note that the web service endpoints are not yet publicly available, and that this endpoint page is provided now for information purposes.

Eleven UMBEL Web Services

  1. Find Subject Concepts
  2. Subject Concept Report
  3. Subject Concept Detailed Report
  4. List Sub-Concepts & Sub-Classes
  5. List Super-Concepts & Super-Classes
  6. List Equivalent External Classes
  7. Verify Sub-Class Relationship
  8. Verify Super-Class Relationship
  9. Verify Equivalent Class Relationship
  10. Subject Concepts Explorer
  11. Yago Ontology — a little help from our friends.

Searching the UMBEL Subject Concept Structure

The first thing people will want to do is to search within the UMBEL subject concept structure. The “Find Subject Concepts” web service helps people to locate potential subject concept they are looking for.

If someone looks at the Find Subject Concepts page and performs a search for the keyword “project”, he will get this list of subject concepts:

umbel_find.png

Note: all subject concepts are ordered alphabetically and the search has been performed on the subject concept label and their semsets (and not in their definition).

The “finding” web service along with all the inferencing web services use the same result page layout: you have a list of subject concepts with their human readable definition (note: 8000 definitions out of 21 000 have yet to be created). If a user clicks on a result, he will be redirected to the Report and the Detailed Report user interfaces. Additionally, a user can click on the small “earth” icon to start browsing the surrounding subject concepts nodes in the Explorer visualization tool.

Inferencing the UMBEL Subject Concept Structure

A series of web services has been created to infer facts in the UMBEL subject concept structure. There are the two main categories of inferencing web services:

  1. The ones that list subject concepts that are more general, more specific or equivalent to a given subject concept
  2. The ones that answer the question: is this subject concept a sub-concept, a super-concept or an equivalent concept to this other subject concept?

These web services can be used not only to infer these facts on UMBEL subject concepts, but also on external ontology classes. There are a couple of examples of what can be done with these inferencing web services:

Note: some people may notice that the doap:Project external ontology class is a sub-class of the “Project” subject concept. This is not intuitive for humans, but this situation will be explained at length in the UMBEL Ontology Technical Documentation. To make a long story short: considering the nature of the current definition of the doap:Project class, we couldn’t say that it is equivalent to the “Project” UMBEL subject concept.

Visualizing the UMBEL Subject Concept Structure

While inferencing and lookup are good, we still have some issues when we try to “feel” what the UMBEL subject concept structure is. The following two user interfaces will do their best to help people visualizing the subject concepts description and their relations with other subject concepts and external ontologies classes.

Lets start with a wonderful visualization tool, created by Moritz Stefaner, and used by UMBEL to let people visualizing and browsing the data structure.

Lets start by browsing the relationship of the “Project” subject concept:

umbel_explorer.png

You can navigate from one node to another by clicking any of the circles. Each circle is an UMBEL subject concept or an external ontology class.

When a node is selected, its concept description is displayed in the right sidebar of the interface.

Note there are four different kinds of relationship between the concepts:

  • Blue (B). (concept A) — broader than –> (Concept B). concept A is more general than concept B
  • Red (N). (concept A) — narrower than –> (Concept B). concept A is more specific than concept B
  • Green (=). (concept A) — equivalent to –> (Concept B). concept A is equivalent to concept B
  • Mauve (I). (concept A) — is a –> (Concept B). concept A is an instance of the concept B

As each node is selected, the display refreshes and shows the new set of relationships for the current node (subject concept or external class). Note the dropdown list shown at the upper right of the display enables you to return to previous views or steps.

The Detailed Subject Concept Report

The detailed subject concept report is the tool to know everything about a specific subject concept. This is not really a web service, but a user interface that uses all existing UMBEL web services to display a detailed report of a subject concept, and all its relations with other UMBEL subject concepts and external ontology classes and properties.

There is the detailed report of the “Project” subject concept:

umbel_detailed_repost.png

There is the list of information available from that detailed report page:

  • UMBEL Subject Concept Name — the name of the subject concept
  • Semset — the preferred label and its alternative labels used to refer to this concept. The alternative labels are aliases, synonyms, collocations, etc.; related to the preferred label of the subject concept
  • Definition — the human readable definition of the subject concept
  • Equivalent External Classes — the classes from external ontologies that refer to this same subject concept. Note that the UMBEL Ontology Technical Documentation will explain how the equivalence relation between an external ontology class and an UMBEL subject concept is done
  • Named Entities — a list of named entities related to this UMBEL subject concept. Most of the time, the subject concept has the “type of” characteristic for these named entities. For example, for the subject concept “Person”, “Albert Einstein” is of type “Person”. The first named entities data set that has been used to create this list of named entities is Yago (more about this below).
  • More General External Classes — these are the classes from external ontologies that refer to a more general concept. Note that the UMBEL Ontology Technical Documentation will explain how the super-class relation between an external ontology class and an UMBEL subject concept is done
  • More Specific External Classes — these are the classes from external ontologies that refer to a more specific concept. Note that the UMBEL Ontology Technical Documentation will explain how the sub-class relation between an external ontology class and an UMBEL subject concept is done
  • In-domain-of — this is a list of properties defined in external ontologies where an individual of the UMBEL subject concept class can be used in the domain of the property. For example, for the subject concept “Person” the in-domain-of property: “foaf:interest (domain: foaf:Person)” means that an individual of the class umbel:Project can re-use the property foaf:interest that is defined in the FOAF ontology in its domain (<umbel:Person> <foaf:internet> <…>). Note that the UMBEL Ontology Technical Documentation will explain how the in-domain-of relation between an external ontology class and an UMBEL subject concept is done
  • In-range-of — this is a list of properties defined in external ontologies where an individual of the UMBEL subject concept class can be used in the range of the property. For example, for the subject concept “Person” the in-range-of property: “doap:developer (range: foaf:Person)” means that an individual of the class umbel:Project can re-use the property doap:developer that is defined in the DOAP ontology in its range (<…> <doap:developer> <umbel:Person>). Note that the UMBEL Ontology Technical Documentation will explain how the in-range-of relation between an external ontology class and an UMBEL subject concept is done
  • More General Subject Concepts — this is the list of more general internal UMBEL subject concepts related to the concept
  • More Specific Subject Concepts — this is the list of more specific internal UMBEL subject concepts related to the concept.

As you can notice, all the relations between any UMBEL subject concept to other subject concepts or external ontologies classes and properties is shown in this detailed report page.

This detailed report page was created not only to show people what UMBEL subject concepts are. I envision that people (more specifically ontologies developer & ontologies users) will also use it to check the current linkage between UMBEL and external ontologies and how to use UMBEL to instantiate and describe resources in RDF, etc. The UMBEL ontology documentation will describe some linkage and re-using use cases in further detail.

Linked External Ontologies and Named Entities

Lets take a deeper look at the named entities section of the detailed report of the “Person” subject concept:

umbel_named_entities.png

These named entities are individuals belonging to the class umbel:Person. If you click on one of these person names, you will notice that they are described the Yago data set. How is this possible?

To make another long story short: umbel:Person is an equivalent class to the cyc:Person class; cyc:Person is an equivalent class to the wordnet:Person class; yago:R._B._Bennett is an individual belonging to the same wordnet:Person class. So we can infer that yago:R._B._Bennett is an individual also belonging to the umbel:Person class. However, these technical details will be explained at length in the UMBEL ontology documentation.

But the truth is that this is not the most wonderful thing around. The most wonderful thing is when we understand what that really means (the linkage between yago:R._B._Bennett and umbel:Person (or any other data sets linked to UMBEL)). This means that this linkage is literally exploding the domain of each of these linked named entities. In fact, now we know this about yago:R._B._Bennett:

  • It is an umbel:Person
  • It is a cyc:Person
  • It is a foaf:Person & a foaf:Agent
  • It is a umbel:HomoSapiens
  • It is a umbel:SocialBeing
  • That we can re-use the foaf:birthday, foaf:name, doap:translator, dcterms:creator, etc.; external ontologies properties to describe this person.

We can infer all these things, and much more, about yago:R._B._Bennett only by linking it to UMBEL. We just contextualized it; and then we exploded its domain!

This is what UMBEL is about; this is the value it creates; and its contribution to the Semantic Web.

Conclusion

This is just the beginning of UMBEL. Currently ten external ontologies have been linked to UMBEL. The attentive eye will notice some strange results in the in-domain-of and in-range-of detailed report sections. More work has to be put in the linkage; however as you will notice in the technical documentation of UMBEL, some weird results come from the way some ontologies are defined. So, these ontologies self-definition create some of these weird results. So this mean that these UMBEL tools won’t only help by linking external ontologies, but they will also help to define new ontologies and to fix existing ones.

Stay tuned; more stuff will be released in the coming weeks and months.

The emergence of UMBEL and Linked Data

Since Mike and I first released UMBEL in 2007, we have not stopped working on it: we have done much research, we defined its concepts and principles, we designed and created it:

the ontology and the instantiation of its subject concepts, abstract concepts, semsets and named ontologies. We intensified our efforts in the last six months so that we nearly worked full time on this project.

We are now starting to release more documentation about the outcome of our work so far. Mike starts to release a really good series of blog posts describing the grounding of this effort. The first blog post that has been published is called A re-Introduction of UMBEL – Part 1 of 4 on foundations of UMBEL. This blog post explains the foundation concepts of UMBEL.

Later this week he will publish three other blog posts that explains what UMBEL adds to Linked Data, how named entities are integrated in this framework and finally how UMBEL relates to its older brother: Cyc and OpenCyc.

So stay tuned on Mike’s blog to read the series of four blog posts that put the basis to future releases and discussions about UMBEL and Linked Data.

Next development of UMBEL

In mean time, we continue our hard work to release the first draft of the UMBEL ontology and a first version of the instantiation of its subject concepts, its abstract concepts, its named entities and their related semsets. Also we will release a first mapping between UMBEL’s subject concepts and related external ontologies classes along with the proper grounding documentation that explains all the things evolved with these instantiations, these linkages and the UMBEL ontology itself.

UMBEL: Upper-level Mapping and Binding Exchange Layer

umbel_medium.png

Mike Bergman released the first draft of its UMBEL ontology. Me and some other people helped him to come up with that new ontology.What is UMBEL? UMBEL is a lightweight subject reference structure. People can see it as a pool of subjects. Subjects are related together at a synonymy level; so, subjects of related meaning will be binded together.

The objectives

The objectives of this new ontology are:

  • A reference umbrella subject binding ontology, with its own pool of high-level binding subjects
  • Lightweight mechanisms for binding subject-specific ontologies to this structure
  • A standard listing of subjects that can be refererenced by resources described by other ontologies (e.g., dc:subject)
  • Provision of a registration and look-up service for finding appropriate subject ontologies
  • Identification of existing data sets for high-level subject extraction
  • Codification of high-level subject structure extraction techniques
  • Identification and collation of tools to work with this subject structure, and
  • A public Web site for related information, collaboration and project coordination.

Main applications

Given these objectives, I see a couple of main applications where such ontology could be used:

  • Helping systems to find data sources for a given ontology. UMBEL is much more than a subject structure. In fact, UMBEL will bind subjects, with related ontologies and data sources for these related ontologies. So, for a given subject, people will be able to find related ontologies, and then related data sources.
  • Acting as a subject reference backbone. So, it could be use by people to links resources, using dc:subject, to its subject resource (the UMBEL subject proxy resource), etc.
  • Could be used by user interface to help them with handling subjects (keywords) references to find related ontologies (that have the power to describe these subjects).
  • Eventually it should be used by PingtheSemanticWeb to bind pinged data to the subject reference structure.
  • And probably many others.

Creation of the Ontology

A procedure will be created to automatically generate the ontology. The gross idea is to reuse existing knowledge bases to create the set of subjects, and their relationship, that will create the ontology. So, the idea is to come up with a representative, not too general, not too specialized, set of subjects. For that, we will play with knowledge bases such as WordNet, Wikipedia, Dmoz, etc. We will try to find out how we could prune unnecessary subjects out of them, how we could create such a subject reference framework by taking a look at the intersection of each data set, etc. The procedure is not yet developed, but the first experiments will look like that.

As explained in the draft:

The acceptance of the actual subjects and their structure is one key to the acceptance — and thus use and usefulness — of the UMBEL ontology. (The other key is simplicity and ease-of-use or tools.) A suitable subject structure must be adaptable and self-defining. It should reflect expressions of actual social usage and practice, which of course changes over time as knowledge increases and technologies evolve.

A premise of the UMBEL project is that suitable subject content and structures already exist within widely embraced knowledge bases. A further premise is that the ongoing use of these popular knowledge bases will enable them to grow and evolve as societal needs
and practices grow and evolve.

The major starting point for the core subject pool is WordNet. It is universally accepted, has complete noun and class coverage, has an excellent set of synonyms, and has frequency statistics. It also has data regarding hierarchies and relationships useful to the UMBEL look-up reference structure, the ‘unofficial’ complement to the core ontology.

A second obvious foundation to building a subject structure is Wikipedia. Wikipedia’s topic coverage has been built entirely from the bottom up by 75,000 active contributors writing articles on nearly 1.8 million subjects in English alone, with versions in other
degrees of completeness for about 100 different languages. There is also a wealth of internal structure within Wikipedia’s templates.

These efforts suggest a starting formula for the UMBEL project of W + W + S + ? (for WordNet + Wikipedia + SKOS + other?). Other potential data sets with rich subject coverage include existing library classification systems, upper-level ontologies such as SUMO, Proton or DOLCE, the contributor-built Open Directory Project, subject ‘primitives’ in other languages such as Chinese, or the other sources listed in Appendix 2 – Candidate Subject Data Sets.

Though the choice of the contributing data sets from which the UMBEL subject structure is to be built will never be unanimous, using sources that have already been largely selected by large portions of the Web-using public will go a long ways to establishing authoritativeness. Moreover, since the subject structure is only intended as a lightweight reference — and not a complete closed-world definition — the UMBEL project is also setting realistic thresholds for acceptance.

Conclusion

If you are interested in such an ontology project, please join us on the mailing list of the ontology’s development group, ask questions, writes comments and suggestions.

Next step is to start creating a first version of the subject proxies.