
When Linked Data Rules Fail


Image Source: www.adhd-mindbydesign.com

High Visibility Problems with NYT, data.gov Show Need for Better Practices

When I say, “shot”, what do you think of? A flu shot? A shot of whisky? A moon shot? A gun shot? What if I add the term “bank”? Do you now think of someone being shot in an armed robbery of a local bank or similar?

And, now, what if I add a reference to say, The Hustler, or Minnesota Fats, or “Fast Eddie” Felson? Do you now see the connection to a pressure-packed banked pool shot in some smoky bar room?

As humans we need context to make connections and remove ambiguity. For machines, with their limited reasoning and inference engines, context and accurate connections are even more important.

Over the past few weeks we have seen announcements of two large and high-visibility linked data projects: one, a first release of references for articles concerning about 5,000 people from the New York Times at data.nytimes.com; and two, a massive exposure of 5 billion triples from data.gov datasets provided by the Tetherless World Constellation (TWC) at Rensselaer Polytechnic Institute (RPI).

On various grounds from licensing to data characterization and to creating linked data for its own sake, some prominent commentators have weighed in on what is good and what is not so good with these datasets. One of us, Mike, commented about a week ago that “we have now moved beyond ‘proof of concept’ to the need for actual useful data of trustworthy provenance and proper mapping and characterization. Recent efforts are a disappointment that no enterprise would or could rely upon.”

Reactions to that posting and continued discussion on various mailing lists warrant a more precise dissection of what is wrong and still needs to be done with these datasets [1].

Berners-Lee’s Four Linked Data “Rules”

It is useful, then, to return to first principles, namely the original four “rules” posed by Tim Berners-Lee in his design note on linked data [2]:

  1. Use URIs as names for things
  2. Use HTTP URIs so that people can look up those names
  3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
  4. Include links to other URIs so that they can discover more things.

The first two rules are definitional to the idea of linked data. They cement the basis of linked data in the Web, and are not at issue with either of the two linked data projects that are the subject of this posting.

However, it is the lack of specifics and guidance in the last two rules where the breakdowns occur. Both the NYT and the RPI datasets suffer from a lack of “providing useful information” (Rule #3). And, the nature of the links in Rule #4 is a real problem for the NYT dataset.

What Constitutes “Useful Information”?

The Wikipedia entry on linked data expands on “useful information” by augmenting the original rule with the parenthetical clause, ” (i.e., a structured description — metadata).” But even that expansion is insufficient.

Fundamentally, what are we talking about with linked data? Well, we are talking about instances that are characterized by one or more attributes. Those instances exist within contexts of various natures. And, those contexts may relate to other existing contexts.

We can break this problem description down into three parts:

  • A vocabulary that defines the nature of the instances and their descriptive attributes
  • A schema of some nature that describes the structural relationships amongst instances and their characteristics, and, optimally,
  • A mapping to existing external schema or constructs that help place the data into context.

At minimum, ANY dataset exposed as linked data needs to be described by a vocabulary. Both the NYT and RPI datasets fail on this score, as we elaborate below. Better practice is to also provide a schema of relationships in which to embed each instance record. And, best practice is to also map those structures to external schema.

Lacking this “useful information”, especially a defining vocabulary, we cannot begin to understand whether our instances deal with drinks, bank robberies or pool shots. This lack, in essence, makes the information worthless, even though available via URL.

The data.gov (RPI) Case

With the support of NSF and various grant funding, RPI has set up the Data-Gov Wiki [3], which is in the process of converting the datasets on data.gov to RDF, placing them into a semantic wiki to enable comment and annotation, and providing that data as RSS feeds. Other demos are also being placed on the site.

As of the date of this posting, the site had a catalog of 116 datasets from the 800 or so available on data.gov, leading to these statistics:

  • 459,412,419 table entries
  • 5,074,932,510 triples, and
  • 7,564 properties (or attributes).

We’ll take one of these datasets, #319, and look a bit closer at it:

  • Wiki: Dataset 319
  • Title: Consumer Expenditure Survey
  • Agency: Department of Labor
  • Name: LABOR-STAT
  • data.gov link: http://www.data.gov/details/319
  • No. of properties: 22
  • No. of triples: 1,583,236
  • RDF file: http://data-gov.tw.rpi.edu/raw/319/index.rdf

This report was picked solely because it had a small number of attributes (properties), and is thus easier to screen capture. The summary report on the wiki is shown by this page:


Data-gov-Wiki Dataset #319


So, we see that this specific dataset contains 22 of the 7,564 attributes across all datasets.

When we click on one of these attribute names, we are then taken to a specific wiki page that only reiterates its label. There is no definition or explanation.

When we inspect this page further we see that, other than the broad characterization of the dataset itself (the bulk of the page), there are at the bottom 22 undefined attributes with labels such as item code, periodicity code, seasonal, and the like. These attributes are the real structural basis for the data in this dataset.

But, what does all of this mean???

To gain a clue, now let’s go to the source data.gov site for this dataset (#319). Here is how that report looks:


Data.gov Dataset #319


Contained within this report we see a listing for additional metadata. This link tells us about the various data fields contained in this dataset; we see many of these attributes are “codes” to various data categories.

Probing further into the dataset’s technical documentation, we see that there is indeed a rich structure underneath this report, again provided via various code lookups. There are codes for geography, seasonality (adjusted or not), consumer demographic profiles and a variety of consumption categories. (See, for example, the link to this glossary page.) These are the keys to understanding the actual values within this dataset.

For example, one major dimension of the data is captured by the attribute item_code. The survey breaks down consumption expenditures within the broad categories of Food, Housing, Apparel and Services, Transportation, Health Care, Entertainment, and Other. Within a category, there is also a rich structural breakdown. For example, expenditures for Bakery Products within Food are given a code of FHC2.

But, nowhere are these codes defined or unlocked in the RDF datasets. This absence is true for virtually all of the datasets exposed on this wiki.

So, for literally billions of triples, and 8,000 attributes, we have ABSOLUTELY NO INFORMATION ABOUT WHAT THE DATA CONTAINS OTHER THAN A PROPERTY LABEL. There is much, much rich value here in data.gov, but all of it remains locked up and hidden.

The sad truth about this data release is that it provides absolutely no value in its current form. We lack the keys to unlock the value.
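To make the point concrete, here is a minimal sketch in Python of what a published vocabulary would buy us. The only fact taken from the survey documentation is that FHC2 denotes Bakery Products within Food; the dictionary layout, the record shape and the describe function are hypothetical and exist only for illustration.

```python
# Hypothetical vocabulary with a single entry. The FHC2 definition comes
# from the survey documentation discussed above; the structure itself is
# an illustrative assumption, not anything published by RPI or data.gov.
ITEM_CODE_VOCABULARY = {
    "FHC2": {"label": "Bakery Products", "category": "Food"},
}


def describe(record, vocabulary):
    """Return a readable description of a record's item_code,
    or None when the vocabulary does not define that code."""
    definition = vocabulary.get(record["item_code"])
    if definition is None:
        return None
    return f'{definition["label"]} ({definition["category"]})'


record = {"item_code": "FHC2"}  # hypothetical instance record

# Without a vocabulary the code is just an opaque label.
print(describe(record, {}))
# With a vocabulary the same value can be interpreted.
print(describe(record, ITEM_CODE_VOCABULARY))
```

The instance data is identical in both calls; only the presence of the vocabulary changes whether the value can be understood.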

To be sure, early essential spade work has been done here to begin putting in place the conversion infrastructure for moving text files, spreadsheets and the like to an RDF form. This is yeoman work important to ultimate access. But, until a vocabulary is published that defines the attributes and their codes so we can unlock this value, it will remain hidden. And until a schema of some nature that connects attributes and relations across datasets is also published, the real value from connecting the dots will also remain hidden.

These datasets may meet the partial conditions of providing clickable URLs, but the crucial “useful information” as to what any of this data means is absent.

Every single dataset on data.gov has supporting references to text files, PDFs, Web pages or the like that describe the nature of the data within each dataset. Until that information is exposed and made usable, we have no linked data.

Until ontologies get created from these technical documents, the value of these data instances remains locked up, and no value can be created from having these datasets expressed in RDF.

The devil lies in the details. The essential hard work has not yet begun.

The NYT Case

Though at a much smaller scale with many fewer attributes, the NYT dataset suffers from the same failing: it too lacks a vocabulary.

So, let’s take the case of one of the lead actors in The Hustler, Paul Newman, who played the role of “Fast Eddie” Felson. Here is the NYT record for the “person” Paul Newman (which they also refer to as http://data.nytimes.com/newman_paul_per). Note the header title of Newman, Paul:


NYT 'Paul Newman Articles' Record


Click on any of the internal labels used by the NYT for its own attributes (such as nyt:first_use), and you will be given this message:

“An RDFS description and English language documentation for the NYT namespace will be provided soon. Thanks for your patience.”

We again have no idea what is meant by all of this data except for the labels used for its attributes. In this case for nyt:first_use we have a value of “2001-03-18”.

Hello? What? What is a “first use” for a “Paul Newman” of “2001-03-18”???

The NYT put the cart before the horse: they should have released their ontology, even if minimal, before (or at least at the same time as) their data instances. (See further this discussion about how an ontology creation workflow can be incremental by starting simple and then upgrading as needed.)

Links to Other Things

Since there really are no links to other things on the Data-Gov Wiki, our focus in this section continues with the NYT dataset using our same example.

We now are in the territory of the fourth “rule” of linked data: 4. Include links to other URIs so that they can discover more things.

This will seem a bit basic at first, but before we can talk about linking to other things, we first need to understand and define the starting “thing” to which we are linking.

What is a “Newman, Paul” Thing?

Of course, without its own vocabulary, we are left to deduce what this thing “Newman, Paul” is that is shown in the previous screen shot. Our first clue comes from the statement that it is of rdf:type SKOS concept. By looking to the SKOS vocabulary, we see that concept is a class and is defined as:

A SKOS concept can be viewed as an idea or notion; a unit of thought. However, what constitutes a unit of thought is subjective, and this definition is meant to be suggestive, rather than restrictive. The notion of a SKOS concept is useful when describing the conceptual or intellectual structure of a knowledge organization system, and when referring to specific ideas or meanings established within a KOS.

We also see that this instance is given a foaf:primaryTopic of Paul Newman.

So, we can deduce so far that this instance is about the concept or idea of Paul Newman. Now, looking to the attributes of this instance — that is the defining properties provided by the NYT — we see the properties of nyt:associated_article_count, nyt:first_use, nyt:last_use and nyt:topicPage. Completing our deductions, and in the absence of its own vocabulary, we can now define this concept instance somewhat as follows:

New York Times articles in the period 2001 to 2009 having as their primary topic the actor Paul Newman

(BTW, across all records in this dataset, we could see what the earliest first use was to better deduce the time period over which these articles have been assembled, but that has not been done.)

We also would re-title this instance more akin to “2001-2009 NYT Articles with a Primary Topic of Paul Newman” or some such and use URIs more akin to this usage.

sameAs Woes

Thus, in order to make links or connections with other data, it is essential to understand what the nature is of the subject “thing” at hand. There is much confusion about actual “things” and the references to “things” and what is the nature of a “thing” within the literature and on mailing lists.

Our belief and usage in matters of the semantic Web is that all “things” we deal with are a reference to whatever the “true”, actual thing is. The question then becomes:  What is the nature (or scope) of this referent?

There are actually quite easy ways to determine this nature. First, look to one or more instance examples of the “thing” being referred to. In our case above, we have the “Newman, Paul” instance record. Then, look to the properties (or attributes) the publisher of that record has used to describe that thing. Again, in the case above, we have nyt:associated_article_count, nyt:first_use, nyt:last_use and nyt:topicPage.

Clearly, the nature of this instance record is that it deals with articles or groups of articles. Paul Newman enters only as the primary topic of these articles, not as a person whom the instance describes. If the nature of the instance were indeed the person Paul Newman, then the attributes of the record would more properly be related to “person” properties such as age, sex, birthdate, death date, marital status, etc.

This confusion by NYT as to the nature of the “things” they are describing then leads to some very serious errors. By confusing the topic (Paul Newman) of a record with the nature of that record (articles about topics), NYT next misuses one of the most powerful semantic Web predicates available, owl:sameAs.

By asserting in the “Newman, Paul” record that the instance has a sameAs relationship with external records in Freebase and DBpedia, the NYT both entails that properties from any of the associated records are shared and infers a chain of other types to describe the record. More precisely, the NYT is asserting that the “things” referred to by these instances are identical resources.

Thus, by the sameAs statements in the “Newman, Paul” record, the NYT is also asserting that that record is an instance of all these classes:

Furthermore, because of its strong, reciprocal entailments, the owl:sameAs assertion would also now entail that the person Paul Newman has the nyt:first_use and nyt:last_use attributes, clearly illogical for a “person” thing.

This connection is clearly wrong in both directions. Articles are not persons and don’t have marital status; and persons do not have first_uses. By misapplying this sameAs linkage relationship, we have screwed things up in every which way. And the error began with misunderstanding what kinds of “things” our data is about.
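The two-way entailment described above can be sketched in a few lines of Python. This is a deliberately naive illustration and not a reasoner: it only treats owl:sameAs as symmetric and copies statements from one subject to the other. The nyt:first_use value is the one shown in the NYT record; the dbpedia:Paul_Newman identifier and its foaf:Person type are assumptions made for the example.

```python
def same_as_closure(triples):
    """Naively apply owl:sameAs: the relation is symmetric, and every
    statement made about one resource is copied to the other."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        pairs = {(s, o) for s, p, o in inferred if p == "owl:sameAs"}
        pairs |= {(o, s) for s, o in pairs}  # symmetry
        new = set()
        for a, b in pairs:
            new.add((a, "owl:sameAs", b))
            for s, p, o in inferred:
                if s == a and p != "owl:sameAs":
                    new.add((b, p, o))  # b inherits a's statements
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


asserted = {
    ("nyt:newman_paul_per", "nyt:first_use", "2001-03-18"),
    ("nyt:newman_paul_per", "owl:sameAs", "dbpedia:Paul_Newman"),
    # Assumed external description of the person:
    ("dbpedia:Paul_Newman", "rdf:type", "foaf:Person"),
}

for triple in sorted(same_as_closure(asserted) - asserted):
    print(triple)
```

Among the inferred statements, the person acquires a nyt:first_use and the collection of articles becomes a foaf:Person, which are exactly the two illogical results noted above.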

Some Options

However, there are solutions. First, the sameAs assertions, at least involving these external resources, should be dropped.

Second, if linkages are still desired, a vocabulary such as UMBEL [4] could be used to make an assertion between such a concept, and these other related resources. So, even though these resources are not the same, they are closely related. The UMBEL ontology helps us to define this kind of relation between related, but non-identical, resources.

Instead of using the owl:sameAs property, we would suggest using umbel:linksEntity, which links a skos:Concept to related named-entity resources. Additionally, Freebase, which also currently asserts a sameAs relationship to the NYT resource, could use the umbel:isAbout relationship to assert that their resource “is about” a certain concept, which is the one defined by the NYT.
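As a rough sketch of that suggestion, the Python below rewrites owl:sameAs statements into umbel:linksEntity whenever the subject is typed as a skos:Concept. The triple representation, the relink_concepts helper and the dbpedia:Paul_Newman identifier are assumptions for illustration; only the property names come from the discussion above.

```python
def relink_concepts(triples):
    """Replace owl:sameAs with umbel:linksEntity when the subject is a
    skos:Concept; every other statement is returned unchanged."""
    concepts = {s for s, p, o in triples
                if p == "rdf:type" and o == "skos:Concept"}
    return {
        (s, "umbel:linksEntity", o) if p == "owl:sameAs" and s in concepts
        else (s, p, o)
        for s, p, o in triples
    }


published = {
    ("nyt:newman_paul_per", "rdf:type", "skos:Concept"),
    ("nyt:newman_paul_per", "nyt:first_use", "2001-03-18"),
    # Assumed target identifier for the external record:
    ("nyt:newman_paul_per", "owl:sameAs", "dbpedia:Paul_Newman"),
}

for triple in sorted(relink_concepts(published)):
    print(triple)
```

The link to the external record survives, but it no longer claims identity, so no person properties flow onto the concept and no article properties flow onto the person.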

Alternatively, still other external vocabularies that more precisely capture the intent of the NYT publishers could be found, or the NYT editors could define their own properties specifically addressing their unique linkage interests.

Other Minor Issues

As a couple of additional, minor suggestions for the NYT dataset, we would suggest:

  • Create a foaf:Organization description of the NYT organization, then use it with dc:creator and dcterms:rightsHolder rather than using a literal, and
  • The dual URIs such as “http://data.nytimes.com/N31738445835662083893” and “http://data.nytimes.com/newman_paul_per” are not wrong in themselves, but the purpose is hard to understand. Why does a single organization need to create multiple resources for the identical resource, when it comes from the same system and has the same purpose?

Re-visiting the Linkage “Rule”

There are very valuable benefits from entailment, inference and logic to be gained from linking resources. However, if the nature of the “things” being linked — or the properties that define these linkages — are incorrect, then very wrong logical implications result. Great care and understanding should be applied to linkage assertions.

In the End, the Challenge is Not Linked Data, but Connected Data

Our critical comments are not meant to be disrespectful or picky. The NYT and TWC are prominent institutions from which we should expect leadership on these issues. Our criticisms (and we believe those of others) are also not an expression of a “trough of disillusionment” as some have been pointing out.

This posting is about poor practices, pure and simple. The time to correct them is now. If asked, we would be pleased to help either institution establish exemplar practices. This is not automatic, and it is not always easy. The data.gov datasets, in particular, will require much time and effort to get right. There is much documentation that needs to be transitioned and expressed in semantic Web formats.

In a broader sense, we also seem to lack a definition of best practices related to vocabularies, schema and mappings. The Berners-Lee rules are imprecise and insufficient as is. Prior guidance documents tend to be more about how to publish and make URIs linkable than about how to properly characterize, describe and connect the data.

Perhaps, in part, this is a bit of a semantics issue. The challenge is not the mechanics of linking data, but the meaning and basis for connecting that data. Connections require logic and rationality sufficient to reliably inform inference and rule-based engines. They also need to pass the sniff test as we “follow our nose” by clicking the links exposed by the data.

It is exciting to see high-quality content such as from national governments and major publishers like the New York Times begin to be exposed as linked data. When this content finally gets embedded into usable contexts, we should see manifest uses and benefits emerge. We hope both institutions take our criticisms in that spirit.

This posting has been jointly authored by Mike Bergman and Fred Giasson and simultaneously published on both of their blogs, hoping to draw more attention to the need for better practices in publishing linked data.

[1] The NYT dataset has been updated with improvements that fixed multiple issues from the first release. The problems listed herein, however, still pertain after these improvements.
[2] Tim Berners-Lee, 2006. Linked Data (Design Issues), first posted on 2006-07-27; last updated on 2009-06-18. See http://www.w3.org/DesignIssues/LinkedData.html. Berners-Lee refers to the steps above as “rules,” but he elaborates they are expectations of behavior. Most later citations refer to these as “principles.”
[3] Li Ding, Dominic DiFranzo, Sarah Magidson, Deborah L. McGuinness and Jim Hendler, 2009. Data-Gov Wiki: Towards Linked Government Data. See http://www.cs.vu.nl/~pmika/swc/documents/Data-gov%20Wiki-data-gov-wiki-v1.pdf.
[4] UMBEL (Upper Mapping and Binding Exchange Layer) is a lightweight ontology structure in development for relating Web content and data to a standard set of subject concepts. Its purpose has resulted in the creation of an associated vocabulary geared to both class-instance and reciprocal relationships, as well as partial or likelihood relationships. See http://umbel.org/technical_documentation.html#vocabulary.

A New Home for UMBEL Web Services


Eight months ago we announced the dissolution of Zitgist LLC. This event led to the creation of a sandbox to keep alive all the online assets of the company. Since this sandbox server was not owned by Structured Dynamics, it was becoming hard for us to update UMBEL and its online services. This is why we took the time to move the services back on to our new servers.

A New Home

Structured Dynamics LLC now hosts a new version of the UMBEL Web services. From the main menu at the SD Web site you can access these services under the “umbel ws” menu option (you can also bookmark the Web services site at umbel.structureddynamics.com or ws.umbel.org).

This move of UMBEL’s Web services to a new home will make the future upgrade of UMBEL easier, and this will make the maintenance of the Web services endpoints easier as well. With this move, I am pleased to announce the release of five initial Web services and one visualization tool:

Lookup Web Services:

Inference Engine Web Services:

SPARQL endpoint Web Service:

Visual Tool:

Note that the visual tool is using Moritz Stefaner’s Relation Browser.


Ping the Semantic Web

Additionally, the Ping the Semantic Web RDF pinging service is now the property of OpenLink Software Inc. OpenLink is now hosting, maintaining and developing the service.

New release of UMBEL: v072


I am pleased to announce that we have resumed our work on UMBEL. We just released version v0.72, which is based on the OpenCyc version 2009-01-31. This new version is intermediary and has been created mostly to check the evolution of OpenCyc vis-à-vis UMBEL. Within the next month or so, we will release a new version (v.080), which will introduce a major new concept that should help systems and users manipulate the entire UMBEL Subject Concepts structure.

For those who want to know what changed between versions v071 and v072, here is a CSV file that lists all the changes between the versions. There are four columns: (1) source node, (2) attribute, (3) target node and (4) version number. This file lists all triples that are present in one version but not in the other. So, you have all changes (nodes & arcs) between the two versions. Almost all of the changes come from internal changes to OpenCyc. We did fix a couple of things, such as removing cycles in the graph. But 99% of the changes come from changes within OpenCyc.
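For readers who want to reproduce this kind of change list, here is a minimal Python sketch of the idea: a symmetric difference between two sets of triples, written out as four-column CSV rows. The triples are placeholders rather than actual UMBEL subject concepts, the diff_versions helper is hypothetical, and I am assuming the fourth column names the version in which the triple is present.

```python
import csv
import io


def diff_versions(old, new, old_label, new_label):
    """List triples present in exactly one version, each tagged with the
    version that contains it (an assumed reading of column 4)."""
    rows = [(s, p, o, old_label) for s, p, o in sorted(old - new)]
    rows += [(s, p, o, new_label) for s, p, o in sorted(new - old)]
    return rows


# Placeholder triples, not real UMBEL or OpenCyc content.
v071 = {("ex:A", "rdfs:subClassOf", "ex:B"),
        ("ex:B", "rdfs:subClassOf", "ex:C")}
v072 = {("ex:A", "rdfs:subClassOf", "ex:B"),
        ("ex:B", "rdfs:subClassOf", "ex:D")}

changes = diff_versions(v071, v072, "v071", "v072")

# Serialize as: source node, attribute, target node, version number.
buffer = io.StringIO()
csv.writer(buffer).writerows(changes)
print(buffer.getvalue())
```

Triples shared by both versions drop out, so what remains is the set of arcs that were removed or added between the two releases.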

Finally, note that the Web services endpoints will be updated with this new version of UMBEL subject concepts in the coming week, along with the dereferencing of their URIs. Stay tuned!

The Next Bibliographic Ontology: OWL


The Bibliographic Ontology’s aim is to be expressive and flexible enough to be able to convert any existing bibliographic legacy schema (such as Bibtex and its extensions, MARC, Elsevier’s SDOS & CITADEL citation schemas, etc.) and RDFS/OWL ontologies to it.

This new BIBO version 1.2 is the result of more than one year of thinking and discussions between 101 community members and 1254 mail messages. The project’s first aim of expressiveness and flexibility is nearly reached. BIBO’s ongoing development is now pointing to a series of methods and best practices for mature ontology development.

Some BIBO mappings between legacy schemas have been developed, but this trend will now be accelerated. More people are getting interested in BIBO’s ability to describe bibliographic resources. Some people are interested in it to describe bibliographic citations; others are interested in it to integrate data from different bibliographic data sources, using different schemas, into a single and normalized data source. This single data source (in RDF) can then be easily queried, managed and published. Finally, other people are interested in it as a standard agreed to by an open community, one that helps them describe bibliographic data meant to be published and consumed by different kinds of data consumers (such as standalone software like Zotero, or citation aggregation Web services like Scirus or Connotea).

With this BIBO 1.2 release, much has changed and been improved. Now, it is time for the community to start implementing BIBO in different systems; to create more mappings; and to complete more converters.

Design Redux

As you may recall from its early definition, BIBO has been designed for both: (1) a core system with extensions relevant to specific domains and uses, and (2) a collaborative development environment governed by the community process.

These design imperatives have guided much of what we have done in this new version 1.2 release to aid these objectives.

BIBO in OWL 2

The new version of BIBO is now described using OWL 2. In the next sections you will see why we chose to use OWL 2 as the way to describe BIBO in the future. However, saying that it is OWL 2 doesn’t mean that it becomes incompatible with everything else that exists. In fact, it validates as OWL 1.1 and its DL expressivity is SHOIN(D); this means that fundamentally nothing has changed, but that we are now leveraging a couple of new tools and concepts that are introduced by OWL 2.

As you will see below this decision results in much more than a single update of the ontology. We are introducing an updated, and more efficient, architecture to develop open source ontologies such as The Bibliographic Ontology.

New Versioning System

OWL 2 is introducing a new versioning and importation system for OWL ontologies. This feature alone strongly argued for the adoption of OWL 2 as the way to develop BIBO in the future.

This new versioning system consists of two things: an ontologyURI and a versionURI. The heuristics to define, check, and cache an ontology that has an ontologyURI and possibly a versionURI are described here.

BIBO has an ontologyURI and multiple versionURIs such as http://purl.org/ontology/bibo/1.0/, http://purl.org/ontology/bibo/1.1/, and http://purl.org/ontology/bibo/1.2/.

Right now, the current version of the ontology is 1.2. This means that the current version of BIBO will be located at two places: http://purl.org/ontology/bibo/ and http://purl.org/ontology/bibo/1.2/.

The location logic of ontologies is described here. What we have to take care of here is that if a user agent dereferences any class or property of BIBO, it will always get the description of that class or property from the latest version of the ontology. This is why the caching logic is quite important. The user agent has to make sure that it caches the version of the ontology that it knows.

What is really important to understand is that the URI of the ontology won’t change over time when we introduce new versions of the same ontology. Only the location of these versions will change.

Finally, the OWL 2 mapping to RDF document tells us that we have to use the owl:versionInfo OWL property to define the versionURI of an ontology. This is the reason why the use of this OWL 2 versioning system doesn’t affect the validity of BIBO as an OWL 1.1 ontology: owl:versionInfo is also an OWL 1.1 property.

Now, let’s take a look at the tools that we will use to continue the development of BIBO.

Protégé 4 for Developing BIBO

We have chosen to rely on Protégé 4 to develop BIBO in the future. We wanted to start using a tool that would help the community to develop the ontology. Protégé 4 Beta was released in August, it supports OWL 2 by using the OWLAPI library, and many plugins are already supported; together these make it the best free and open-source option available.

What I have done is to add some SKOS annotation properties to annotate the BIBO classes and properties to help us to edit and comment on the ontology. Here is the list of new annotation properties we introduced:

  • skos:note, is used to write general notes
  • skos:historyNote, is used to write some historical comments
  • skos:scopeNote, is really important. It is the new way to target the classes and properties, imported from external ontologies, that we recommend to use to describe one aspect of BIBO. The scopeNote will tell the users the expected usage for these external resources.
  • skos:example, is used to give some examples that show how to use a given class or property. Think of RDF/XML or RDF/N3 code examples.

Finally, all these annotations are included in BIBO’s namespace.

OWLDoc for Generating Documentation

OWLDoc is a plugin for Protégé that generates documentation for OWL ontologies. In a single click, we can now get the complete documentation of an ontology. This makes the generation of the documentation for an ontology much, much, more efficient. Users can easily see which ontologies are imported, and then they can easily browse the structure of the ontology. Many facets of the ontology can be explored: all the imported ontologies, the classes, the object/data properties, the individuals, etc.

You can have a look at the new documentation page for BIBO here. On the top-left corner you have a list of all imported ontologies. Then you can click on facet links to display related classes, properties or individuals. Then you may read the description of each of these resources, their usage, and their annotations (scope notes, etc.).

Please note there are still some issues and improvements to do with the template used to generate the pages, such as multiple resource descriptions not yet adequately distinguished. We are in the process of cleaning up these minor issues. But, all-in-all, this is a major update to the workflow since any user can easily re-create the documentation pages.

Collaborative Protégé for Community Development

Now that it is available for Protégé 4, we will shortly set up a Protégé server and make it available to the community to support BIBO’s community development. We will shortly announce the availability of this Collaborative Protégé.

In the meantime, I suggest using the file “bibo.xml” from the “trunk” branch of the SVN repository (see Google Code below). The Bibliographic Ontology can easily be opened that way using the “Open…” option to open the local file of the SVN folder, or by using the “Open URI…” option to open the bibo.xml file from the Google Code servers. That way, each modification to the ontology can easily be committed to the SVN instance.

Google Code to Track Development

As noted above, the BIBO Google Code SVN is used to keep track of the evolution of the ontology. All modifications are tracked and can easily be recovered. This is probably one of the most important features for such a collaborative ontology development effort.

But this is not the only use of this SVN repository. In fact, it has an even more central role: it is the SVN repository that sends the description of the ontology for any location query, by any user, for any version. Below we will see the workflow of a user query that leads the SVN repository to send back a description for the ontology.

Google Groups to Discuss Changes

The best tool to discuss ontology development is certainly a mailing list. A Google Group is an easy way to create and manage an ontology development mailing list. It is also a good way to archive and search discussions that have an impact on the development (and the history) of the ontology.

Purl.org to Access the Ontology

Another important piece of the puzzle is to have a permanent URI for an ontology that is hosted by an independent organization. That way, even if anything happens with the ontology development group, hopefully, the URI will remain the same over time.

This is what Purl.org is about. It adds one more step to the querying workflow (as you will notice in the querying schema below), but this additional step is worth it.

General Query Workflow

There is one remaining thing that I have to talk about: the general querying workflow. I have been talking about the new OWL 2 versioning system, purl.org redirection and using the SVN repository to deliver ontology descriptions. So, here is what the workflow looks like:

[click to enlarge this schema]

At the first step, the user requests the rdf+xml serialization of http://purl.org/ontology/bibo/. As we discussed above, this permanent URI is hosted by Purl.org; this service redirects the user to the location of the content negotiation script.

At the second step, the user requests the rdf+xml serialization of the description of the ontology at the URI of the location sent by the Purl.org server: http://conneg.com/script/. One of the challenges we have with this architecture is that neither Purl.org nor Google Code handles content negotiation with a user.

Thus, it is also necessary to create a “middle-man” content negotiation script that performs the content negotiation with the user and redirects the user to the proper file hosted on the SVN repository. (If Purl.org or the SVN repository could handle the content negotiation part of the workflow, we could remove step #2 from the schema above and improve the general architecture.  However, for the present, this step is necessary.)

Note 1: Take a special look at the redirection location sent back by the content negotiation script: http://…/tags/1.2/bibo.xml. This is a direct consequence of the new versioning system, in which the versionURI is http://purl.org/ontology/bibo/1.2/. Given this versioning system, the content negotiation script redirects the user to the description of the latest version of the ontologyURI (currently version 1.2).

Note 2: Purl.org currently doesn’t strictly conform to the TAG resolution on httpRange-14. However, this should be resolved in an upgrade of the Purl.org system that is underway (the current system dates from the early 1990s).

At the third step, the SVN repository returns the document requested by the user with the proper Content-Type.
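
To make the role of the “middle-man” script more concrete, here is a minimal sketch in Python of the decision it has to make. This is not the actual script: the `SVN_BASE` placeholder, the `LATEST_VERSION` constant, the function names and the choice of a 303 status code are all assumptions made for illustration. Only the file name bibo.xml and the tags/1.2 folder layout come from the workflow described above.

```python
# Sketch of a "middle-man" content negotiation step (all names are hypothetical).
SVN_BASE = "http://svn.example.org/bibo"  # placeholder, not the real repository URL
LATEST_VERSION = "1.2"                    # version the ontology URI currently resolves to

# Only the rdf+xml file (bibo.xml) is named in the post; other formats would be added here.
FILES_BY_MIME = {"application/rdf+xml": "bibo.xml"}


def parse_accept(header):
    """Return the MIME types of an Accept header, best quality first."""
    weighted = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        mime, quality = fields[0], 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        if mime and quality > 0:
            weighted.append((quality, mime))
    # sorted() is stable, so types with equal quality keep the client's order
    return [mime for quality, mime in sorted(weighted, key=lambda w: -w[0])]


def negotiate(accept_header):
    """Return (status code, redirect location or None) for a request."""
    for mime in parse_accept(accept_header):
        filename = FILES_BY_MIME.get(mime)
        if filename is not None:
            # Assumed status code: redirect to the file of the latest tagged version
            return 303, f"{SVN_BASE}/tags/{LATEST_VERSION}/{filename}"
    return 406, None  # nothing acceptable to offer


print(negotiate("application/rdf+xml"))
```

The sketch ignores wildcards such as `*/*`, which a real script would have to handle.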

Conclusion

Developing open source ontologies is not an easy task. Development is made difficult by the complexity of some ontologies, by the different ways to describe the same thing and by the level of community involvement needed.  Thus, open source ontology development needs the proper development architecture to succeed.

I have had the good fortune to work on this kind of ontology development with Yves Raimond on the Music Ontology, with Bruce D’Arcus on the Bibliographic Ontology, and with Mike Bergman on UMBEL. Each of these projects has led to an improvement of this architecture. After two years, these are the latest tools and methods I can personally recommend for collectively creating, developing and maintaining ontologies.

UMBEL Web Services Endpoints Released

After some delay, we are pleased to finally release the UMBEL Web services endpoints to the public. We have re-organized the Web services we introduced three months ago to add coherency and flexibility to the model.

The goal remains the same, but with a different flavor: these tools let ontologists and Web developers search, discover and use the UMBEL subject concept and named entity structures. The added flavor is that these Web services now fully embrace the HTTP 1.1 protocol and are provided via a series of well established data and serialization formats.

We now have RESTful Web services to add to our RESTful linked data. Pretty cool combination!

We are introducing two kinds of Web services: (1) atomic Web services and (2) compound Web services. An atomic Web service only performs one action: It takes some inputs and then outputs a resultset of the action. A compound Web service takes multiple atomic Web services, plugs them together in a pipeline model, and then takes some inputs and outputs a resultset arising from the compound action.

The communication between each of these Web service instances and the external World is the same: communication is governed by the HTTP 1.1 protocol. HTTP is generally RESTful and used to establish the communication, to determine mime type and serialization, to get inputs, to return status of the communication and possible errors, and to send back the resultset of the computation of the Web service.

That way, we can easily, within hours, programmatically pipeline these atomic Web services together to create new Web services. We can integrate external Web services endpoints into the same pipeline without modifying anything in the architecture. Status, errors and resultsets are propagated along the line, directly to the data consumer. This is the flexibility part of the story.
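
The pipeline idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the UMBEL implementation: the `Response` class, the two toy atomic services, their data and the `compound` helper are all hypothetical, and the services are plain functions rather than HTTP endpoints.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    status: int                                    # HTTP-style status code
    resultset: dict = field(default_factory=dict)  # payload handed to the next service
    error: str = ""


def inference_lister(inputs):
    """Hypothetical atomic service: one action, answered from toy data."""
    toy = {"sc:Person": ["sc:SocialBeing"]}
    uri = inputs.get("uri")
    if uri not in toy:
        return Response(404, error=f"unknown concept: {uri}")
    return Response(200, {"uri": uri, "broader": toy[uri]})


def labeler(inputs):
    """Hypothetical atomic service: adds a human-readable label."""
    uri = inputs.get("uri", "")
    return Response(200, {**inputs, "label": uri.split(":")[-1]})


def compound(*services):
    """Plug atomic services into a pipeline; stop at the first error."""
    def run(inputs):
        response = Response(200, dict(inputs))
        for service in services:
            response = service(response.resultset)
            if response.status != 200:
                return response  # status and error propagate to the consumer
        return response
    return run


reporter = compound(inference_lister, labeler)
print(reporter({"uri": "sc:Person"}))
print(reporter({"uri": "sc:Unknown"}))
```

Because every service speaks the same "language" (here, the `Response` shape), adding one more service to the pipeline means adding one more argument to `compound`.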

Now, how cool is that?

Overview of the UMBEL Web Services Endpoints

We are today releasing a couple of these atomic and compound Web service endpoints to the public, but others will follow in the coming weeks and months. Four families of Web services have been released that total seven Web service endpoints:

If you don’t know what UMBEL is, I would suggest you read a background information page that talks about the project.

The most important reading related to this blog post is the API philosophy documentation page that talks about the details of the design of this Web services architecture.

For Web developers that want to integrate these Web services endpoints within their application, an API documentation page explains how to communicate with these endpoints for each of the services.

Example of an Atomic Web Service

The Inference: Lister Web service is a good example of an atomic Web service. It takes a subject concept URI as the input and outputs a series of super-class-of, sub-class-of or equivalent-class-of classes for that concept. As an atomic service it does one thing and one thing only: Inferring relationships of a given subject concept URI.

Example of a Compound Web Service

The Reporter: Named Entity Web service is a good example of a compound Web service. This Web service displays the full range of information about a UMBEL named entity URI. However, not all the information returned by this Web service is directly computed by it. In fact, the information about broader and equivalent classes and subject concepts comes from the Inference: Lister Web service. Results coming from that Web service are immediately integrated in the Reporter’s resultset. This is easily done considering that they share the same communication language (HTTP 1.1) and the same data and serialization formats (XML, RDF+XML and RDF+N3). This flexibility is priceless to quickly create resourceful compound Web services.

Conclusion

After some months to get the design right, we have finally released some of the UMBEL Web services to the public. These Web services can easily be integrated in current software architectures to leverage UMBEL’s vision of the World. The architecture underlying what we have released today will help to easily integrate UMBEL’s principles and concepts within new and existing projects. This will ultimately help people to quickly react to the changing World of needs and expectations of data users and consumers.

I hope you will enjoy using these new Web services, which Zitgist is freely hosting. The data you get from the Web service is open data and can be used freely with attribution.

Please do report any issues you may encounter. We also welcome any advice or suggestions that you would care to provide to enhance the overall system.

Exploding DBpedia’s Domain using UMBEL

A couple of challenges I have found with DBpedia are that it is hard for a system to interact with the dataset and hard to figure out how to interpret the information instantiated in it. It is hard to know what properties are used to describe individuals, and hard to know what the classes refer to. It is also hard for standalone and agent software to understand the nature of the individuals that are instantiated by DBpedia because the classes they belong to are generally unknown or poorly defined.

In this blog post I suggest using a method known as “exploding the domain” to try to overcome these difficulties of using and understanding DBpedia. This adds still further usefulness to DBpedia’s considerable value. This demonstration is based on the UMBEL subject concept structure.

As I will demonstrate below, this method consists of contextualizing classes in a coherent framework to explode their domains. By exploding the domain of a class, we link it to other classes that are defined by external ontologies. By exploding the domain of a class by linking it to externally defined classes, we also help standalone and agent software to understand the meaning for that class (at least if they understand the meaning of the classes that have been linked to it). Note that we are able to explode the domains by linking classes using only three properties: rdfs:subClassOf, owl:equivalentClass and umbel:isAligned.

First of all, let me give some background information about how DBpedia individuals and UMBEL named entities have been created, and how both datasets have been linked together.

How DBpedia individuals are instantiated

DBpedia is a dataset that is based on the well known Wikipedia encyclopedia. Basically DBpedia creates one individual for each Wikipedia page. Most of the individuals that are instantiated in this way are what we call a “named entity” in UMBEL’s parlance.

But to be instantiated, an individual has to belong to a class. DBpedia chooses to use Yago’s classification system (that is based on WordNet) to instantiate those DBpedia individuals. This means that all DBpedia individuals belong to at least (theoretically) one Yago class. This means that all DBpedia individuals are instances of Yago classes (and in some rarer cases, they are also instances of classes defined in external ontologies).

How UMBEL named entities have been created

For its part, UMBEL’s named entities dictionaries come from different data sources. Currently, most public UMBEL named entities also come from Yago (example: Aristotle), but many also come from the DBTune dataset (example: Pete Baron) or others. (UMBEL’s design allows more named entities to be plugged into the system as additional dictionaries at will.)

However, unlike DBpedia, we do not use Yago’s classification system to instantiate these named entities. And unlike Yago, we do not use the WordNet classes to instantiate the named entities either.

The current UMBEL subject concept structure is based on OpenCyc. This means that the relations between the classes that instantiate the UMBEL named entities come from the Cyc knowledge base.

So while we use Yago’s named entities (from Wikipedia) as a starting basis, we instantiate them using the UMBEL subject concept classes instead of the WordNet classes. So, basically, we have switched the WordNet conceptual framework for the UMBEL (or OpenCyc) one.

But, how did we create these UMBEL named entities, instantiated using UMBEL subject concept classes and based on Yago? Here is the linkage path:

Yago classes –> WordNet synsets <– Cyc collections <– OpenCyc classes <– UMBEL subject concept classes

Et voilà !

How UMBEL named entities are linked to DBpedia individuals

OK, so now how do we link UMBEL named entities to DBpedia individuals? It is simple. Remember that DBpedia individuals have been created from Wikipedia pages. Also remember that Yago individuals come from the same Wikipedia pages. We can then make the link between the individuals from DBpedia and the individuals from Yago based on Wikipedia URLs.

Exactly the same logic applies for linking DBpedia individuals to UMBEL named entities.

The end result of this linkage is that we have UMBEL named entities that are the same as DBpedia individuals. The difference is that the UMBEL named entities are now instances of UMBEL subject concepts: a totally different conceptual structure.

Remember that these named entities are contextualized in a coherent conceptual framework. And this characteristic means a lot for what is yet to come.

Web services to search and visualize these named entities

We created two new web services on the UMBEL web services home page (the user interface to these web services; the endpoints will be released later) to help people interact with these named entities:

  1. The “Search Named Entities Dictionaries” web service
  2. The “Named Entity Detailed Report” web service

The first web service lets you search amongst all publicly available UMBEL named entities dictionaries.

The second web service lets you visualize detailed information about any named entity.

This information page shows you the full scope of information about a named entity: which class it belongs to (subject concept classes as well as external classes); which other individuals, from other datasets, are identical to them; examples of web services that get queried with information about this named entity; etc.

Exploding the domain of Plato

Now that this background information has been established, let’s take a look at what is happening when we link DBpedia individuals to UMBEL named entities: how that actually works to explode the domain.

Let’s take the example of dbpedia:Plato. This individual is currently defined in DBpedia as:

  • yago:AncientGreekPhysicists
  • yago:PhilosophersOfLanguage
  • yago:PhilosophersOfLaw
  • yago:PoliticalPhilosophers
  • yago:AncientGreekVegetarians
  • yago:AcademicPhilosophers
  • yago:Philosopher110423589

Fine, but what does this mean? What if my system doesn’t know any of these classes? We, as humans, know that Plato is a person, a human being. But it is totally another story for a software agent.

What we want to do here is to explode Plato’s domain to try to find a meaning that my software system can understand.

In UMBEL, the “Plato” named entity is defined as an umbel:Person and an umbel:Intellectual. If you take a look at the detailed report for these two subject concepts, you will be able to see in the section “Broader Subject Concepts” the super-classes that Plato belongs to. So we know that Plato is a social being, a homo sapiens, etc. This is basically what happens with Yago too, except that the conceptual structure (the way to describe the entity) differs.

However one thing that is happening is that we exploded Plato’s domain with classes defined in external ontologies. As you can notice in the sections “Broader External Classes” and “Equivalent External Classes”, Plato is also a: foaf:Person, a foaf:Agent and a cyc:Person.

This means that if my software agent doesn’t know what a “yago:Person100007846” means, it may alternatively know what a foaf:Person or a foaf:Agent means. And if it knows what it means, then it will be able to properly manipulate it: to display it in a special way, to refer to it as a person, and to do whatever it can with information about a “person”.

This explosion of the domain works because these external ontology classes have been referentially linked to a coherent conceptual structure.

The inference path

Let’s take a look at the fundamental reasons why the scenario above works.

First, you, and your system, have to trust the UMBEL named entities dictionaries and the UMBEL subject concept structure to perform the inference that I will explain below. If you and your system trust these linkage assertions, then you will be able to act according to the knowledge that has been inferred.

DBpedia individuals are linked to UMBEL named entities using the owl:sameAs property. This means that DBpedia individual A is identical to (has the same semantic meaning as) the UMBEL named entity B. They both refer to the same individual.

This means that if B is defined as being of rdf:type sc:Person (“sc” stands for Subject Concept), then we can infer that A is defined as being of rdf:type sc:Person too.

If sc:Person is owl:equivalentClass with foaf:Person, we can infer that umbel:B is a foaf:Person, so that dbpedia:A is a foaf:Person too!
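
This inference path can be reproduced with a tiny forward-chaining loop. The Python sketch below is only an illustration: it works on plain tuples instead of a real RDF store, it covers just the two rules used above (types carried across owl:sameAs and across owl:equivalentClass), and the function name `infer_types` is hypothetical.

```python
def infer_types(triples):
    """Forward-chain rdf:type over owl:sameAs and owl:equivalentClass."""
    facts = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        # Both linking properties are symmetric
        for s, p, o in facts:
            if p in ("owl:sameAs", "owl:equivalentClass"):
                new.add((o, p, s))
        for s, p, o in facts:
            if p != "rdf:type":
                continue
            for s2, p2, o2 in facts:
                # Same individual: the type carries over to the other name
                if p2 == "owl:sameAs" and s2 == s:
                    new.add((o2, "rdf:type", o))
                # Equivalent class: the individual is also an instance of it
                if p2 == "owl:equivalentClass" and s2 == o:
                    new.add((s, "rdf:type", o2))
        if not new <= facts:
            facts |= new
            changed = True
    return facts


asserted = {
    ("dbpedia:A", "owl:sameAs", "umbel:B"),
    ("umbel:B", "rdf:type", "sc:Person"),
    ("sc:Person", "owl:equivalentClass", "foaf:Person"),
}
inferred = infer_types(asserted)
print(("dbpedia:A", "rdf:type", "foaf:Person") in inferred)
```

Note that nothing in the asserted triples says that dbpedia:A is a foaf:Person; that statement only appears after the linkage assertions have been followed, which is why trusting those assertions matters.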

We can see similar examples for exploding the domains:

Exploring ConceptualWorks, PeriodicalSeries and NewspaperSeries

In my “UMBEL as a Coherent Framework to Support Ontology Development” blog post from last week, I showed how UMBEL subject concepts acted to create context for linked classes defined in external ontologies. Since DBpedia individuals are instances of classes, and since some of these classes are linked to UMBEL, these subject concept classes also give context to those individuals!

As some examples, go ahead and take a look at the “Named Entities for …” section of these detailed report pages:

The partial list of named entities that are returned by the detailed report viewer shows named entities that mainly come from Wikipedia (and so have links to DBpedia). These subject concepts give a coherent context to those DBpedia individuals.

You should quickly notice, for example, that dbpedia:Kansas_City_Times is not only a sc:NewspaperSeries, a sc:PeriodicalSeries and a sc:ConceptualWork; it is also a frbr:Work, a bibo:Periodical and a bibo:Newspaper.

The context created by these UMBEL subject concepts gives not only new power to linked external classes, but also to their instances, such as these DBpedia individuals!

Conclusion

Contexts created by UMBEL subject concepts emerge by the power of linkage that exists between all the subject concepts, and the linkage between those subject concepts classes with classes defined in external ontologies. These contexts are consistent because of the coherence of the structure that is powered by OpenCyc (Cyc).

So far, most Linked Data has been about the “things” or named entities of the world, organized according to either Wikipedia categories or WordNet. These structures may have some internal structural consistency, but were never designed to play the role as a coherent reference framework. The coherence of UMBEL (based on the coherence of Cyc) is a powerful contextual lever for bringing order to this chaos.

Once information gets linked to a coherent framework such as UMBEL, things start to happen; powerful things. And, with each new linkage and relation to additional external ontologies, that power increases exponentially.

I wrote this blog post to show again the power of exploding the domain using DBpedia as an example, and how UMBEL can help to use and to leverage such big datasets.

UMBEL as a Coherent Framework to Support Ontology Development

There are multiple ways to represent the World we live in. One person will think about something in one way, while the person next to them will think about the same thing in another. They will think about it in different ways: different characteristics, different ways to interact with it, different ways to use it, different ways to think about its composition, its relations with other things, and so on.

What is nice is that probably all of these different ways to think about this thing are good: after all, there are many ways to think about the same thing. It is this characteristic of thinking about things in different ways that leads to innovation.

But innovation is also not a game where anything goes. Things that work in the real world and in real ways need to adhere to certain rules, concepts, principles and theories. Continued innovation requires working within these coherent frameworks of natural relationships and order.

So, while a beautiful thing is that we can create new frameworks to think about things differently, not all of those frameworks work as well as others or make sense.

While one could conceivably propose any new framework or way of thinking about things differently, frameworks that are actually useful should, among other things:

  1. Make sure the development of innovations within the framework is coherent
  2. Make sure the development of innovations within the framework is in context
  3. Help coordinate the development of projects and the cooperation of agents that work on these projects in order to achieve (1) and (2).

What seems clear to me is that the lack of any of (1), (2) or (3) makes innovations difficult and/or less powerful and less useful.

Why Would the Development Of Ontologies be Different?

The Semantic Web is often seen as a place where people describe things in multiple ways and where these things are more or less magically related together. For example, if you can’t properly describe something, you only have to create a new ontology, or to extend an existing one, and to publish it, et voilà!

The more I work in this field, the less I believe in this.

Remember my first point? People tend to think about things in different ways. The same logic applies to the development of ontologies (particularly in the development of ontologies!). Two ontologies, intended to describe the same things, can describe them in totally different ways. So, while some of the magic is that both ontologies can perfectly describe these things but only in different ways, there are other aspects that are not magical at all.

The problem here is to have at least one framework that helps people to develop ontologies such that the:

  1. Developed ontologies remain coherent
  2. Developed ontologies are in context
  3. Coordination of the development of ontologies and the cooperation of the agents working on these ontology projects is effective enough to achieve goals (1) and (2).

This construct looks familiar, doesn’t it?

What I am proposing here is to use UMBEL as a coherent framework for ontology development. I am not saying that other frameworks can not play a guiding role in ontology development. But I am saying two things. First, some form of reference framework is necessary. And, second, truly useful frameworks must also be consistent and coherent.

What I am stressing here is the importance of conceptual frameworks to develop ontologies that can be used by people, companies and systems to properly and efficiently exchange data; and at some level, to reason over this data, too.

I think that the only way to do this in an efficient way is by grounding ontologies in such conceptual frameworks.

The ultimate goal is to make data exchange and data reasoning effective for the people, organizations and systems that consume this sea of data. And I believe that this is not possible to achieve without grounding these efforts in a coherent, conceptual framework.

An Example at Work

Nothing is better than an example to show the potential of UMBEL as a coherent framework to develop, and cross-link, ontologies.

Let’s take the Bibliographic Ontology as an example, which we just cross-linked to UMBEL in yesterday’s version 071 release. (Among a dozen other key ontologies; the list is getting pretty cool!)

The goal is to link BIBO classes to UMBEL subject concepts. The linkage is done using three properties: owl:equivalentClass, rdfs:subClassOf and umbel:isAligned.

But firstly, what is the goal here? We try to do two things when linking such ontologies to the UMBEL framework:

  1. To make sure the ontology (BIBO) is coherent and consistent with other existing ontologies that are linked to the framework (other such ontologies could be FOAF, SIOC, etc.)
  2. To make sure that the design choices of the developed ontology are consistent with the design choices of the framework, and the other ontologies that are linked to that framework.

Both points try to help achieve a grander vision: trying to make the semantic Web a little bit more coherent and easy to use and understand.

The BIBO Linkage

This figure shows how BIBO classes have been linked to UMBEL subject concepts in a set-like schema (click to enlarge the schema):

This schema shows what set belongs to what other set. That way, we can quickly notice that bibo:Patent is equivalent to umbel:Patent. We can also see that both classes belong to (are sub-classes of) bibo:Document, umbel:PropositionalConceptualWork and umbel:ConceptualWork, etc.

We have to keep one thing in mind that we made clear in the UMBEL technical documentation: UMBEL has its own view of the World. UMBEL’s subject concept structure is its view of the World. So these linkages are consistent within the UMBEL framework. Now, let’s continue.

The Context

Remember the three points above? What we have done here is to put BIBO in context. The context is created by the UMBEL conceptual framework. Once this is done, we can check for the coherence between BIBO, UMBEL and all the other ontologies that are linked to the framework.

The figure below shows the context created by UMBEL for BIBO, FOAF and SIOC (click to enlarge the schema):

Considering the current description of these three ontologies, we know that bibo:Document is equivalent to foaf:Document. But there exists no relationship between these two classes and sioc:Item and sioc:Post.

Intuitively we know that there are some relationships between all these classes (at least based on their labels). We also have to keep in mind that the absence of a description (in RDF) does not mean that the relationship doesn’t exist (this is the open world assumption).

That being said, the figure above shows how UMBEL can help us to find such “non-described” relationships between classes of different ontologies. By contextualizing these three ontologies we now find that all these classes are sub-classes of umbel:ConceptualWork. We also know that some sioc:Post things belong to umbel:PropositionalConceptualWork (things written), just like some bibo:Document and foaf:Document things.

This means that this linkage — this contextualization — of external ontologies now gives us a common ground to play with: umbel:ConceptualWork. By querying this subject concept we can come up with a full range of related things: BIBO, SIOC and FOAF stuff.
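
Finding that common ground amounts to intersecting the super-classes of each class. Here is a minimal Python sketch of the idea. The two toy graphs are simplified assumptions based on the relations discussed above (they are not the real ontology files), and the helper names `ancestors` and `common_ground` are hypothetical.

```python
# Relations before linking to UMBEL: only bibo:Document = foaf:Document is known.
# (An equivalence is modeled here as a sub-class link in both directions.)
BEFORE = {
    "bibo:Document": {"foaf:Document"},
    "foaf:Document": {"bibo:Document"},
}

# Simplified relations after linking: every class is under umbel:ConceptualWork.
AFTER = {
    "bibo:Document": {"foaf:Document", "umbel:ConceptualWork"},
    "foaf:Document": {"bibo:Document", "umbel:ConceptualWork"},
    "sioc:Item": {"umbel:ConceptualWork"},
    "sioc:Post": {"umbel:ConceptualWork"},
}


def ancestors(cls, graph):
    """All classes reachable by following sub-class links upward."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in graph.get(stack.pop(), ()):
            if parent not in seen:  # also protects against equivalence cycles
                seen.add(parent)
                stack.append(parent)
    return seen


def common_ground(classes, graph):
    """Super-classes shared by every class in the list."""
    reachable = [ancestors(c, graph) | {c} for c in classes]
    return set.intersection(*reachable) - set(classes)


classes = ["bibo:Document", "foaf:Document", "sioc:Item", "sioc:Post"]
print(common_ground(classes, BEFORE))
print(common_ground(classes, AFTER))
```

With the first graph the four classes share nothing; with the second one they all meet at umbel:ConceptualWork, which is the subject concept to query.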

For example, take a look at the section “Narrower External Classes” of the umbel:ConceptualWork detailed report and extend the list of external classes (click on the All Classes . . . link). All these things are conceptual works. UMBEL makes this fact explicit even if these ontologies describe few or no relations to the other ontologies. Also take a look at the list for umbel:PropositionalConceptualWork.

This also shows the coherence of the design of each ontology.

The Coherence

So, once we have the context in place, we are on our way to achieve coherence. UMBEL is 100% based on OpenCyc and Cyc, which are internally consistent and coherent within themselves. We thus use these coherent frameworks to make the mappings to external ontologies coherent, too.

The equation is simple:

“a coherent framework” + “ontologies contextualized by this framework” = “more coherent ontologies”

This context and this coherence help us to develop ontologies in two ways:

  1. It helps us to make sure the design of an ontology is good
  2. It helps us to make sure the designed ontology is coherent with other existing external ontologies

For example, when I linked BIBO classes to UMBEL subject concept classes, I found that a bibo:Series was a sub-class of umbel:ConceptualWorkSeries. Then I found that bibo:Periodical was the same thing as a umbel:PeriodicalSeries. However I had an issue: a bibo:Series was a sub-class of bibo:Collection and bibo:Periodical was also a sub-class of bibo:Collection. Then I found that umbel:PeriodicalSeries was a sub-class of umbel:ConceptualWorkSeries. Then the question arose: why is bibo:Periodical not a sub-class of bibo:Series instead of bibo:Collection? This is what I will propose for the next iteration of BIBO.

Now, what about this helping to increase the coherence between external ontologies?

One good example I have is related to SIOC and FOAF. When I linked SIOC to UMBEL, Kingsley asked me why I didn’t link sioc:Item. My answer was simple: I can’t do this, since making this linkage would disturb the coherence of UMBEL. The problem was that sioc:Item was a sub-class of foaf:Document. Considering sioc:Item’s definition, and foaf:Document’s definition and linkage to UMBEL, linking sioc:Item to UMBEL would create some incoherence in the framework because of its relationship with foaf:Document.

From this discussion with Kingsley, this thread appeared on the SIOC mailing list, and the link from sioc:Item to foaf:Document has been removed.

These are the two general cases where UMBEL, as a coherent framework, can help the development of ontologies.

So, by achieving points (1) and (2), we are on the way to achieving point (3): coordinating the development of ontologies and the cooperation of the agents working on these ontology projects effectively enough to achieve goals (1) and (2).

The Final Mapped Relations

So, after application of this process and thinking, here are the UMBEL-BIBO mappings:

You can look at Appendix A to the UMBEL technical document (PDF or online); additionally you will see similar mappings for the existing dozen or so ontologies presently mapped to UMBEL. In combination, these give us the ability to Explode the Domain!

Descriptive Subject Concepts: Icing on the Cake

All of the description above relates to the mapping between the BIBO and UMBEL ontologies (and therefore other external ones). But, of course, we also now have the full scope of UMBEL subject concepts that we can also now apply to describe what the actual BIBO citations are about.

So, while we have structural ontology relationships that can be leveraged, we also now have a common vocabulary to describe the subject matter of what these citations are about. Use of these UMBEL subject concepts now allows us to cluster and retrieve citations by subject matter.

In this manner, UMBEL becomes a consistent tagging vocabulary for describing what citations and references are about. Want everything about weaving or galaxies or opera or anything, for example? Simply characterize your citations by appropriate UMBEL subjects and then use them as part of your retrieval filters.
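
As a minimal sketch of such a retrieval filter, assuming citations are simple records tagged with subject concept identifiers (the citation titles, the `umbel:` identifiers and the `about` function below are made up for illustration):

```python
# Hypothetical citations, each characterized by a set of subject concepts
CITATIONS = [
    {"title": "Citation 1", "subjects": {"umbel:Weaving"}},
    {"title": "Citation 2", "subjects": {"umbel:Galaxy", "umbel:Opera"}},
    {"title": "Citation 3", "subjects": {"umbel:Opera"}},
]


def about(citations, subject):
    """Titles of the citations tagged with the given subject concept."""
    return [c["title"] for c in citations if subject in c["subjects"]]


print(about(CITATIONS, "umbel:Opera"))
```

The point of a shared vocabulary is that the same filter works for any collection of citations, as long as they were all tagged with the same subject concepts.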

This makes clear that UMBEL is some kind of Hydra: it can be used as a conceptual framework to help make ontologies (vocabularies) coherent and consistent, and at the same time, it can act as a conceptual description framework that describes the “matter” of things. This means that a subject concept can describe the “nature” of a thing and the “matter” of another thing at the same time.

Conclusion

UMBEL is becoming a wonderful tool that can be used in many ways. It is a vocabulary that is instantiated in a subject concept structure. It can be used not only to categorize things and to help find things, but also to define things, and to develop ontologies that define other things. We are on our way to achieve these three goals:

  1. Develop ontologies that are in context
  2. Develop ontologies that remain coherent
  3. Coordinate the development of ontologies and the cooperation of the agents working on these ontology projects sufficiently to achieve goals (1) and (2).

As usual, I’d like to thank my UMBEL co-editor and colleague, Mike Bergman, for his discussions and assistance on this material.

UMBEL version 071 Released

We have just released a new version of UMBEL (v 071).  This new version is based on a new version of OpenCyc that has been updated with the latest knowledge base version 5014. This is the latest version of OpenCyc they released after we met Cycorp and the Cyc Foundation a couple of weeks ago in Austin. In the meantime we also fixed some things and enhanced the UMBEL concept structure.

Here is the list of changes and fixes:

  • The UMBEL subject and abstract concept structure is based on OpenCyc kb5014
  • The UMBEL namespaces changed
  • UMBEL subject concepts now link to OpenCyc classes and individuals
  • The UMBEL generation scripts now use the OpenCyc external IDs
  • Duplicated lines in the file umbel_cytoscape_vXYZ.csv have been removed
  • The linkage of BIBO to UMBEL has been completed
  • The linkage of FOAF and SIOC to UMBEL has been revised
  • The encoding of the character “%” in the named entities dictionaries N3 files has been fixed
  • The UMBEL technical documentation has been updated according to this list of changes.

Now let’s talk about some major changes in this new release.

New UMBEL namespaces

We changed the UMBEL namespace URIs to be more consistent moving forward. Here is the fuller rationale:

“Here are the URIs of the namespaces used to describe the UMBEL Ontology, the subject concepts structure, the named entities defined in UMBEL and the semsets for both the subject concept classes and named entities.

The folder structure of these classes of URIs has been generalized to meet the design goals of using UMBEL with domain extensions. The portion “/umbel/” in the URIs is a placeholder for the name of these extensions. Each extension, including UMBEL itself, will share the same identification structure. An example for a ‘Foo’ domain ontology at an alternative example.com domain using the “/foo/” folder extension is shown in the table below.

The UMBEL Ontology vocabulary URI uses a “hash URI” for convenience purposes. This facilitates the retrieval of the document of the descriptions of the vocabulary for tools that consume such documents. However considering the size of the subject and abstract concepts descriptions files, the named entities and semset files, we choose to use “slash URIs” so that consumer tools do not have to download the description of all subject and abstract concepts, named entities and semsets descriptions when they request the description of one of these resources.”

The new namespaces are defined as:

Name                                              Abbreviation   URI
------------------------------------------------  -------------  ----------------------------------
UMBEL Ontology                                    umbel:         http://umbel.org/umbel#
Subject Concepts                                  sc:            http://umbel.org/umbel/sc/
Abstract Concepts                                 ac:            http://umbel.org/umbel/ac/
Named Entities                                    ne:            http://umbel.org/umbel/ne/
Semsets                                           semset-xyz     http://umbel.org/umbel/semset/xyz/
Example, English semset                           semset-en      http://umbel.org/umbel/semset/en/
FOO Ontology (a domain ontology based on UMBEL)   foo:           http://example.com/foo#

We now consider these new URIs as “frozen”. So please update your application with these new URIs.
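To make the shared identification structure a bit more concrete, here is a small Python sketch. It assumes that a domain extension simply swaps the host and the “/umbel/” folder while keeping the same sub-folders; only the foo: ontology URI is actually given in the table, so the other “Foo” URIs are an extrapolation. The function names are hypothetical. The second function illustrates the hash versus slash distinction: for a hash URI a client fetches the whole vocabulary document, while for a slash URI it fetches the resource’s own URI.

```python
from urllib.parse import urldefrag


def umbel_style_namespaces(base: str, folder: str) -> dict:
    """Build the namespace URIs for UMBEL or for a domain extension (assumed layout)."""
    root = f"{base.rstrip('/')}/{folder}"
    return {
        "ontology": root + "#",            # hash URI for the vocabulary
        "sc": root + "/sc/",               # slash URIs for the large structures
        "ac": root + "/ac/",
        "ne": root + "/ne/",
        "semset-en": root + "/semset/en/",
    }


def document_to_fetch(uri: str) -> str:
    """Return the URL a client actually requests: the URI minus any fragment."""
    document, _fragment = urldefrag(uri)
    return document


print(umbel_style_namespaces("http://umbel.org", "umbel")["sc"])
print(umbel_style_namespaces("http://example.com", "foo")["ontology"])
print(document_to_fetch("http://umbel.org/umbel#isAbout"))
print(document_to_fetch("http://umbel.org/umbel/sc/Project"))
```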

UMBEL subject concepts that link to classes and individuals

In some edge cases, UMBEL considers that an OpenCyc individual is a subject concept or an abstract concept. This means that not only OpenCyc classes can be selected to be UMBEL subject concepts, but OpenCyc individuals can be as well. The definitions of UMBEL subject concepts, abstract concepts and named entities guide how the corresponding OpenCyc collection (“class”) or individual is treated. If an UMBEL subject concept is related to an OpenCyc collection (“class”), then the linkage between these two resources is made with the property owl:equivalentClass. If an UMBEL subject concept is related to an OpenCyc individual, then the linkage between these two resources is made with the property owl:sameAs. See volume 2 for what we consider to be subject concepts, abstract concepts and named entities.
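The linking rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the actual UMBEL generation script; the function names and the example.org OpenCyc URIs are placeholders.

```python
OWL = "http://www.w3.org/2002/07/owl#"


def linking_property(opencyc_kind: str) -> str:
    """Pick the OWL property that links an UMBEL subject concept to OpenCyc."""
    if opencyc_kind == "collection":      # an OpenCyc collection ("class")
        return OWL + "equivalentClass"
    if opencyc_kind == "individual":      # an OpenCyc individual
        return OWL + "sameAs"
    raise ValueError(f"unknown OpenCyc kind: {opencyc_kind!r}")


def link_triple(umbel_uri: str, opencyc_uri: str, opencyc_kind: str) -> str:
    """Serialize the link as one N-Triples style statement."""
    return f"<{umbel_uri}> <{linking_property(opencyc_kind)}> <{opencyc_uri}> ."


# Placeholder OpenCyc URI, for illustration only.
print(link_triple("http://umbel.org/umbel/sc/Business",
                  "http://example.org/opencyc/Business", "collection"))
```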

Use of OpenCyc classes’ external IDs

UMBEL subject and abstract concepts names have been used for convenience only. When a new version of UMBEL is created, the “external IDs” of the OpenCyc classes are used to link these classes to UMBEL subject and abstract concepts. That way, if their naming conventions change from an OpenCyc version A to a version B, then we are still able to update the proper UMBEL concepts according to their new OpenCyc definitions. Note that the OpenCyc external IDs are only used when we create a new version of UMBEL. Otherwise the URIs of the UMBEL subject and abstract concepts use the “human readable” labels to refer to the concepts.
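Here is a minimal sketch of why keying on external IDs survives a rename between OpenCyc versions. The IDs (“ext-001” and so on) and the class names are invented for the example; real OpenCyc external IDs look different, and the real generation scripts surely do more than this.

```python
from typing import Dict, List, Tuple


def refresh_links(
    umbel_links: Dict[str, str],      # UMBEL concept name -> OpenCyc external ID
    opencyc_names: Dict[str, str],    # external ID -> class name in the new OpenCyc version
) -> Tuple[Dict[str, str], List[str]]:
    """Re-resolve every UMBEL concept against a new OpenCyc release.

    Returns the concept -> current OpenCyc class name mapping, plus the
    concepts whose external ID no longer exists in the new release.
    """
    resolved: Dict[str, str] = {}
    missing: List[str] = []
    for concept, external_id in umbel_links.items():
        if external_id in opencyc_names:
            resolved[concept] = opencyc_names[external_id]
        else:
            missing.append(concept)
    return resolved, sorted(missing)


links = {"Business": "ext-001", "Project": "ext-002"}
version_b = {"ext-001": "CommercialBusiness", "ext-002": "Project"}  # ext-001 was renamed
print(refresh_links(links, version_b))
```

Because the lookup goes through the ID, the renamed class is still matched to the same UMBEL concept, while the UMBEL URI itself keeps its human readable label.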

Linkage between OpenCyc and UMBEL

We have to note that OpenCyc added linkage from the OpenCyc classes to the UMBEL subject concepts classes. This means that if someone dereferences OpenCyc classes URIs, they will have a reference to UMBEL subject concept classes via the property owl:sameAs.

Still to come

While much progress has been made in this new version 071, there are some pending issues and tasks not in the current release:

  • Complete Web service and endpoints release (forthcoming in a few days)
  • Re-inclusion of company provinces, states and territories
  • Automatic instance checks to ensure better coverage of more specific concepts in the ontology.

We are continuing to work out test and automation procedures with Cycorp and will incorporate these improvements as well in subsequent releases.

Conclusion

This new release is one more step in the right direction. UMBEL is getting more and more stable. Its relation to OpenCyc is stronger and stronger. And its linkage to external ontologies is bigger and bigger. Please report any issues, comments or suggestions on the mailing list.

Starting to Play with the UMBEL Ontology

I am really proud to announce the first public release of the UMBEL Ontology and its subject structure after one year of hard work with Mike.

As UMBEL is introduced in the UMBEL Technical Documentation:

“UMBEL (Upper-level Mapping and Binding Exchange Layer) is a lightweight ontology for relating external ontologies and their classes to UMBEL subject concepts. UMBEL subject concepts are conceptually related together using the SKOS and the OWL-Full ontologies. They form a structural ‘backbone’ comprised of subject concepts and their semantic relationships. By linking external ontologies to this conceptual structure, we explode the domain of the linked classes by leveraging this conceptual structure.

UMBEL defines “subject concepts” as a distinct subset of the more broadly understood concept such as used in the SKOS/OWL-Full controlled vocabulary, conceptual graphs, formal concept analysis or the very general concepts common to many upper ontologies. We define subject concepts as a special kind of concept: namely, ones that are concrete, subject-related and non-abstract.

UMBEL contrasts subject concepts with abstract concepts and with named entities. Abstract concepts represent abstract or ephemeral notions such as truth, beauty, evil or justice, or are thought constructs useful to organizing or categorizing things but are not readily seen in the experiential world. Named entities are the real things or instances in the world that are themselves natural and notable class members of subject concepts. More detailed distinctions are provided under Terminology and Definitions below.”

Mike Bergman wrote a really good introduction blog post about UMBEL that lists all the supporting material and services that exist to get started with UMBEL.

In this blog post I will write about one example that shows how to leverage UMBEL in two different ways: (1) how to use UMBEL to “explode the domain” of an existing ontology and (2) how to use UMBEL when an ontology doesn’t exist to describe a certain domain. I will also write other blog posts in the coming days to show more ways to leverage UMBEL in different settings and how to use it to solve other kinds of real world problems.

Some of this new material will begin to hint at Zitgist’s own plans for using UMBEL.

Linking FOAF to UMBEL to explode its domain

How many times have people tried to use FOAF to describe organizational entities? In the end, everything ends up being assigned to foaf:Organization. A company, an NGO, or any other kind of organization all become foaf:Organization(s) or foaf:Group(s). In most cases the result is unsatisfactory because everything ends up with the same “classification”.

But I don’t want to describe a business as an “Organization”, or an NGO as another “Organization”. They are two quite different concepts, even if the upper concept that links them is an “Organization”. However, there are no ontologies (that I know of) that describe businesses and NGOs, and FOAF is not expressive enough to make that distinction. Is it FOAF’s goal to be that expressive? Possibly; but not in its current state. So what we want here is to extend it: to explode its domain!

And that is what we will do with UMBEL.

The goal is to link FOAF classes to UMBEL subject concepts so that we can extend FOAF’s classes with more general and more specific concepts such as Business and NGO.

If you take a look at how the FOAF ontology has been linked to UMBEL, you will notice that a foaf:Organization is equivalent to an sc:Organization. Note: the linkage of external ontologies classes is consistent within UMBEL. It is UMBEL’s view of the World.

Let’s take an example to show what I mean. What I want is to describe the Zitgist LLC business; to describe it as a business, and not an organization. However I want to be able to re-use properties described in other ontologies to describe this business. So, here is an example of how I can describe this company using a UMBEL subject concept and properties from external ontologies:

    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix bio:  <http://purl.org/vocab/bio/0.1/> .
    @prefix geo:  <http://www.w3.org/2003/01/geo/wgs84_pos#> .
    @prefix sc:   <http://umbel.org/ns/sc/> .

    <http://zitgist.com/about/> a sc:Business ;
        foaf:name "Zitgist LLC." ;
        foaf:birthday "2006-10-20" ;
        foaf:logo <http://zitgist.com/imgs/zitgistlogo2_110_55.gif> ;
        foaf:fundedBy <http://www.openlinksw.com> ;
        bio:olb """Zitgist provides quality Linked Data products and services. Linked Data is based on open standards to interconnect any form of relevant information on demand and in context. Zitgist’s capabilities range from the consumer Web plug-in zLinks to enterprise linked data transformation and deployment. Our expertise spans from data, standards and protocols to tools, user interface design, and scalable architectures. Zitgist innovation helps make the connections that matter. Let us show you how our approach to Linked Data can bring the power of the network effect to your data assets and global information."""@en ;
        # The location is a blank node typed as a geo:Point.
        foaf:based_near [ a geo:Point ; geo:lat "42.455" ; geo:long "-71.218" ] ;
        foaf:homepage <http://zitgist.com> ;
        foaf:made <http://umbel.org/about/> ;
        foaf:made <http://browser.zitgist.com/about/> ;
        foaf:made <http://pingthesemanticweb.com/about/> ;
        foaf:made <http://musicontology.com/about/> ;
        foaf:made <http://bibliontology.com/about/> ;
        foaf:made <http://talkdigger.com/about/> .

As you can see in this example, Zitgist is defined as a sc:Business. You are probably wondering what a sc:Business is. Let’s take a look at the subject concept’s detailed report: sc:Business.

The next question is: why can I use all these properties to describe a sc:Business? The quick answer is because foaf:Organization is linked (equivalent to) sc:Organization and that sc:Business is a sub class of sc:Organization. You can read the proof here; and check the figure below that shows the inference path that leads us to this result.

(Note: this is what we refer to: exploding the domain of FOAF)
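The inference path described above can be approximated with a tiny hand-rolled closure in Python. This is a sketch of the reasoning only: it hard-codes the two facts stated in the text and treats class equivalence as symmetric, whereas a real setup would rely on an RDFS/OWL reasoner. The function and variable names are hypothetical.

```python
from typing import Dict, Set, Tuple

# The two facts stated in the text.
SUBCLASS_OF: Dict[str, Set[str]] = {"sc:Business": {"sc:Organization"}}
EQUIVALENT: Set[Tuple[str, str]] = {("foaf:Organization", "sc:Organization")}


def superclasses(cls: str,
                 subclass_of: Dict[str, Set[str]],
                 equivalent: Set[Tuple[str, str]]) -> Set[str]:
    """Collect every class reachable through subClassOf and equivalentClass links."""
    seen: Set[str] = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        neighbours = set(subclass_of.get(current, ()))
        for a, b in equivalent:           # equivalence works in both directions
            if a == current:
                neighbours.add(b)
            if b == current:
                neighbours.add(a)
        for n in neighbours:
            if n != cls and n not in seen:
                seen.add(n)
                stack.append(n)
    return seen


print(sorted(superclasses("sc:Business", SUBCLASS_OF, EQUIVALENT)))
```

Since foaf:Organization ends up in the result, any property whose domain is foaf:Organization can be used to describe a sc:Business.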

Analyzing a SC with the Detailed Report

The Detailed Report web service tool helps users to check which external class is linked to which subject concept and the nature of the linkage. Additionally it helps people to know what properties can be re-used to describe an individual of that class. Here is a quick overview of what information can be accessed when using this detailed report tool. Let’s take the sc:Business detailed report page:

Named Entities

The Named Entities section lists a couple of named entities that belong to this subject concept class. These are direct, or inferred, Yago named entities that belong to this subject concept.

More General External Classes

The More General External Classes section lists the external super-classes linked to this subject concept. So we can quickly notice that a sc:Business is a foaf:Organization, a foaf:Group and a foaf:Agent.

In-domain-of and In-range-of

The in-domain-of and in-range-of sections list the properties, defined in some external ontologies, that can be used to describe that subject concept. So most of the properties that I used to describe the Zitgist business above should appear in this list, unless their ontology hasn’t yet been linked to UMBEL (a dozen ontologies already are, as shown in Appendix A of the main technical document).

More General and Specific Subject Concepts

The More General Subject Concepts and the More Specific Subject Concepts sections list the super-concepts and the sub-concepts of the current subject concept (sc:Business in that case). So, we can use UMBEL to describe an even more specific kind of business, for example: an Airline Company. Or we can use UMBEL to describe a more general kind of business: a Commercial Organization.

Finally, this Detailed Report Web Service helps people to put a given subject concept into context: how it is related to external ontology classes; which properties can be used to describe individuals of the concept; and how it is related to other subject concepts. It is the tool to answer these questions.

Conclusion

In this blog post we saw how external ontology classes can be linked to UMBEL to explode their domain, that is, to enhance their expressiveness. Additionally we saw how to use UMBEL web services to analyze a subject concept and to see its relations with other subject concepts, external classes and properties.

However this is just the beginning of our exploration of UMBEL. Many things are waiting for us around the corner. I am starting to write a series of blog posts that will show you different uses and characteristics of UMBEL. All of them will be explained using real world use cases and challenges. We will see how named entities are related to UMBEL subject concepts. We will see how named entities data sources such as Yago and the John Peel Sessions have been linked to UMBEL. We will see how the UMBEL Vocabulary can help people describe three kinds of subject relationships: between an RDFS class and a subject concept (using umbel:isAligned and owl:equivalentClass); between a named entity and a subject concept (using umbel:isAbout and umbel:linksEntity); and between two named entities (using umbel:isLike and owl:sameAs).

As you can see, this is just the beginning. In the meantime you can read the technical documentation to get a better understanding of UMBEL. Additionally, you can read all the volumes that have been written to explain UMBEL’s evolution and the steps that led to the creation of this first public release of the ontology.

Finally, you can now start using UMBEL in your own applications. I suggest you revisit the UMBEL web services by reading my previous blog post: Exploding the Domain: UMBEL Web Services by Zitgist. I also suggest you try to dereference subject concept URIs such as http://umbel.org/ns/sc/Project and http://umbel.org/ns/sc/Organization. All of the UMBEL Vocabulary’s classes and properties are dereferenceable. All UMBEL named entities are also dereferenceable, along with all subject and abstract concepts.
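For readers who want to dereference these URIs from a script rather than a browser, here is a minimal sketch using only the Python standard library. It builds the HTTP request with an Accept header asking for RDF/XML; that the server honours this kind of content negotiation is an assumption on my part, and the helper name rdf_request is hypothetical.

```python
from urllib.request import Request


def rdf_request(uri: str, media_type: str = "application/rdf+xml") -> Request:
    """Build an HTTP GET request that asks for an RDF description of the URI."""
    return Request(uri, headers={"Accept": media_type})


request = rdf_request("http://umbel.org/ns/sc/Project")
print(request.full_url, request.get_header("Accept"))
```

Passing the request to urllib.request.urlopen would then perform the actual lookup.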

Enjoy!

Exploding the Domain: UMBEL Web Services by Zitgist

I am pleased to announce the first phase of the public release of the UMBEL Web Services by Zitgist. This first release consists of a series of user interfaces in front of several UMBEL web services.

This blog post shows and explains what these web services are about and how people will be able to use them to leverage UMBEL to create new ontologies, to instantiate new data sets and to interlink external ontologies to explode their domains.

Background

For the last four to six months we have been in the process of creating the UMBEL ontology. We have been doing research to find the best basis datasets; we have been cleaning these datasets for UMBEL’s purposes; and we have been developing the ontology and its principles. Starting today, we begin the release process for UMBEL:

  1. UMBEL web services’ user interfaces
  2. UMBEL ontology (OWL-Full)
  3. UMBEL ontology technical documentation
  4. UMBEL subject concepts’ structure (SKOS + OWL-Full) & named entities instantiation
  5. UMBEL web services endpoints.

UMBEL Ontology & Subject Concept Structure

Before starting to show and explain the UMBEL web services’ user interfaces, I have to give some background information about the UMBEL ontology’s principles, and how the subject concept structure has been created. All this information will be discussed and explained at length in the UMBEL ontology technical documentation that is about to be published; but I have to give some technical background information in order to explain what these web services are about.

As described by Mike, UMBEL’s purposes are:

“[...] to provide a lightweight structure of subject concepts as a reference to what Web content or data “is about”, what is called a concept schema in SKOS [...]

Think of the backbone as a set of roadsigns to help find related content. UMBEL is like a map of an interstate highway system, a way of getting from one big place to another. Once in the right vicinity, other maps (or ontologies), more akin to detailed street maps, are then necessary to get to specific locations or street addresses.

By definition, these more fine-grained maps are beyond UMBEL’s scope. But UMBEL can help provide the context for placing such detailed maps in relation to one another and in relation to the Big Picture of what related content is about.

These subject concepts also provide the mapping points for the many, many thousands (indeed, millions) of specific named entities that are the notable instances of these subject concepts. Examples might include the names of specific physicists, cities in a country, or a listing of financial stock exchanges. UMBEL mappings enable us to link a given named entity to the various subject classes of which it is a member.

And, because of relationships amongst subject concepts in the backbone, we can also relate that entity to other related entities and concepts. The UMBEL backbone traces the major pathways through the content graph of the Web. For some visualizations of this subject graph, see So, What Might The Web’s Subject Backbone Look Like?”

A four-article introduction to UMBEL can be read on Mike’s blog.

UMBEL is a structure of 21 000 subject concepts derived from the OpenCyc ontology. The structure is described in SKOS and OWL-Full. Each concept is an individual of the skos:Concept class and is itself an OWL class. This dichotomy is the basis of UMBEL. Since the subject concepts are classes, this means that we can relate them to external ontology classes using properties such as rdfs:subClassOf and owl:equivalentClass.

So what does all of this mean? It means that once the linkages between UMBEL subject concepts and external ontologies classes are made, the following becomes possible: 1) the UMBEL subject concept structure can be used to describe (instantiate) things using the UMBEL data structure; 2) external ontology properties can be re-used to describe these new instances since external ontologies classes are linked to UMBEL subject concept classes; and 3) in some cases, the properties defined in these ontologies can be used in relation with UMBEL subject concept classes. The forthcoming technical documentation about this stuff will provide more detailed explanation. For the moment, just accept these assertions as being true.

The UMBEL web services (user interfaces) have been created to help people to manage these relationships between UMBEL subject concepts classes and external ontology classes. People will use the services to infer facts from the structure of the subject concepts, to check if a class is a sub-class, a super-class or an equivalent class of another class. They will also use the services to see what properties, defined in external ontologies, can be re-used, and on which subject concept.

Let the show begin!

UMBEL Web Services Index Page

The entry page lists all the available web services. For each web service, you have a link to the web service user interface, a link to an about page explaining the basis of the web service, and a link to the technical documentation of the web service endpoint: how to communicate with the endpoint web server and how to interpret the answer sent by the web service.

Take note that the web service endpoints are not yet publicly available, and that this endpoint page is provided now for information purposes.

Eleven UMBEL Web Services

  1. Find Subject Concepts
  2. Subject Concept Report
  3. Subject Concept Detailed Report
  4. List Sub-Concepts & Sub-Classes
  5. List Super-Concepts & Super-Classes
  6. List Equivalent External Classes
  7. Verify Sub-Class Relationship
  8. Verify Super-Class Relationship
  9. Verify Equivalent Class Relationship
  10. Subject Concepts Explorer
  11. Yago Ontology — a little help from our friends.

Searching the UMBEL Subject Concept Structure

The first thing people will want to do is to search within the UMBEL subject concept structure. The “Find Subject Concepts” web service helps people to locate the subject concepts they are looking for.

If someone looks at the Find Subject Concepts page and performs a search for the keyword “project”, he will get this list of subject concepts:

umbel_find.png

Note: all subject concepts are ordered alphabetically and the search has been performed on the subject concept label and their semsets (and not in their definition).
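The behaviour in this note (match on labels and semsets, ignore definitions, sort alphabetically) can be sketched as follows. The sample concepts and semsets are invented, and case-insensitive substring matching is my assumption; the actual web service may match differently.

```python
from typing import Dict, List

# Hypothetical data: subject concept label -> semset (alternative labels).
CONCEPTS: Dict[str, List[str]] = {
    "Project": ["project", "undertaking"],
    "ProjectManager": ["project manager"],
    "Organization": ["organization", "organisation"],
}


def find_subject_concepts(keyword: str, concepts: Dict[str, List[str]]) -> List[str]:
    """Return concepts whose label or semset contains the keyword, in alphabetical order."""
    needle = keyword.lower()
    hits = [
        label
        for label, semset in concepts.items()
        if needle in label.lower() or any(needle in alt.lower() for alt in semset)
    ]
    return sorted(hits)


print(find_subject_concepts("project", CONCEPTS))
```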

The “finding” web service along with all the inferencing web services use the same result page layout: you have a list of subject concepts with their human readable definition (note: 8000 definitions out of 21 000 have yet to be created). If a user clicks on a result, he will be redirected to the Report and the Detailed Report user interfaces. Additionally, a user can click on the small “earth” icon to start browsing the surrounding subject concepts nodes in the Explorer visualization tool.

Inferencing the UMBEL Subject Concept Structure

A series of web services has been created to infer facts in the UMBEL subject concept structure. There are two main categories of inferencing web services:

  1. The ones that list subject concepts that are more general, more specific or equivalent to a given subject concept
  2. The ones that answer the question: is this subject concept a sub-concept, a super-concept or an equivalent concept to this other subject concept?

These web services can be used not only to infer these facts on UMBEL subject concepts, but also on external ontology classes. There are a couple of examples of what can be done with these inferencing web services:

Note: some people may notice that the doap:Project external ontology class is a sub-class of the “Project” subject concept. This is not intuitive for humans, but this situation will be explained at length in the UMBEL Ontology Technical Documentation. To make a long story short: considering the nature of the current definition of the doap:Project class, we couldn’t say that it is equivalent to the “Project” UMBEL subject concept.

Visualizing the UMBEL Subject Concept Structure

While inferencing and lookup are good, we still have some issues when we try to “feel” what the UMBEL subject concept structure is. The following two user interfaces will do their best to help people visualize the subject concept descriptions and their relations with other subject concepts and external ontology classes.

Let’s start with a wonderful visualization tool, created by Moritz Stefaner, and used by UMBEL to let people visualize and browse the data structure.

Let’s start by browsing the relationships of the “Project” subject concept:

umbel_explorer.png

You can navigate from one node to another by clicking any of the circles. Each circle is an UMBEL subject concept or an external ontology class.

When a node is selected, its concept description is displayed in the right sidebar of the interface.

Note there are four different kinds of relationship between the concepts:

  • Blue (B). (concept A) — broader than –> (Concept B). concept A is more general than concept B
  • Red (N). (concept A) — narrower than –> (Concept B). concept A is more specific than concept B
  • Green (=). (concept A) — equivalent to –> (Concept B). concept A is equivalent to concept B
  • Mauve (I). (concept A) — is a –> (Concept B). concept A is an instance of the concept B

As each node is selected, the display refreshes and shows the new set of relationships for the current node (subject concept or external class). Note the dropdown list shown at the upper right of the display enables you to return to previous views or steps.

The Detailed Subject Concept Report

The detailed subject concept report is the tool to know everything about a specific subject concept. This is not really a web service, but a user interface that uses all existing UMBEL web services to display a detailed report of a subject concept, and all its relations with other UMBEL subject concepts and external ontology classes and properties.

Here is the detailed report of the “Project” subject concept:

umbel_detailed_repost.png

Here is the list of information available from that detailed report page:

  • UMBEL Subject Concept Name — the name of the subject concept
  • Semset — the preferred label and its alternative labels used to refer to this concept. The alternative labels are aliases, synonyms, collocations, etc.; related to the preferred label of the subject concept
  • Definition — the human readable definition of the subject concept
  • Equivalent External Classes — the classes from external ontologies that refer to this same subject concept. Note that the UMBEL Ontology Technical Documentation will explain how the equivalence relation between an external ontology class and an UMBEL subject concept is done
  • Named Entities — a list of named entities related to this UMBEL subject concept. Most of the time, the subject concept has the “type of” characteristic for these named entities. For example, for the subject concept “Person”, “Albert Einstein” is of type “Person”. The first named entities data set that has been used to create this list of named entities is Yago (more about this below).
  • More General External Classes — these are the classes from external ontologies that refer to a more general concept. Note that the UMBEL Ontology Technical Documentation will explain how the super-class relation between an external ontology class and an UMBEL subject concept is done
  • More Specific External Classes — these are the classes from external ontologies that refer to a more specific concept. Note that the UMBEL Ontology Technical Documentation will explain how the sub-class relation between an external ontology class and an UMBEL subject concept is done
  • In-domain-of — this is a list of properties defined in external ontologies where an individual of the UMBEL subject concept class can be used in the domain of the property. For example, for the subject concept “Person” the in-domain-of property: “foaf:interest (domain: foaf:Person)” means that an individual of the class umbel:Person can re-use the property foaf:interest that is defined in the FOAF ontology in its domain (<umbel:Person> <foaf:interest> <…>). Note that the UMBEL Ontology Technical Documentation will explain how the in-domain-of relation between an external ontology class and an UMBEL subject concept is done
  • In-range-of — this is a list of properties defined in external ontologies where an individual of the UMBEL subject concept class can be used in the range of the property. For example, for the subject concept “Person” the in-range-of property: “doap:developer (range: foaf:Person)” means that an individual of the class umbel:Person can re-use the property doap:developer that is defined in the DOAP ontology in its range (<…> <doap:developer> <umbel:Person>). Note that the UMBEL Ontology Technical Documentation will explain how the in-range-of relation between an external ontology class and an UMBEL subject concept is done
  • More General Subject Concepts — this is the list of more general internal UMBEL subject concepts related to the concept
  • More Specific Subject Concepts — this is the list of more specific internal UMBEL subject concepts related to the concept.
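The in-domain-of and in-range-of entries in the list above can be thought of as a simple lookup over property definitions. Here is a minimal Python sketch under that assumption. It uses the two properties mentioned in the list, with the domain and range given there; the LINKED table and the function name are hypothetical stand-ins for the real UMBEL linkage.

```python
from typing import Dict, List, Optional, Set

# Property definitions from external ontologies (only the parts needed here).
PROPERTIES: Dict[str, Dict[str, Optional[str]]] = {
    "foaf:interest": {"domain": "foaf:Person", "range": "foaf:Document"},
    "doap:developer": {"domain": "doap:Project", "range": "foaf:Person"},
}

# Hypothetical linkage table: subject concept -> external classes it is linked to.
LINKED: Dict[str, Set[str]] = {"umbel:Person": {"foaf:Person"}}


def usable_properties(concept: str, side: str) -> List[str]:
    """List properties whose domain (or range) is the concept or a class linked to it."""
    classes = LINKED.get(concept, set()) | {concept}
    return sorted(p for p, spec in PROPERTIES.items() if spec[side] in classes)


print("in-domain-of:", usable_properties("umbel:Person", "domain"))
print("in-range-of: ", usable_properties("umbel:Person", "range"))
```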

As you can see, all the relations between any UMBEL subject concept and other subject concepts or external ontology classes and properties are shown in this detailed report page.

This detailed report page was created not only to show people what UMBEL subject concepts are. I envision that people (more specifically ontology developers and ontology users) will also use it to check the current linkage between UMBEL and external ontologies and how to use UMBEL to instantiate and describe resources in RDF, etc. The UMBEL ontology documentation will describe some linkage and re-use cases in further detail.

Linked External Ontologies and Named Entities

Let’s take a deeper look at the named entities section of the detailed report of the “Person” subject concept:

umbel_named_entities.png

These named entities are individuals belonging to the class umbel:Person. If you click on one of these person names, you will notice that they are described in the Yago data set. How is this possible?

To make another long story short: umbel:Person is an equivalent class to the cyc:Person class; cyc:Person is an equivalent class to the wordnet:Person class; yago:R._B._Bennett is an individual belonging to the same wordnet:Person class. So we can infer that yago:R._B._Bennett is an individual also belonging to the umbel:Person class. However, these technical details will be explained at length in the UMBEL ontology documentation.

But that is not the most wonderful part. The most wonderful part is what the linkage between yago:R._B._Bennett and umbel:Person (or between UMBEL and any other linked data set) really means: it literally explodes the domain of each linked named entity. In fact, we now know this about yago:R._B._Bennett:

  • It is an umbel:Person
  • It is a cyc:Person
  • It is a foaf:Person & a foaf:Agent
  • It is an umbel:HomoSapiens
  • It is an umbel:SocialBeing
  • That we can re-use the foaf:birthday, foaf:name, doap:translator, dcterms:creator, etc.; external ontologies properties to describe this person.

We can infer all these things, and much more, about yago:R._B._Bennett only by linking it to UMBEL. We just contextualized it; and then we exploded its domain!
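The chain of equivalences that makes this work can be reproduced with a few lines of Python. This is just a sketch of the reasoning, using the three facts stated earlier (umbel:Person is equivalent to cyc:Person, cyc:Person to wordnet:Person, and yago:R._B._Bennett is a wordnet:Person); a real system would leave this to an OWL reasoner, and the function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

EQUIVALENT_CLASSES: List[Tuple[str, str]] = [
    ("umbel:Person", "cyc:Person"),
    ("cyc:Person", "wordnet:Person"),
]
INSTANCE_OF: Dict[str, str] = {"yago:R._B._Bennett": "wordnet:Person"}


def equivalence_set(cls: str, pairs: List[Tuple[str, str]]) -> Set[str]:
    """Grow the set of classes equivalent to cls until nothing new is added."""
    group = {cls}
    changed = True
    while changed:
        changed = False
        for a, b in pairs:
            if (a in group) != (b in group):   # exactly one side is known so far
                group |= {a, b}
                changed = True
    return group


def inferred_types(entity: str) -> Set[str]:
    """All classes the entity belongs to, via its asserted type and class equivalences."""
    return equivalence_set(INSTANCE_OF[entity], EQUIVALENT_CLASSES)


print(sorted(inferred_types("yago:R._B._Bennett")))
```

Adding sub-class links (for instance toward umbel:HomoSapiens or foaf:Agent) would extend the result to the more general types listed above.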

This is what UMBEL is about; this is the value it creates; and its contribution to the Semantic Web.

Conclusion

This is just the beginning of UMBEL. Currently ten external ontologies have been linked to UMBEL. The attentive eye will notice some strange results in the in-domain-of and in-range-of detailed report sections. More work has to be put into the linkage; however, as you will notice in the technical documentation of UMBEL, some of these weird results come from the way the ontologies themselves are defined. This means that these UMBEL tools will not only help with linking external ontologies, but will also help to define new ontologies and to fix existing ones.

Stay tuned; more stuff will be released in the coming weeks and months.