Blogs, WordPress, Zitgist and the Semantic Web

Every link has a relation on the Semantic Web. Each time a person creates a link from one web page to another, they do much more than simply link the two… In fact, the Web and the Semantic Web are starting to mesh together.

The meshing occurs at the level of the URI, or more specifically at the level of the URL if we are talking about the Web. This is what I will show you in this post, using a WordPress plug-in I developed with Zitgist technologies.

Motivations driving the development of the plug-in

The first objective of this project is to find out how people could integrate semantic web concepts and principles into the systems they use daily. How can we integrate the semantic web into blogs, for example? Is the semantic web only good for publishing content in RDF? That is certainly one use, but I doubt it is the only one. This is the reason why I put some time into developing this prototype.

The second motivation is to build a good prototype on top of Zitgist’s architecture, to show people how they can take advantage of Zitgist to develop their own projects and make their vision a reality.

Some background thinking about the plug-in

On the Web, people mainly manipulate web page resources. They locate them on the Web using a unique locator called a URL. On the semantic web, on the other hand, people do not only manipulate documents; they manipulate many kinds of things, many kinds of resources. They refer to them using URIs. The difference between a URI and a URL is that a URL is resolvable on the Web, while a URI is not necessarily so (in fact, URI is the super-class of URL). However, best practices suggest making URIs resolvable (dereferenceable) on the Web; in such a case, a URI is also a URL.

Anyway, all this to say that a URI on the semantic web can be a URL on the Web. Many use cases emerge from this special digital environment. As an example, many people will use a Wikipedia page URL as a URI for a topic, an interest, or for many other relations to these concepts. In such a case, the URL of a web page is used to refer to a concept. I don’t want to debate the merits of this practice here, but it is a fact, and we have to handle it.

Introduction to the Zitgist WordPress Plug-in

This plug-in is quite simple in appearance, but has some really interesting results for users.

The only thing this plug-in does is show blog readers existing data related to a given URL and, in some cases, enable them to perform actions based on this data.

For example, if I make a link to Tim Berners-Lee‘s web page, a user could be interested in getting more information about Tim directly from the article he is reading. A lot of data related to Tim is available on the semantic web.

[Screenshot: timbl.png]

That is it. The plug-in displays information related to the links in a blog post. In this case, it is the people Tim knows and Tim’s profile. The information is shown to the user in a contextual menu. The data is requested from Zitgist’s systems and displayed to the user. It is that simple, but quite powerful.

The usefulness of the Zitgist WordPress Plug-in

The plug-in is quite useful in many ways. In fact, it instantly displays related information about a link to readers of the blog. From any blog post, a reader can easily jump to resources related to each link.

Some use cases

Above I said that a URL, a web link, can be much more than it usually appears to be. So below, I show a couple of use cases illustrating the potential behind the idea.

1. URL as a web page

What happens when a link from a blog post points to an ordinary web page? Well, some interesting things can happen; here is an example:

Check it by yourself: The Bibliographic Ontology

[Screenshot: bibo.png]

Here a user can visit the web page directly, or he can jump to related resources. These related resources come from the semantic web. The first one is the description of the project. The following two are the authors of the ontology. The last ones are documents related to the ontology and the “version” of the ontology.

2. URL as a dereferencable URI

For non-initiated readers, I suggest reading this best-practices tutorial explaining how to publish semantic web data on the web.

Sometimes (okay, not that often at the moment, but I hope people will start), people link to resource URIs, that is, URLs that can be dereferenced to get RDF data about the resource, or its web page representation if available.

Check it by yourself: URI referring to Frederick Giasson

[Screenshot: fgiasson.png]

The result is that readers have direct access to my profile, articles I wrote, etc.

3. Actionable URL

Sometimes it can be really interesting to be able to act on certain URLs. One example is when a web page, or a resource identified by a URI, refers to a thing that can be bought; for example, something that can be bought on Amazon.com:

Check it by yourself: Visualizing the Semantic Web

[Screenshot: amazon.png]

From the blog post, the reader can buy the related item directly on Amazon. This is only one possible action; many others are possible, and the only limit is imagination.

Conclusion

The simple links you create from your blog posts to other web pages carry much more related information than you might think. Using this prototype Zitgist WordPress plug-in makes these relations explicit for your readers.

You only have to read some of my other blog posts to try it for yourself. Some results are quite impressive.

I will make this plug-in available for download sometime next week.

This idea has been promoted by Kingsley Idehen for some time now. He calls it enhanced anchors, or a++. The idea is simple: enhancing anchors to make the link to a certain resource (URI or URL) explicit, and optionally to perform some action on it.

This prototype is a first try in that direction. Many upgrades should follow so we can really unveil the power of this new kind of linking; of this new way to relate things together and to make these relations explicit. Please report any bugs, issues, cross-browser problems, comments, suggestions, etc. to me.

The Open Library in RDF using The Bibliographic Ontology

[Image: openlibrary.png]

“What if there was a library which held every book? Not every book on sale, or every important book, or even every book in English, but simply every book-a key part of our planet’s cultural legacy.” — The OpenLibrary Project

This is something I wanted to participate in.

The Open Library is a project that wants to archive information about every book (and probably, eventually, every piece of writing) created by mankind. Such a strong vision is naturally closely related to the semantic web.

I contacted Aaron Swartz about this project. I wanted to know what their plans were for making all this data available on the semantic web; what their plan was for describing these books in RDF.

I wanted to participate in the project by describing their information in RDF using the Bibliographic Ontology.

So that is what I started to do. Aaron sent me some snapshots of data using their current database schema (this schema should be updated soon). Then I described one of them using BIBO. As you will see below, the ontology neatly describes the Open Library data and enables us to query, at the same time, the Open Library’s data, the data about the articles I wrote, eventually the Zotero citations if they choose to use BIBO, etc.

So, below is my proposition to Aaron and the Open Library Project. From this post, we will be able to discuss the implications, how this could be done, how the data could be made available for querying and browsing, etc.

How to Cook Revised Edition described using RDF and BIBO

The current use case is a book by Raymond Sokolov: “How to Cook Revised Edition“. It was straightforward to map this data to BIBO using the current proposition.

The RDF/N3 example is available here: How to Cook Revised Edition in RDF/N3

Describing this data using BIBO led me to work out how to describe the topical subjects of documents. It is a discussion we (the BIBO development community) have already had, and here I think I found a solution.

Describing topical subjects for a bibo:Document

The goal is to relate a document resource with the concepts describing its topics. There are many ways to describe the subjects of documents: with a literal, a class, an individual, etc.

What I am proposing here is to re-use the dcterms:subject property (as we already do) to relate a bibo:Document with a concept from a taxonomy that acts as the topical subject of the document.

The Open Library is using the BISAC subject standard to relate books with their topics. What I have done is to describe the BISAC standard as a taxonomy in RDF using SKOS. The resulting RDF is: BISAC taxonomy snapshot.

As you can see, the BISAC taxonomy structure is well described using SKOS concepts. The relations between these concepts are described as well. Also, the dcterms:identifier property is used to link a concept with its BISAC identifier.
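
To give an idea of what the snapshot contains, here is a rough N3 sketch of a single BISAC concept (the label, identifier and narrower relation shown are illustrative guesses, not values taken from the actual snapshot):

@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix dcterms: <http://purl.org/dc/terms/> .

<http://purl.org/ontology/bibo/bisac#Cooking_General> a skos:Concept ;
    skos:prefLabel "COOKING / General" ;                 # illustrative label
    dcterms:identifier "CKB000000" ;                     # hypothetical BISAC identifier
    skos:narrower <http://purl.org/ontology/bibo/bisac#Cooking_Regional_and_Ethnic_American_General> .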

From there, we only have to use the BISAC URIs to link a bibo:Document to its subjects, like this:

dcterms:subject <http://purl.org/ontology/bibo/bisac#Cooking_Regional_and_Ethnic_American_General> ;
dcterms:subject <http://purl.org/ontology/bibo/bisac#Cooking_General> ;

This is simple and effective. Also, we are not limited to the BISAC taxonomy; anyone can use whatever taxonomy they want to describe the subjects of their documents.

Some SPARQL queries

Nothing is better than SPARQL queries to “feel” the power of these RDF descriptions.

Queries related to contributions

The following query displays the titles of documents and the contribution role Raymond Sokolov played in them. So, if Raymond contributed to some documents as an author and to others as an editor, all these documents will be returned in the result set:

Finding documents where Raymond Sokolov contributed
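
To give a flavour of it, here is a hedged SPARQL sketch of such a query. The contribution-related property names (bibo:contribution, bibo:contributor, bibo:role) are assumptions made for illustration and may differ from the actual draft; the linked query above is the authoritative version:

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX bibo: <http://purl.org/ontology/bibo/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?title ?role
WHERE {
  ?doc dcterms:title ?title ;
       bibo:contribution ?contribution .        # assumed property linking a document to its contributions
  ?contribution bibo:contributor ?person ;      # assumed property naming the contributing agent
                bibo:role ?role .               # assumed property giving the contribution role
  ?person foaf:name "Raymond Sokolov" .
}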
The following query is a variant of the above. It returns the titles of all documents where Raymond is an author.

Finding documents where Raymond Sokolov contributed as an author
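
A sketch of this variant, under the same assumptions as above, could simply constrain the role (here by matching the string "author" in the role value, which is an illustrative choice):

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX bibo: <http://purl.org/ontology/bibo/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?title
WHERE {
  ?doc dcterms:title ?title ;
       bibo:contribution ?contribution .        # assumed property names, as in the previous sketch
  ?contribution bibo:contributor ?person ;
                bibo:role ?role .
  ?person foaf:name "Raymond Sokolov" .
  FILTER regex(str(?role), "author", "i")       # illustrative way of selecting the author role
}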

Eventually, we could also use bibo:position to find all the documents written by Raymond where his author position is 1 or 2 (that is, where he is the primary or secondary author of a document).
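
Such a query could add a numeric filter on the position, roughly like this (again a sketch only, using the same assumed contribution properties):

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX bibo: <http://purl.org/ontology/bibo/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?title
WHERE {
  ?doc dcterms:title ?title ;
       bibo:contribution ?contribution .        # assumed property names
  ?contribution bibo:contributor ?person ;
                bibo:position ?position .
  ?person foaf:name "Raymond Sokolov" .
  FILTER (?position <= 2)                       # primary or secondary author
}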

Queries related to documents and their subjects

If a user only has the BISAC identification number of a concept and needs to find books about this topic, then he only has to run this query to get the titles with that topic:

Finding documents related to a BISAC identifier
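
A minimal sketch of such a query, assuming a hypothetical BISAC identifier "CKB000000", could look like this:

PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT ?title
WHERE {
  ?concept dcterms:identifier "CKB000000" .     # hypothetical BISAC identifier
  ?doc dcterms:subject ?concept ;
       dcterms:title ?title .
}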

However, that is not very handy. What if I only want books about “cooking”? Here is a way to do it:

Finding documents about “cooking”
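
A sketch of this query, assuming the taxonomy concepts carry skos:prefLabel labels, could be:

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT ?title
WHERE {
  ?concept a skos:Concept ;
           skos:prefLabel ?label .
  ?doc dcterms:subject ?concept ;
       dcterms:title ?title .
  FILTER regex(?label, "cooking", "i")          # match any concept whose label mentions "cooking"
}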

That way, you will get all the “cooking” related concepts from the taxonomy, and you will find all the related books.

Note that there are many other ways to go such as browsing the graph of concepts using the skos:narrower and skos:broader properties from a given skos:Concept. However, the query above is simple and effective.

Other queries

Otherwise, you can create a whole set of other simple and effective queries: searching for all the published books, all the published books by a given author or editor, etc.

There is no limit when all that information is available in RDF and BIBO.

More descriptions of the Open Library using BIBO

If you take a closer look at the current database schemas of the Open Library Project, you will notice that they have data about “series”, “notes”, and other things. I don’t have such an example at hand at the moment, but we should keep in mind that we can easily describe them using BIBO as well.

Conclusion

I described how RDF and The Bibliographic Ontology could be used to describe data from The Open Library Project. Doing this would enable them to easily and effectively publish their data so that other people and applications could take advantage of it.

We also saw that this gives us a powerful way to query the complex graph of relations created by such data described in RDF and BIBO.

Finally, having all this data available in BIBO will enable us to easily merge it with other document data sources such as Zotero, or any other writings described using RDF and BIBO. As a final example, we could find all the documents that Raymond Sokolov contributed to, as an author, an editor, or in any other role. With a single query, one could find out that he wrote some published books and that he authored some posts on his blog. All that thanks to RDF, BIBO, SPARQL, and all the data sources exporting their data using RDF and BIBO.

Ping the Semantic Web version 3: a brand new system!

Pinging and retrieving lists of newly created and updated RDF resources has never been easier! I am pleased to announce the release of the latest version of Ping the Semantic Web.

In this brand new system you have access to:

  1. Validated RDF resources
  2. Simplified pings list export system
  3. Faster pinging infrastructure
  4. Brand new user interface
  5. New statistics

1. Validated RDF resources

In version 2.0, PTSW performed a pseudo-validation of RDF files. In version 3.0, it fully validates RDF documents. This means that all the pings the service exports point to valid RDF documents.

This is a major upgrade to the system, since all agents requesting pings from PTSW now know that each of them points to a valid RDF document. That way, they save time and bandwidth since they won’t try to process bad RDF documents.

2. Simplified pings list export system

Now all ping consumers need to be registered with the PTSW web service. This simple registration greatly simplifies consuming pings coming from PTSW. Here are the steps to follow to get pings from the server:

  1. The user registers an account on pingthesemanticweb.com.
  2. He registers the IP address of the server that will download the XML file listing the latest pings received by the system.
  3. Additionally, he sets up his ping retrieval preferences in the user account section.
  4. The registered web server requests the XML file at: http://pingthesemanticweb.com/export/

What has improved is the way applications get pings. Now a web server only has to request the XML file, and PTSW takes care of creating it according to the user’s preferences.

Finally, PTSW archives the time of the user’s latest request. The next time the user’s server requests this document, it will receive an XML file with all the pings received by PTSW since that last request.

This is a major improvement: if the user’s web server is down for two days for some reason, it won’t lose any pings, since PTSW will send it all the pings received by the service during those two days.

Note: all current Ping the Semantic Web ping consumers have to create an account and update their applications accordingly.

3. Faster pinging infrastructure

The web service is now hosted on a much bigger server. We also switched from MySQL to Virtuoso. These changes result in a more powerful service that I estimate can handle up to 5 million pings per day (in the best case, with fast remote web servers delivering the RDF content). In any case, it should be enough for the next year of growth.

4. Brand new user interface

We also spent some time refreshing the user interface of the web service. This new interface will help us easily integrate new features and sections into the service’s web site while keeping it appealing to users.

5. New statistics

New statistics on the state of the service are now available.

  1. All stats about namespaces. This is the list of namespaces used to describe entities in RDF. For example, if an RDF document has an entity typed as a sioc:Post, then the SIOC namespace is added and its counter incremented by one. There are currently 347 namespaces known to PTSW.
  2. All stats about types. This is the number of typed entities defined in each RDF document known to PTSW. For example, if an RDF document has four foaf:Person entities defined, then four is added to the counter. If the same entity (URI) is defined in two different RDF documents, its type is counted twice. So take these numbers as a good approximation, not as an absolute truth. There are currently 2,773 types known to PTSW.

Some people will notice that the current numbers in the sidebar are completely different from the ones on the old web site of the service. They are right, and here is the reason: I pruned the geonames.org and talkdigger.com pings from PTSW.

In fact, when I started the web service, I added these two RDF data dumps to PTSW. At that time, initiatives such as the Linking Open Data community didn’t exist and people didn’t know how to export their RDF data dumps, so I chose to include them in the PTSW system. Since then, methods have improved and things have changed. Now RDF data dumps are available directly from these web sites, data dump repositories exist, and people no longer use PTSW for that purpose. In fact, they use the PTSW export feature to sync their services, not to get complete datasets. That said, I pruned all these 7 000 000 documents from the system, leaving about 845 000 “wild” RDF documents.

It was the inclusion of these complete data sources that inflated the old stats compared to today’s.

Conclusion

When I created Ping the Semantic Web more than one year ago, I hoped developers would use the service to easily find RDF data without crawling the entire Web. I hoped that this web service would be a vector of semantic web application development. I think it has succeeded in some ways, when I think about services such as SIndice and DOAPStore that emerged from the PTSW initiative.

This new version of Ping the Semantic Web tries to go further in that direction: making things even simpler for RDF data consumers and giving them a more powerful RDF discovery service.

Note: make sure to refresh the DNS cache of your desktops and servers so that you see the new, and not the old, PTSW web site.

Describing Documents, Articles, Series, Volumes and Conferences using the Bibliographic Ontology

The Bibliographic Ontology lets you describe all these things, and much more, in RDF. In the last months, the community developing BIBO has been quite productive. Many questions have been asked, many have been answered, and things are slowly taking shape.

It is for that reason that I started to create some more examples using the ontology, trying to see how people will use it. I created some examples to see if I could easily describe two articles I wrote in the past few years: (1) an accepted article in a proceedings and (2) a rejected article submitted to a conference. I was wondering if the current state of the ontology could easily cope with some odd cases. As you will notice below, it nicely handled the odd cases I encountered while describing these articles.

First example: Describing a Series, with volumes and articles

I wanted to describe an article I wrote with Uldis Bojars, Alexandre Passant and John Breslin. This article is part of a proceedings that is published in a series, as volume 248. The series has an ISSN; however, it is only published online (no paper version is available).

Here is how BIBO describes such a case:

A Complex series + proceeding + article use case in RDF/XML

The series is a bibo:Series. This series has a title, a short title and an ISSN. Also, it is related to its publisher and has a status (published). Finally, this series is put in relation with its volumes and with a web document (a web page) that is a manifestation of the series.

This is something to keep in mind for the rest of this blog post: in BIBO, a web page is a document, like any other document. The only difference between a paper book and a web page is their identifier (locator): a published paper book has an ISBN, and a web page has a URL. That said, we can easily relate different formats of a document using dcterms:relation. That way, we make the relation between two different documents explicit (even if their only difference is their format: printed on paper, HTML, PDF, etc.).
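
As a rough N3 sketch of this part of the description (all URIs and literal values below are made up for illustration; property names such as bibo:issn are assumptions, and the RDF/XML file linked above remains the authoritative example):

@prefix bibo: <http://purl.org/ontology/bibo/> .
@prefix dcterms: <http://purl.org/dc/terms/> .

<http://example.org/series/example-series> a bibo:Series ;
    dcterms:title "An Example Online Proceedings Series" ;
    bibo:issn "1234-5678" ;                                   # hypothetical ISSN
    dcterms:relation <http://example.org/series/example-series/homepage> .

# The web page is a document in its own right, related to the series via dcterms:relation.
<http://example.org/series/example-series/homepage> a bibo:Document ;
    dcterms:title "Example Series home page" .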

Next, I described the published proceedings. It is a bibo:Proceeding with several properties, notably a bibo:volume property that describes its position within the series. Finally, the editors of the proceedings are described and related to the proceedings they edited via a bibo:Contribution.

Contributions are at the core of the ontology; they are defined as:

“The contribution a person, group or organization makes to the creation or realization of a work.”

So, an editor and an author are contributors to the creation or realization of a work (a document).

Finally, I described the article itself, which is a bibo:Article. I described its properties, its authors, and the relations between the authors and the article. I also described its status: it has been peer-reviewed and published.

The links between the series, the proceedings and the article are made by re-using the dcterms:hasPart and dcterms:isPartOf properties.
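
Putting the pieces together, here is a hedged N3 sketch of how the three levels relate (the URIs, titles, and the contribution-related property names bibo:contribution, bibo:contributor and bibo:role are illustrative assumptions; see the linked RDF/XML example for the real description):

@prefix bibo: <http://purl.org/ontology/bibo/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

<http://example.org/series/example-series> a bibo:Series ;
    dcterms:hasPart <http://example.org/proceedings/vol-248> .

<http://example.org/proceedings/vol-248> a bibo:Proceeding ;
    dcterms:isPartOf <http://example.org/series/example-series> ;
    bibo:volume "248" ;
    dcterms:hasPart <http://example.org/articles/example-article> .

<http://example.org/articles/example-article> a bibo:Article ;
    dcterms:isPartOf <http://example.org/proceedings/vol-248> ;
    dcterms:title "An Example Article" ;
    bibo:contribution [                                       # assumed property linking the article to a contribution
        a bibo:Contribution ;
        bibo:contributor [ a foaf:Person ; foaf:name "Frederick Giasson" ] ;
        bibo:role "author" ;                                  # illustrative representation of the role
        bibo:position 1
    ] .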

Second example: a rejected article submitted to a conference

For this second example, I wanted to describe an article I wrote a couple of years ago that I submitted to a conference and that was rejected. So, I had to describe the article, the conference, and the fact that it was rejected after peer review.

Here is how BIBO describes this use case:

Rejected article submitted to a conference in RDF/XML

This is basically the same thing as above: describing a document with its authors.

However, in this case, I had to describe a conference. The Bibliographic Ontology uses The Event Ontology to describe such things. The conference event is described using the event:Event class, along with event:agent, which relates the event to the organization that created it, and event:place, which locates the event in the world.

However, the description of conference events will change in the next few weeks, since Yves Raimond and I will create an extension module to this ontology specifically for describing conference events (so we will talk about event:Conference, event:organizer, event:sponsors, etc.).

Finally, I had something to say about this article I wrote. To say it, I created another type of document, a bibo:Note, to annotate the article with some comments. A bibo:Note is a document in its own right, like a bibo:Article. However, I related the two documents (the bibo:Note and the bibo:Article) using the bibo:annotates property. That way, I describe the fact that one document is an annotation of another document.
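
A minimal N3 sketch of this annotation pattern (the URIs and text are made up, and dcterms:description is simply an illustrative choice for holding the note’s text):

@prefix bibo: <http://purl.org/ontology/bibo/> .
@prefix dcterms: <http://purl.org/dc/terms/> .

<http://example.org/articles/rejected-article> a bibo:Article ;
    dcterms:title "An Example Rejected Article" .

<http://example.org/notes/comment-1> a bibo:Note ;
    bibo:annotates <http://example.org/articles/rejected-article> ;
    dcterms:description "Submitted to the conference and rejected after peer review." .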

Conclusion

These two examples show how The Bibliographic Ontology can be used to describe some complex bibliographic use cases. It is just a start, and many questions are yet to be answered by the Bibliographic Ontology. However, things are moving forward, and if this demonstration has interested you, I can only suggest that you join the community supporting BIBO’s development and help it evolve.

The Music Ontology revision 1.12

The Music Ontology is much easier to read with the new documentation and the normalized terms. In fact, Yves Raimond worked hard on this new release, with some help from Chris, myself, and other people on the mailing list.

The list of major changes is available on Yves’s blog post about the release. Also, the complete list of changes is available in the change log.

Some things remain to be finished for this new revision. We have to update the examples on the wiki, and I have to modify the Musicbrainz RDF view so that the generated RDF documents reflect these changes.

Finally, this new revision is a major upgrade in terms of the user-friendliness of the ontology. Terms, descriptions, and documentation should be much clearer now.