Semantic Web (RDF) data won’t come from initiatives such as LiveJournal.com and Tribe.net exporting their user profiles into RDF using the FOAF ontology; at least not at first. These initiatives are marginal considering the current state of the Web: billions of web pages, most of them stored in relational databases and generated on the fly as HTML.
Semantic Web (RDF) data will come from the conversion of the relational databases behind widely used web software such as WordPress, Mediawiki and phpBB into RDF using some ontologies. Some methods can be used:
This blog post will show you how we can do the same with your WordPress blog and your Mediawiki wiki using Virtuoso RDF Views.
This is quite powerful: by using these views any WordPress or Mediawiki instance could be queried using SPARQL. Other views could easily be created for phpBB (currently on the way), and virtually any relational database accessible from the Web.
Since these views are quick and simple to develop, they are certainly one of the best tools for converting current relational data sources into RDF.
WordPress and Mediawiki RDF Views
Mitko Iliev developed these two RDF Views, which take the WordPress and Mediawiki database schemas and map them into RDF. I added some comments in the code but, as you will notice, they are quite simple and intuitive to understand (if you have some knowledge of SPARQL).
Installing these RDF Views
You have three ways to install these RDF Views.
- If you have the commercial version of Virtuoso, you only have to connect the remote MySQL database to Virtuoso via Conductor. That way you will see the MySQL databases as if they were local to Virtuoso.
- If you have the open-source version of Virtuoso, you have two choices:
  - You make a SQL dump of the MySQL database and import it into Virtuoso.
  - You install the upgraded version of WordPress or Mediawiki developed by OpenLink Software. These upgraded versions of WordPress and Mediawiki use Virtuoso as the DBMS instead of MySQL. These two versions should be made available to the public by OpenLink soon.
The idea here is to give Virtuoso access to the relational data by using one of these three methods. After that, it is just a matter of sending SPARQL queries against the RDF View.
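To make "sending SPARQL queries" a bit more concrete, here is a minimal Python sketch of how such a request could be built over HTTP. The endpoint address is an assumption (a local Virtuoso listening on port 8890 with its SPARQL endpoint at /sparql), and the `format` parameter is a Virtuoso convenience rather than part of the SPARQL protocol itself; adjust both to your own installation.

```python
from urllib.parse import urlencode

# Assumption: a local Virtuoso instance exposing its SPARQL endpoint here.
# Replace this with the address of your own server.
ENDPOINT = "http://localhost:8890/sparql"


def build_sparql_url(query, endpoint=ENDPOINT,
                     result_format="application/sparql-results+json"):
    """Return a GET URL that submits `query` to a SPARQL endpoint.

    `query` is the standard SPARQL protocol parameter; `format` is an
    assumed Virtuoso-specific way of asking for a result serialization.
    """
    params = {"query": query, "format": result_format}
    return endpoint + "?" + urlencode(params)


# A generic query that lists a few triples, whatever vocabulary the view uses.
url = build_sparql_url("SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10")
print(url)
```

The resulting URL can be pasted into a browser or fetched with `urllib.request.urlopen(url)` once the endpoint is actually reachable.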
Querying a MediaWiki instance using SPARQL
I will use this MediaWiki instance to show you a couple of examples. It is a modified version of MediaWiki 1.7 that uses Virtuoso instead of MySQL as its DBMS. We then installed the RDF View I talked about above. From that point, we can query this Mediawiki wiki instance using SPARQL. Remember that the data still lives in a relational database, but thanks to the RDF View, we can view it as RDF too!
- Listing all triples from the RDF view: See results
- Listing the names of the Wikis hosted on this server: See results
- Listing the wiki pages of the “DemoWiki” wiki instance: See results
- Listing the wiki pages created by the “demo” user: See results
Etc.
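For readers who would rather process such results in a program than in a web page, here is a small Python sketch. It assumes the endpoint returns the standard SPARQL JSON results format; the query is the generic "all triples" pattern with a limit added, and the sample document is made-up data for illustration, not output from the wiki above.

```python
import json

# Generic form of the first query in the list above (a LIMIT is added here
# so the result stays small).
ALL_TRIPLES_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 25"


def rows_from_results(document):
    """Turn a SPARQL JSON results document into a list of value tuples.

    Each tuple follows the order of the variables in the result header;
    a variable that is unbound in a row becomes None.
    """
    data = json.loads(document)
    variables = data["head"]["vars"]
    rows = []
    for binding in data["results"]["bindings"]:
        rows.append(tuple(
            binding[v]["value"] if v in binding else None
            for v in variables
        ))
    return rows


# Invented sample response, only here to show the shape of the format.
SAMPLE = """
{
  "head": {"vars": ["s", "p", "o"]},
  "results": {"bindings": [
    {"s": {"type": "uri", "value": "http://example.org/wiki/Main_Page"},
     "p": {"type": "uri", "value": "http://purl.org/dc/elements/1.1/title"},
     "o": {"type": "literal", "value": "Main Page"}}
  ]}
}
"""

for subject, predicate, obj in rows_from_results(SAMPLE):
    print(subject, predicate, obj)
```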
We could go on like that endlessly. I suggest you click on the results you get in these web pages, and then click on the “explore” link. That way, you will jump from node to node and find interesting stuff.
Conclusion
I believe that it is the best way to push people to adopt the semantic web, and all its concepts, as the way to describe things on the Web. Once we get all that useful data from existing sources (musicbrainz, US census data, geonames, you name it) and people start to release services that use all this data in a useful way, people will start to generate their content for the semantic web. This is why we should continue in that direction. Many people are already working to convert existing sources of data (relational databases, web APIs, etc.) into RDF: the linked-open-data community, Zitgist, OpenLink, and probably many others. I would guess (in fact I am sure) that in one year we will have several billion triples ready to be searched and browsed by Web users.