I am pleased to publish some information about the mapping of the Musicbrainz relational database into RDF using the Music Ontology, as I promised some time ago. I know I am late on this one, but I was waiting for a few things to be released before publishing this blog post.
This is the first step toward getting a “physical” RDF dump of the Musicbrainz data: using a Virtuoso RDF View to expose the Musicbrainz relational database as an RDF triple store.
Introduction to Virtuoso RDF Views
Carl Blakeley of OpenLink Software Inc. just published a first Virtuoso RDF View tutorial called “Mapping Relational Data to RDF with Virtuoso’s RDF Views”. This article explains how to define RDF Views inside Virtuoso and how they work.
Read that document first to make sure you understand how the mapping of the Musicbrainz data into RDF has been performed using Virtuoso.
RDF/XML presentation of the mapping
I have written an RDF/XML file explaining where in the Musicbrainz database schema the data used to create the actual RDF View comes from. This is a good starting point to get a “feel” for how the Music Ontology can be used to express musical things such as Artists, Bands, Records, Tracks, etc., and to see how the music creation workflow supporting the Music Ontology is used in that case.
The Musicbrainz RDF View
This is the RDF View that enables the Musicbrainz relational database to be viewed as an RDF source that can be queried using SPARQL. This view virtualizes the descriptions of mo:MusicArtist, mo:MusicGroup, mo:Record and mo:Track, as well as mo:Performance, mo:Signal, mo:Composition, etc.
Using the RDF View
Installing the Musicbrainz Database instance (the quick guide)
The first step is to download the Musicbrainz DB and to install it on a PostgreSQL server instance. Follow these steps.
Note: I will try to keep this guide as short as possible, so if there are steps that you don’t understand or that don’t work for you, please leave a comment on this blog post or send me an email.
Installing Virtuoso
To use the RDF View, you will first have to install Virtuoso 5.0 on your computer. OpenLink Virtuoso comes in two flavours: Open Source and Commercial. The difference, besides the obvious, is that the commercial versions include Virtual Database functionality, which makes the following step easier, as the relational data may remain in the PostgreSQL database.
Linking PostgreSQL tables to Virtuoso via ODBC
For the Open Source Edition:
With the Virtuoso Open Source Edition 5.0 you will have to export the data from the PostgreSQL server and import it into Virtuoso’s native DBMS.
For the Commercial Edition:
Once the Virtuoso instance is running, open a browser window and go to http://localhost:8890/conductor/ to access Conductor. This is a web-based DBMS manager, like phpMyAdmin but for Virtuoso. You can use it to attach the tables through ODBC.
Note: you need a PostgreSQL ODBC driver installed to perform the following steps.
You should see the PostgreSQL instance connection in the list. Click on “connect” and enter the credentials, and the Virtuoso server should be connected to the running PostgreSQL instance.
After that, click on “External Linked Objects” to link the remote PostgreSQL tables into Virtuoso. Pay special attention to the schemas created by these links. The remote tables should be available via the schema “DB.[ODBC driver name].[remote table name]”.
These Musicbrainz tables should be linked into Virtuoso:
track, albumjoin, album, albummeta, artist, artist_relation, artistalias, album_amazon_asin, country, l_album_url, l_artist_artist, l_artist_track, l_artist_url, l_track_track, l_album_album, l_album_artist, l_track_url, language, release, url, puid, puidjoin.
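As a quick sanity check, the names the linked tables should end up with can be generated from the list above and compared with what Conductor shows. This is only a sketch in Python: the qualified_names helper and the “mbz” qualifier are hypothetical, so substitute whatever name your own ODBC setup produces.

```python
# Musicbrainz tables the RDF View expects to find in Virtuoso
# (copied from the list above).
MUSICBRAINZ_TABLES = [
    "track", "albumjoin", "album", "albummeta", "artist",
    "artist_relation", "artistalias", "album_amazon_asin", "country",
    "l_album_url", "l_artist_artist", "l_artist_track", "l_artist_url",
    "l_track_track", "l_album_album", "l_album_artist", "l_track_url",
    "language", "release", "url", "puid", "puidjoin",
]


def qualified_names(qualifier, tables=MUSICBRAINZ_TABLES):
    """Return the 'DB.<qualifier>.<table>' name for each linked table."""
    return [f"DB.{qualifier}.{table}" for table in tables]


if __name__ == "__main__":
    # "mbz" is a made-up qualifier used only for illustration.
    for name in qualified_names("mbz"):
        print(name)
```

Any table missing from Conductor under one of these names still needs to be linked before the RDF View is installed.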
Installing the RDF View in Virtuoso
Before continuing, you will have to make a small modification to the RDF View document: replace every occurrence of the string “DB.MO.” with “DB.[name of the DSN entry].”. This tells the RDF View where to get the relational data (in this case, from a remote PostgreSQL server instance).
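If you would rather not do this replacement by hand in a text editor, a few lines of Python can do it. This is a minimal sketch: the retarget_view function and the “mbz” DSN name are hypothetical, and the sample string is a made-up fragment, not text taken from the real RDF View document.

```python
def retarget_view(view_sql, dsn_name):
    """Replace every 'DB.MO.' qualifier with 'DB.<dsn_name>.'."""
    return view_sql.replace("DB.MO.", f"DB.{dsn_name}.")


if __name__ == "__main__":
    # Made-up fragment, used only to show the effect of the replacement.
    sample = "from DB.MO.artist as artists, DB.MO.track as tracks"
    print(retarget_view(sample, "mbz"))
```

With the sample above, this prints “from DB.mbz.artist as artists, DB.mbz.track as tracks”. To process the real document, read it into a string, pass it through the function and save the result before pasting it into iSQL.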
Now click on the first item in the left sidebar menu “Interactive SQL (iSQL)”.
The next step is to paste the modified RDF View code into this iSQL window and click RUN.
After a minute or two, the view should be defined in Virtuoso.
Testing the view
Now the only thing left to do is to test this new RDF View. Run this simple SPARQL query inside iSQL to make sure that you get triples from the view:
sparql
define input:storage virtrdf:MBZROOT
select *
from <http://musicbrainz.org/>
where
{
?s ?p ?o.
};
From here on, you can query this RDF View with SPARQL just as you would query any triple store. Check out the Music Ontology Wiki for some examples of how this RDF graph can be queried.
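The view can also be queried from outside iSQL. The sketch below sends the same test query over HTTP from Python, using only the standard library. It rests on a few assumptions that are not part of the steps above: that Virtuoso’s SPARQL endpoint is reachable at /sparql on the same host and port as Conductor, and that your Virtuoso build can return results in the SPARQL JSON format. The build_request and run_query helpers are hypothetical names.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the SPARQL endpoint lives next to Conductor on port 8890.
ENDPOINT = "http://localhost:8890/sparql"

# Same test query as in iSQL, minus the iSQL-only "sparql" keyword and
# the trailing ";", plus a LIMIT to keep the response small.
QUERY = """
define input:storage virtrdf:MBZROOT
select *
from <http://musicbrainz.org/>
where { ?s ?p ?o . }
limit 10
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request carrying the query in the 'query' parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return the result bindings as a list of dicts."""
    with urllib.request.urlopen(build_request(query, endpoint)) as resp:
        return json.load(resp)["results"]["bindings"]
```

Calling run_query(QUERY) against a running instance should return up to ten bindings, each holding the “s”, “p” and “o” values of one triple produced by the view.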
Conclusions
The RDF View that converts the Musicbrainz relational database into RDF is interesting in many respects. First of all, we get a good representation of the Musicbrainz data in RDF using the Music Ontology. But this example also shows concretely how relational data can be converted into RDF with relatively little effort.