The Music Ontology revision 1.12

The Music Ontology is much easier to read with the new documentation and the normalized terms. Yves Raimond worked hard on this new release, with some help from Chris, me, and other people on the mailing list.

The list of major changes is available on Yves’s blog post about the release. Also, the complete list of changes is available in the change log.

Some things related to this new revision have yet to be finished. We have to update the examples on the wiki, and I have to modify the Musicbrainz RDF View so that the generated RDF documents reflect these changes.

Finally, this new revision is a major upgrade in terms of the user-friendliness of the ontology. Terms, descriptions and documentation of the ontology should be much clearer now.

The Music Data Space

Kingsley has been talking about Data Spaces for a long time. But what is a Data Space? Nothing beats an example for understanding something, so I will try to explain it with a single Data Space that was created yesterday, the Music Data Space:

mbz_rdfview_uris.jpg

This is the Music Data Space. This Data Space contains information about musical things. These things are described mainly by using the Music Ontology, but also by using other ontologies like FOAF. Finally, the musical things belonging to this space are accessible on the Web via dereferenceable URIs.

So, the Music Data Space is a place where all musical things are defined on the Semantic Web, and accessible via the Web.

That is it, and it is what we created last Monday.

Now, some of you could wonder: why on earth does Amazon.com belong to the Music Data Space?

Amazon.com belongs to the Music Data Space too!

Amazon.com lives in the Music Data Space too, via their API. In fact, a simple experiment with the OpenLink RDF Browser clearly demonstrates that Amazon.com’s data belongs to the Music Data Space.

Open the RDF Browser by following that link

You will now see RDF information about an album called “Chore of Enchantment”. Take a look at this line:

amazon_asin: http://amazon.com/exec/obidos/ASIN/B00003XAA7/searchcom07-20

Click on the link to Amazon. A window should pop up. Select the Get Data Set (dereference) option.

At this point, some magic happens. The new information that is displayed in the RDF Browser is coming directly from Amazon.com’s web server.

This is why I say that Amazon.com belongs to the Music Data Space too.

In fact, the Virtuoso Sponger will connect to Amazon.com via their API to get some information about that album. It will convert the data into RDF and will display it to the user via the RDF browser’s interface.

One step further: the JPG file also belongs to the Music Data Space!

Yes! Information about the JPG file, hosted on Amazon.com’s web servers, also belongs to the Music Data Space, and here is the proof:

Open that same RDF Browser page by following that link

Click on the image (JPG) representing the cover of this album. A window should pop up. Select the Get Data Set (dereference) option.

Check the triples that have been created from this image. The Virtuoso Sponger downloaded the JPG file, analyzed its header, RDFized everything, and sent the information back to the RDF Browser so that the user can see the information available for that image.

Where is the end? I have no idea… probably at the same place where the imagination ends too.

Unifying everything

It is that simple. All data sources (relational databases, remote data accessible via APIs, native RDF data, etc.) are unified via the Music Data Space. And this Music Data Space is accessible, via URI dereferencing, at http://zitgist.com/music/
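To make "accessible via URI dereferencing" a bit more concrete, here is a minimal Python sketch of the kind of HTTP request a client could send. It only builds the request and does not send it. Treat the details as assumptions: the helper name `build_rdf_request` is mine, and I am assuming the server answers content negotiation on the Accept header with RDF/XML.

```python
from urllib.request import Request

MUSIC_DATA_SPACE = "http://zitgist.com/music/"


def build_rdf_request(uri: str) -> Request:
    """Build (but do not send) an HTTP GET that asks for an RDF/XML description."""
    # Assumption: the server honours content negotiation on the Accept header.
    return Request(uri, headers={"Accept": "application/rdf+xml"})


# Madonna's URI in the Music Data Space.
request = build_rdf_request(
    MUSIC_DATA_SPACE + "artist/79239441-bfd5-4981-a70c-55c3f15c1287"
)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))

# To actually dereference the URI (needs network access and a live server):
#   from urllib.request import urlopen
#   rdf_xml = urlopen(request).read()
```

The point of the sketch is that the same URI both names the musical thing and gives you a place to ask for its description.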

Other Data Spaces available

Conclusion

The Music Data Space is the starting point, and many other types of Data Spaces should emerge soon.

Browsing Musicbrainz’s dataset via URI dereferencing

Musicbrainz’s dataset can finally be browsed, node-by-node, using URI dereferencing.

What does this mean?

Since the Musicbrainz relational database has been converted into RDF using the Music Ontology, all relations existing between Musicbrainz entities (an entity can be a Music Artist, a Band, an Album, a Track, etc.) form a graph of musical relations. Each node of the graph is a resource and each arc is a property between two resources. Welcome to the world of RDF.
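The "nodes and arcs" idea can be sketched in a few lines of Python. The three triples below are illustrative only: they are written by hand for this example and are not an extract of the real dataset, and the helper `outgoing_arcs` is a made-up name.

```python
from collections import defaultdict

FOAF = "http://xmlns.com/foaf/0.1/"
BEATLES = "http://zitgist.com/music/artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d"
MCCARTNEY = "http://zitgist.com/music/artist/ba550d0e-adac-4864-b88b-407cab5e76af"

# Hand-written (subject, property, object) triples, for illustration only.
triples = [
    (BEATLES, FOAF + "name", "The Beatles"),
    (BEATLES, FOAF + "member", MCCARTNEY),
    (MCCARTNEY, FOAF + "name", "Paul McCartney"),
]


def outgoing_arcs(graph, node):
    """Return {property: [objects]} for every arc that leaves `node`."""
    arcs = defaultdict(list)
    for subject, prop, obj in graph:
        if subject == node:
            arcs[prop].append(obj)
    return dict(arcs)


# Browse node by node: start at the band, follow foaf:member, read the name.
member = outgoing_arcs(triples, BEATLES)[FOAF + "member"][0]
member_name = outgoing_arcs(triples, member)[FOAF + "name"][0]
print(member_name)
```

Dereferencing a URI plays the role of `outgoing_arcs` here: it hands back the arcs that leave one node, and each object URI is the next node you can visit.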

madonna-rdf-description.jpg

This means that from a resource “Madonna” we can browse the graph of musical relations to find other entities such as Records, People, Bands, etc.

Kingsley, inspired by Diana Ross, said: “URI Everything, and Everything is Cool!”

This is cool! Now Diana Ross has her own URI on the semantic web: http://zitgist.com/music/artist/60d41417-feda-4734-bbbf-7dcc30e08a83

Paul McCartney:
http://zitgist.com/music/artist/ba550d0e-adac-4864-b88b-407cab5e76af

The Beatles:
http://zitgist.com/music/artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d

Madonna:
http://zitgist.com/music/artist/79239441-bfd5-4981-a70c-55c3f15c1287

They have their own URIs too!


URIs for Musical Things

These URIs are not only used to refer to Musicbrainz entities. They can be used to refer to any musical entity that you can describe using the Music Ontology. In the near future, the Musicbrainz data will be integrated along with data from Jamendo and Magnatune. Later on, we will be able to integrate any sort of musical data in the same place (radio station data, relations between user FOAF profiles and musical things, etc.). So from a single source (http://zitgist.com/music/) all these different sources of musical data will be queryable at once.

mbz-magnatune-jemendo-rdf.jpg

URI schemes

The URI schemes are defined in the Musicbrainz Virtuoso RDF View:

  • http://zitgist.com/music/artist/*******
  • http://zitgist.com/music/artist/birth/*******
  • http://zitgist.com/music/artist/death/*******
  • http://zitgist.com/music/artist/simlink/*******
  • http://zitgist.com/music/record/*******
  • http://zitgist.com/music/performance/*******
  • http://zitgist.com/music/composition/*******
  • http://zitgist.com/music/musicalwork/*******
  • http://zitgist.com/music/sound/*******
  • http://zitgist.com/music/recording/*******
  • http://zitgist.com/music/signal/*******
  • http://zitgist.com/music/track/*******
  • http://zitgist.com/music/track/duration/*******

The terms used in these URI schemes refer to the descriptions of the corresponding Music Ontology classes.
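As a rough illustration of how these schemes fit together, the following Python sketch builds a URI from a kind of musical thing and an identifier. The path prefixes are copied from the list above; the dictionary keys and the function name `music_uri` are hypothetical, and I am assuming that the `*******` placeholder stands for the identifier of the entity (for artists, the examples earlier in this post use the Musicbrainz ID).

```python
BASE = "http://zitgist.com/music/"

# Path prefixes copied from the list of URI schemes; the keys are made up.
URI_PATHS = {
    "artist": "artist/",
    "artist_birth": "artist/birth/",
    "artist_death": "artist/death/",
    "artist_simlink": "artist/simlink/",
    "record": "record/",
    "performance": "performance/",
    "composition": "composition/",
    "musicalwork": "musicalwork/",
    "sound": "sound/",
    "recording": "recording/",
    "signal": "signal/",
    "track": "track/",
    "track_duration": "track/duration/",
}


def music_uri(kind: str, identifier: str) -> str:
    """Build a Music Data Space URI for one kind of musical thing."""
    try:
        path = URI_PATHS[kind]
    except KeyError:
        raise ValueError(f"unknown kind of musical thing: {kind!r}") from None
    return BASE + path + identifier


# The Beatles, using the identifier from the URI shown earlier.
print(music_uri("artist", "b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d"))
```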

Conclusion

I am getting closer and closer to the first goal I set for myself when I started to write the Music Ontology. This first goal was to make the Musicbrainz relational database available in RDF on the Web. Months later, and with the help of the Music Ontology community (especially Yves Raimond, who worked tirelessly on the project) and the OpenLink Software Inc. team, we have finally made this data available through URI dereferencing.

From there, we will build new music services, integrate more musical datasets into the Music Data Space, etc. It is just the beginning of something much bigger.

Free text search on Musicbrainz literals using Virtuoso RDF Views

A couple of weeks ago I introduced a Virtuoso RDF View that maps the Musicbrainz relational database into RDF using the Music Ontology. Now I will show some query examples involving a special feature of these Virtuoso RDF Views: full text search on literals.

How RDF Views work

A Virtuoso RDF View can be seen as a layer between a relational database schema and its conceptualization in RDF. The role of this layer is to convert relational data into its RDF conceptualization.

That is it. You can see it as a conversion tool, or as a sort of lens for seeing RDF data in relational data.
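The "lens" idea can be pictured with a small Python sketch that turns one relational row into triples. This is only an analogy: a real Virtuoso RDF View is a declarative mapping evaluated by the server at query time, not a Python function. The column names `gid` and `name`, the sample row, and the function name are assumptions made for the example.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
MO_MUSIC_ARTIST = "http://purl.org/ontology/mo/MusicArtist"
ARTIST_URI_PREFIX = "http://zitgist.com/music/artist/"


def artist_row_to_triples(row: dict) -> list:
    """Map one row of a (hypothetical) artist table to RDF-style triples."""
    # Assumption: the row carries an identifier column `gid` and a `name` column.
    subject = ARTIST_URI_PREFIX + row["gid"]
    return [
        (subject, RDF_TYPE, MO_MUSIC_ARTIST),
        (subject, FOAF_NAME, row["name"]),
    ]


# A sample row written by hand for the example.
row = {"gid": "79239441-bfd5-4981-a70c-55c3f15c1287", "name": "Madonna"}
for triple in artist_row_to_triples(row):
    print(triple)
```

The relational data stays where it is; the view only describes how each column should look when it is seen as RDF.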

How full text search over literals works

OpenLink Software recently introduced a full text feature in Virtuoso’s SPARQL processor through the “bif:contains” operator (it is used in the SPARQL syntax like a FILTER).

When a user sends a SPARQL query using the bif:contains operator against a Virtuoso triple store, the parser will use the triple store’s full text index to perform the full text search over the queried literal.

With a Virtuoso RDF View, instead of using the triple store’s full text index, the parser will use the relational database’s full text index (if the relational database supports full text indexes, naturally).

Some query examples

In this section I will show you how the full text feature of the Virtuoso RDF Views can be used to increase the performance of a query against the Musicbrainz RDF View modeled using the Music Ontology.

Note: if the system asks you for a login and a password to see the page, use the login name “demo” and the password “demo” to see the results of these SPARQL queries.

Example #1

A user remembers that the first name of a music artist is Paul, and that one of the albums composed by this artist is Press Play. So this user wants to get the full name of this artist with the following SPARQL query:

sparql
define input:storage virtrdf:MBZROOT
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX mo: <http://purl.org/ontology/mo/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?artist_name ?album_title
FROM <http://musicbrainz.org/>
WHERE
{
?artist rdf:type mo:SoloMusicArtist .
?artist foaf:name ?artist_name .
?artist mo:creatorOf ?album .

?album rdf:type mo:Record .
?album dc:title ?album_title .

FILTER bif:contains(?artist_name, "Paul") .
FILTER bif:contains(?album_title, "Press and Play") .
};

Results of this query against the musicbrainz virtuoso rdf view

As you can see, this query uses the full text capabilities of Virtuoso over two different literals: the objects of the properties foaf:name and dc:title.

Example #2

In this example, the user wants to know the names of the albums published by Madonna between 1990 and 2000. The answer to this question is returned by the following SPARQL query:

sparql
define input:storage virtrdf:MBZROOT
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX mo: <http://purl.org/ontology/mo/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT DISTINCT ?albums_titles ?creation_date
FROM <http://musicbrainz.org/>
WHERE
{
?madonna rdf:type mo:SoloMusicArtist .
?madonna foaf:name ?madonna_name .
FILTER bif:contains(?madonna_name, "Madonna") .

?madonna mo:creatorOf ?albums .
?albums rdf:type mo:Record .
?albums dcterms:created ?creation_date .
FILTER ( xsd:dateTime(?creation_date) > "1990-01-01T00:00:00Z"^^xsd:dateTime ) .
FILTER ( xsd:dateTime(?creation_date) < "2000-01-01T00:00:00Z"^^xsd:dateTime ) .
?albums dc:title ?albums_titles .
};

Results of this query against the musicbrainz virtuoso rdf view

Here the user uses the full text capabilities of the Virtuoso RDF Views to find artists with the name Madonna, and two filters on xsd:dateTime objects to find the albums that were created between 1990 and 2000.

Example #3

In this last example, the user wants to know the names of the members of the music group U2.

sparql
define input:storage virtrdf:MBZROOT
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX mo: <http://purl.org/ontology/mo/>
SELECT ?band_name ?member_name
FROM <http://musicbrainz.org/>
WHERE
{
?band rdf:type mo:MusicGroup .
?band foaf:name ?band_name .
?band_name bif:contains '"U2"' .
?band foaf:member ?members .
?members rdf:type mo:SoloMusicArtist .
?members foaf:name ?member_name .
};

Results of this query against the musicbrainz virtuoso rdf view

Here the user uses the full text feature to find the music group by name; the names of the members related to this music group (or these music groups, if several match) are then returned as well.

Special operators of a full text search

Some full text operators can be used in the literal parameter of the bif:contains clause. The operators are the same as the ones used in the full text feature of Virtuoso’s relational database. A list and a description of the operators can be found on that page.

I would only add that the near operator is defined as +/- 100 characters from the searched literal, and that the wildcard ‘*’ operator has to be preceded by at least three characters of the literal. So, “tes*t”, “tes*” and “test*” are legal usages of the wildcard operator, but “*test”, “t*” and “te*st” are illegal.
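The wildcard rule is easy to get wrong, so here is a small Python sketch that checks a search term against it before you put it in a query. It encodes only the rule as I stated it above (at least three characters before the first ‘*’); the function name is hypothetical and this is not Virtuoso’s own validation code.

```python
MIN_PREFIX = 3  # characters required before the first '*', per the rule above


def wildcard_is_legal(term: str) -> bool:
    """Return True if `term` has no wildcard, or has at least three
    characters in front of its first '*'."""
    position = term.find("*")
    if position == -1:
        return True  # no wildcard at all, nothing to check
    return position >= MIN_PREFIX


for term in ["tes*t", "tes*", "test*", "*test", "t*", "te*st"]:
    verdict = "legal" if wildcard_is_legal(term) else "illegal"
    print(f"{term}: {verdict}")
```

Running it reproduces the six examples from the paragraph above: the first three are reported as legal and the last three as illegal.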

Conclusion

Finally, as you can see, the full text feature available with the Virtuoso RDF Views is an essential feature that people should use to increase the performance of their SPARQL queries. The only two other options are: (1) matching against a plain literal, which has to be spelled exactly and with the right case, and which makes such queries close to useless; and (2) using a FILTER with a regular expression and the “i” flag, which is far too slow for normal usage.

Musicbrainz Relational Database mapped to RDF using the Music Ontology

I am pleased to publish some information about the mapping of the Musicbrainz relational database into RDF using the Music Ontology, as I promised some time ago. I know that I am late on this one, but I was waiting for some things to be released before publishing this blog post.

This is the first step we have to take before getting a “physical” RDF dump of the Musicbrainz data. This first step is to use a Virtuoso RDF View to view the Musicbrainz relational database as an RDF triple store.

Introduction to Virtuoso RDF Views

Carl Blakeley of OpenLink Software Inc. just published a first Virtuoso RDF View tutorial called “Mapping Relational Data to RDF with Virtuoso’s RDF Views”. This article explains how to define RDF Views inside Virtuoso and how they work.

The first step would be to read that document to make sure you understand how the mapping of the Musicbrainz data into RDF has been performed using Virtuoso.

RDF/XML presentation of the mapping

I have written an RDF/XML file explaining where in the Musicbrainz database schema the data used to create the actual RDF View comes from. This is a good starting point to “feel” how the Music Ontology can be used to express musical things such as Artists, Bands, Records, Tracks, etc., and to see how the music creation workflow that underpins the Music Ontology is used in that case.

The Musicbrainz RDF View

This is the RDF View that lets the Musicbrainz relational database be viewed as an RDF source that can be queried using SPARQL. This view virtualizes the descriptions of mo:MusicArtist, mo:MusicGroup, mo:Record and mo:Track, as well as mo:Performance, mo:Signal, mo:Composition, etc.

Using the RDF View

Installing the Musicbrainz Database instance (the quick guide)

The first step is to download the Musicbrainz DB and to install it on a PostgreSQL server instance. Follow these steps.

Note: I will try to make that guide as short as possible, so if there are steps that you don’t understand or that don’t work for you, please leave a comment on this blog post or send me an email.

Installing Virtuoso

To use the RDF View, you will first have to install Virtuoso 5.0 on your computer. OpenLink Virtuoso comes in two flavours: Open Source and Commercial. The difference, besides the obvious, is that the commercial version includes Virtual Database functionality, which makes the following step easier, as the relational data may remain in the PostgreSQL database.

Linking PostgreSQL tables to Virtuoso via ODBC

For the Open Source Edition:

With the Virtuoso Open Source Edition 5.0, you will have to export the data from the PostgreSQL server and import it into Virtuoso’s native DBMS.

For the Commercial Edition:

Once the Virtuoso instance is running, open a browser window to access Conductor by going to http://localhost:8890/conductor/. This is a web-based DBMS manager like phpMyAdmin, but for Virtuoso. You may then use it to attach the tables through ODBC.

Note: you should have a PostgreSQL ODBC driver installed to perform the following steps.

You should see the PostgreSQL instance connection in the list. You only have to click on “connect” and enter the credentials, and the Virtuoso server should get connected to the running PostgreSQL instance.

After that, click on “External Linked Objects” to link the remote PostgreSQL tables into Virtuoso. Pay special attention to the schema names created by these links. The remote tables should be available via the schema “DB.[ODBC driver name].[remote table name]”

These Musicbrainz tables should be linked into Virtuoso:

track, albumjoin, album, albummeta, artist, artist_relation, artistalias, album_amazon_asin, country, l_album_url, l_artist_artist, l_artist_track, l_artist_url, l_track_track, l_album_album, l_album_artist, l_track_url, language, release, url, puid, puidjoin.

Installing the RDF View in Virtuoso

Before continuing, you will have to make a small modification to the RDF View document. You should replace all occurrences of the string “DB.MO.” with “DB.[name of the DSN entry].”. This tells the RDF View where to take the relational data from (in this case, from a remote PostgreSQL server instance).
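If you prefer not to do this replacement by hand, a plain string substitution is enough. Here is a minimal Python sketch; the function name, the sample line and the DSN name `pgmbz` are hypothetical, so use the name of your own DSN entry.

```python
def retarget_view(view_source: str, dsn_name: str) -> str:
    """Point every "DB.MO." table reference at "DB.<dsn_name>." instead."""
    return view_source.replace("DB.MO.", f"DB.{dsn_name}.")


# Made-up one-line excerpt; the real RDF View document is much longer.
sample = "from DB.MO.artist as artists, DB.MO.album as albums"
print(retarget_view(sample, "pgmbz"))

# For the real document you would read the file, call retarget_view on its
# content, and write the result back before pasting it into iSQL.
```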

Now click on the first item in the left sidebar menu “Interactive SQL (iSQL)”.

The next step is to copy the modified RDF View code into this iSQL window and then click RUN.

After one or two minutes, the view should be defined in Virtuoso.

Testing the view

Now the only thing left to do is to test this new RDF View. Run this simple SPARQL query inside iSQL to make sure that you get triples from the view:

sparql
define input:storage virtrdf:MBZROOT
select *
from <http://musicbrainz.org/>
where
{
?s ?p ?o.
};

From here, you can query this RDF View as you would query any triple store using SPARQL. Check out the Music Ontology Wiki for some examples of how this RDF graph can be queried.

Conclusions

The RDF View that converts the Musicbrainz relational database into RDF is interesting in many respects. First of all, we have a good representation of the Musicbrainz data in RDF using the Music Ontology. But this example also shows precisely how relational data can, somewhat easily, be converted into RDF.