For the last couple of years, I kept reminding myself to put the titles, authors and ISBNs of my library into a database, so that I could tell my insurer: here is the list of books I had before the thing that happened in my apartment and destroyed all of them. Considering the time it would have taken, I always put this work off.
Then recently the thought started to haunt me again, so I asked my girlfriend: would you like to do this for me before we move to the other apartment? Naturally, she said yes 🙂 So I explained to her how we would proceed in order to save some time and to archive as much information as possible about these books.
I told her: “Take this laptop and open Firefox. Do you notice the small “Z” icon in the lower right corner of the screen? This is Zotero; this software will save you a lot of time in getting the work done.” Naturally, she was dubious.
So I told her to go to Amazon.com, to take the ISBN of each book, to go to the book’s Amazon web page, and to click on the Zotero icon to save the information about the book. So, in two clicks, we were saving all the information describing each book: its title, its authors, its ISBN, its publisher, etc. It took about 30 seconds per book.
With that procedure, we archived all the information about the books in my library in about 5 hours.
Then I told myself: fantastic, now I even have all this information in RDF, thanks to Zotero! I had to do something with that, so I loaded the Zotero RDF export file into a Virtuoso triple store. In less than a minute I had all the information about my books inside a triple store, ready to be queried with SPARQL.
Querying and browsing my book library
- Here is the list of the books in my library and their authors. You will notice a “book_uri” and an “author_uri” column. By using these columns, you can browse information about each book and each author in the list. All you have to do is click on the “link”; a contextual window will appear, and then you click on the “Explore” link. That way, you can browse information about each resource. If you want to go back to the previous window (the results), you only have to click on the small left arrow at the top of the page.
- Here is the list of books and their publishers.
- My library is composed of 115902 pages. This query is possible thanks to the aggregate capabilities of Virtuoso’s SPARQL parser.
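To give an idea of what these queries do, here is a minimal sketch in plain Python rather than SPARQL. It is not the query I ran against Virtuoso: the triples, the URIs, the `ex:numPages` predicate and the page counts are all hypothetical, and the tiny `match` helper only imitates how a triple pattern with wildcards selects data.

```python
# Hypothetical triples (subject, predicate, object). The URIs, predicate
# names and page counts are illustrative, not taken from the real export.
TRIPLES = [
    ("book:1", "dc:title", "Book One"),
    ("book:1", "dc:creator", "author:a"),
    ("book:1", "ex:numPages", 320),
    ("book:2", "dc:title", "Book Two"),
    ("book:2", "dc:creator", "author:b"),
    ("book:2", "ex:numPages", 180),
    ("author:a", "foaf:name", "Author A"),
    ("author:b", "foaf:name", "Author B"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def books_with_authors(triples):
    """Join each book to its author, like a SELECT over several patterns."""
    rows = []
    for book_uri, _, author_uri in match(triples, p="dc:creator"):
        title = match(triples, s=book_uri, p="dc:title")[0][2]
        name = match(triples, s=author_uri, p="foaf:name")[0][2]
        rows.append((book_uri, title, author_uri, name))
    return rows


def total_pages(triples):
    """Add up the page counts, like an aggregate over one triple pattern."""
    return sum(o for _, _, o in match(triples, p="ex:numPages"))


if __name__ == "__main__":
    for row in books_with_authors(TRIPLES):
        print(row)
    print("total pages:", total_pages(TRIPLES))
```

The first function corresponds to the books/authors list, with its “book_uri” and “author_uri” columns, and the second one to the page count, which is a single sum over all the books.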
Linking the authors’ quotes to each book of the library
Then I wanted to know what the authors who wrote the books in my library have said. So I took the QuotationsBook.com quotes database and linked it to the information I have about my library.
That is why you can read some quotes by Nietzsche on that web page. (Note that I created the totally arbitrary “foaf:quote” property to add the quotes to Zotero’s author resource.)
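The linking step can be pictured with a small Python sketch. It assumes that quotes are matched to authors by exact name, which is only one possible way to do it; the author URIs, names and quote texts below are placeholders, and `attach_quotes` is a hypothetical helper, not part of Zotero or Virtuoso.

```python
# Hypothetical author resources and quotes; every value is a placeholder.
AUTHOR_TRIPLES = [
    ("author:a", "foaf:name", "Author A"),
    ("author:b", "foaf:name", "Author B"),
]
QUOTES_BY_NAME = {
    "Author A": ["First placeholder quote.", "Second placeholder quote."],
    "Author C": ["Quote by someone who is not in the library."],
}


def attach_quotes(triples, quotes_by_name, quote_property="foaf:quote"):
    """Return the triples plus one quote triple per matching author name.

    Assumption: an author and a quote belong together when the names are
    exactly equal. Real data would need a more tolerant comparison.
    """
    linked = list(triples)
    for subject, predicate, obj in triples:
        if predicate == "foaf:name":
            for quote in quotes_by_name.get(obj, []):
                linked.append((subject, quote_property, quote))
    return linked


if __name__ == "__main__":
    for triple in attach_quotes(AUTHOR_TRIPLES, QUOTES_BY_NAME):
        print(triple)
```

Authors without quotes are left untouched, and quotes by people who are not in the library are simply ignored.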
Getting more information about the authors
Then I needed to get more information about the authors I read. To get that additional information, I linked the information I have about the authors in the library with the DBpedia (the RDF version of Wikipedia) database.
The result is quite impressive. Go to the books/authors page. Then, click on Nietzsche’s URI (rdf:#$kajXe; it’s the first line in the result table). Then click on the “Explore” link once the contextual window appears. From there, you will see a “sameAs” property, so click on the http://dbpedia.org… link. Then click on the “Get Data Set (Dereference)” link once the contextual window appears. That is it: you get all the information available on Wikipedia about this author in my library.
Now I know that Nietzsche was born in 1844 and died in 1900, etc. So I can browse this new and enhanced dataset to learn facts about authors I have read in the past.
The idea here is to state that the author described by Zotero (in the RDF export file created by the software) is the same as the one in DBpedia. Knowing that the entity defined in Zotero is the same as the one defined in DBpedia tells us that the facts about the former and the latter are true for both entities (because in reality both entities, despite their different URIs, are the same).
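Here is a minimal Python sketch of that “sameAs” reasoning. It only illustrates the principle and is not how Virtuoso implements it; the two URIs, the `ex:` predicates and the helper functions are hypothetical, while the two years are the ones mentioned above.

```python
SAME_AS = "owl:sameAs"

# Hypothetical triples: one resource from the Zotero export, one from the
# encyclopedia dataset, and a sameAs link between them.
TRIPLES = [
    ("zotero:author", "foaf:name", "Nietzsche"),
    ("zotero:author", SAME_AS, "ex:encyclopedia-author"),
    ("ex:encyclopedia-author", "ex:birthYear", 1844),
    ("ex:encyclopedia-author", "ex:deathYear", 1900),
]


def equivalents(triples, uri):
    """Return every URI linked to `uri` by sameAs, in either direction."""
    seen = {uri}
    frontier = [uri]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p != SAME_AS:
                continue
            if s == current and o not in seen:
                other = o
            elif o == current and s not in seen:
                other = s
            else:
                continue
            seen.add(other)
            frontier.append(other)
    return seen


def facts_about(triples, uri):
    """Collect the facts stated about `uri` or about any equivalent URI."""
    same = equivalents(triples, uri)
    return {(p, o) for s, p, o in triples if s in same and p != SAME_AS}


if __name__ == "__main__":
    for fact in sorted(facts_about(TRIPLES, "zotero:author"), key=str):
        print(fact)
```

Asking for the facts about either URI gives the same answer, which is exactly what saying that the two entities are the same means.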
From there, we can think about integrating the current data with any other type of dataset. One of them could be a geographical dataset such as Geonames.org.
In fact, we know that Nietzsche died in the city of Weimar. So, if we link the Geonames dataset with the DBpedia dataset (it is supposed to be done already, but it seems that some things changed in the DBpedia dataset and that the links are no longer available; anyway, it can easily be done), we could get much more information about the place where Nietzsche died.
So, as you can see, in a couple of hours, I was able to digitize my library. Then I was able to get quotes by the author of each of my books. Then I was also able to get more information about each author I have read.
This is really fantastic. That way, I only have to browse this new dataset to find new facts about authors and books that I didn’t know before, and that would have taken me days to find (for my entire library). Thanks to the semantic web, all of this was possible in only a couple of hours.
We could push the experiment even further and display on a map where the authors of my books were born. That way, I could find out where most of the authors I have read in my life were born. Do I mostly read books written in Europe, the United States or Canada? Where did part of my knowledge come from? Which parts of the world have influenced me? Etc.