This is what I wanted to participate to.
The Open Library is a project that wants to archive information about every book (probably writings) created by mankind. Such a strong vision is naturally closely related to the semantic web.
I contacted Aaron Swartz about this project. I wanted to know what were their plans about making all this data available on the semantic web; what was their plan to describe these books into RDF.
I wanted to participate to the project by describing their information into RDF using the Bibliographic Ontology.
So it is what I started to do. Aaron sent me some snapshots of data using their current database schema (this schemas should be updated soon). Then I described one of them using BIBO. As you will see bellow, the ontology neatly describes the Open Library data and enable us to query, at the same time, the Open Library’s data, the data about the articles I wrote, eventually the Zotero citations if they choose to use BIBO, etc.
So, bellow is my proposition to Aaron and to the Open Library Project. From this post, we will be able to discuss about the implications, how this could be done, how the data could be made available for querying and browsing, etc.
How to Cook Revised Edition described using RDF and BIBO
The RDF/N3 example is available here: How to Cook Revised Edition in RDF/N3
Describing this data using BIBO leaded me to find out how to describe topical subjects of documents. It is a discussion we (the BIBO development community) already had, and here I think I found a solution.
Describing topical subjects for a bibo:Document
The goal is to relate a document resource with the concepts describing their topics. There are many ways to describe subjects of documents: it could be with a literal, a class, an individual, etc.
What I am proposing here is to re-use the dcterms:subject property (has we already do) to relate a bibo:Document with the concept of a taxonomy that will acts has the topical subject of a document.
The Open Library is using the BISAC subject standard to relate books with their topics. What I have done is to describe the BISAC standard as a taxonomy in RDF using SKOS. The resulting RDF is: BISAC taxonomy snapshot.
As you can notice, the BISAC taxonomy structure is well-described using SKOS concepts. The relation between these concepts is described as well. Also, the dcterm:identifier property is used to link a concept with its BISAC identifier.
From there, we only have to use the BISAC URIs to link a bibo:Document to its subjects like:
dcterms:subject <http://purl.org/ontology/bibo/bisac#Cooking_Regional_and_Ethnic_American_General> ;
dcterms:subject <http://purl.org/ontology/bibo/bisac#Cooking_General> ;
This is simple and effective. Also, we are not limited to the BISAC taxonomy; one can use the taxonomy he wants to describe subjects of its documents.
Some SPARQL queries
Nothing is better than SPARQL queries to “feel” the power of these RDF descriptions.
The following query will display the documents’ title and the contribution role of Raymond Sokolov. So, if Raymond contributed to some documents as an author and editor, and all these documents will be returned in the resultset:
Finding documents where Raymond Sokolov contributed
The following query is a variable of the above. It will returns all the documents’ title where Raymond is an author.
Eventually we could also use the bibo:position to know all the documents wrote by Raymond where its author position if less than 2 (so, where he is a primary or secondary author of a document).
Queries related to documents and their subjects
If a user only has the BISAC identification number of a concept, and that he needs to find books about this topic, then he only has to run this query to get the titles with that topic:
However, it is not really handy. What if I only want books about “cooking”? There is a way to go:
That way, you will get all the “cooking” related concepts from the taxonomy, and you will find all the related books.
Note that there are many other ways to go such as browsing the graph of concepts using the skos:narrower and skos:broader properties from a given skos:Concept. However, the query above is simple and effective.
Otherwise you can create a full set of other simple and effective queries by searching all the published books, all the published books by a given author or editor, etc.
There is no limit when all that information is available in RDF and BIBO.
More descriptions of the Open Library using BIBO
If you take a closer look at the current database schemas of the Open Library Project, you will notice that have data about “series”, “notes”, and other things. I don’t have such an example in hands at the moment, but we have to keep in mind that we can easily describe them using BIBO as well.
I described how RDF and The Bibliographic Ontology could be use to describe data from The Open Library Project. Doing this would enable them to easily and effectively publish their data so that other people and applications could take advantage of it.
We also found that it is a powerful method that we can easily use to search complex graphs of relations created by such data described in RDF and BIBO.
Finally, having all this data available in BIBO will enable us to easily merge it with other document data sources such as Zotero or any other writings described using RDF and BIBO. As a final example, we could, for example, find all the documents that Raymond Sokolov contributed to create, as an author, and editor, or whatever. With a single query, once could find out that he wrote some published books, and that he authored some posts on its blog. All that thanks to the RDF, BIBO, SPARQL and all the data sources exporting their data using RDF and BIBO.