Archive for the 'Bibliographic Ontology' Category

The Bibliographic Ontology 1.0

After months of development and nearly 1000 messages on the mailing list exchanged between 83 participants, the first version of The Bibliographic Ontology has just been published.

This is an important milestone for this project. It was postponed week after week to make sure that the ontology was expressive enough to handle all kinds of scenarios for all kinds of bibliographic projects. We finally reached a consensus and published the first version of this ontology.

I am quite pleased to release it after nearly one year of development. We have a solid basis that can easily be extended to cope with more specialized bibliographic needs. We already know of some projects (such as Zotero; thanks Connie) that are planning to use BIBO to describe things related to documents and collections of documents in RDF.

Ontology Resources

Many resources exist to help people to use this ontology to describe bibliographic things.

  • Ontology documentation - the human-readable documentation of the ontology.
  • Ontology description - the RDF/N3 description of the ontology. (Note: all URIs are dereferenceable.)
  • Mailing list - the place where people ask questions about how to use the ontology, suggest extensions to it, and report potential issues.
  • Wiki - the place to archive references, write examples and collect other material related to the ontology.
    • Examples - the place to write BIBO examples.
  • Google Code Repository - the place to download the latest working draft of the ontology. People can also download tools related to the ontology there.

Conclusion

I would like to thank everybody who participated in the mailing list and the wiki. Many people put much time and thought into this ontology, and this release would not have been possible without their professional work. This is a really complex domain and countless hours have been spent on this project. It is not an end; it is just the beginning.

Please send any questions, comments, suggestions and issue reports to the mailing list.

I would like to personally thank Bruce, Yves, Patrick, Connie, Elena, Mark (I am missing others, please forgive me), and all the others for making this happen.

Blogs, Wordpress, Zitgist and the Semantic Web

Every link has a relation on the Semantic Web. Each time a person creates a link from one web page to another, they do much more than simply link… In fact, the Web and the Semantic Web are starting to mesh together.

The meshing is occurring at the level of the URI, or more specifically at the level of the URL if we are talking about the Web. This is what I will show you in this post using a Wordpress plug-in I developed using Zitgist technologies.

Motivations driving the development of the plug-in

The first objective of this project is to find out how people could integrate semantic web concepts and principles into the systems they use daily. How can we integrate the semantic web into blogs, for example? Are semantic web technologies only good for publishing content in RDF? That is certainly one use, but I doubt it is the only one. That is why I put some time into developing this prototype.

The second motivation is to create a good prototype of a system using Zitgist’s architecture to show people how they can take advantage of Zitgist to develop their projects; to make their vision a reality.

Some background thinking about the plug-in

On the Web, people mainly manipulate web page resources. They locate them on the Web using a unique locator, called a URL. On the semantic web, on the other hand, people do not only manipulate documents; they manipulate many kinds of Things, many kinds of resources. They refer to them using URIs. The difference between a URI and a URL is that a URL is resolvable on the Web, while a URI is not necessarily resolvable (in fact, a URI is the super-class of a URL). However, best practices suggest making URIs resolvable (dereferenceable) on the Web; in such a case a URI is a URL.

Anyway, all this to say that a URI in the semantic web can be a URL on the Web. There are many use cases emerging from that special digital environment. As an example, many people will use the URL of a Wikipedia page as a URI for a topic, an interest, or for many other relations to these concepts. In such a case, the URL of a web page is used to refer to a concept. I don’t want to discuss the basis of this, but it is a fact, and we have to handle it.

Introduction to the Zitgist Wordpress Plug-in

This plug-in is quite simple in appearance, but has some really interesting results for users.

The only thing this plug-in does is show blog readers existing related data for a given URL and, in some cases, enable them to perform actions based on this data.

For example, if I make a link to Tim Berners-Lee’s web page, a user could be interested in having more information about Tim, directly from the article he is reading. There is a lot of data related to Tim on the semantic web.

[Image: timbl.png]

That is it. The plug-in displays information related to the links in a blog post. In this case, it is the people Tim knows and Tim’s profile. The information is shown to users in a contextual menu. The data is requested from Zitgist’s systems and is displayed to the user. It is that simple, yet quite powerful.

The usefulness of the Zitgist Wordpress Plug-in

The plug-in is quite useful in many ways. In fact, it instantly displays related information about a link to readers of the blog. From any blog post, a reader can easily jump to resources related to each link.

Some use cases

Above I said that a URL, a web link, could be much more than it usually appears. So below, I show a couple of use cases illustrating the potential behind the idea.

1. URL as a web page

What happens when a link from a blog post is the URL of a web page? Several things can happen; here is an example:

Check it yourself: The Bibliographic Ontology

[Image: bibo.png]

Here a user can check the webpage directly, or he can jump to related resources. These related resources come from the semantic web. The first one is the description of the project. The following two are the authors of the ontology. The last resources are documents related to the ontology and the “version” of the ontology.

2. URL as a dereferencable URI

For uninitiated readers, I suggest reading this best-practice tutorial explaining how to publish semantic web data on the web.

Sometimes (okay, not that often at the moment, but I hope people will start), people link to resource URIs (that is, URLs that can be dereferenced to get RDF data about the resource, or its web page representation if available).

Check it yourself: URI referring to Frederick Giasson

[Image: fgiasson.png]

The result is that readers have direct access to my profile, articles I wrote, etc.

3. Actionable URL

Sometimes it can be really interesting to be able to act on some URLs. One example is when a web page, or a resource (identified by a URI), refers to a thing that can be bought. For example, something that can be bought on Amazon.com:

Check it yourself: Visualizing the Semantic Web

[Image: amazon.png]

From the blog post, the reader can automatically buy the related resource on Amazon. This is only one possible action, but many others are possible; the only limit is imagination.

Conclusion

The simple links you create from your blog posts to other web pages carry much more related information than you might think. This prototype Zitgist Wordpress plug-in makes these links explicit for your readers.

You only have to read some of my other blog posts to try it yourself. Some results are quite impressive.

I will make this plug-in available for download sometime next week.

This idea has been promoted by Kingsley Idehen for some time now. He calls this idea enhanced anchors, or a++. The idea is simple: enhancing anchors to make the links to a certain resource (URI or URL) explicit, and optionally to perform some action on them.

This prototype is a first try in that direction. Many upgrades should follow so that we really unveil the power of this new kind of linking; of this new way to relate things together, and to make these relations explicit. Please report any bugs, issues, cross-browser problems, comments, suggestions, etc. to me.

The Open Library in RDF using The Bibliographic Ontology

“What if there was a library which held every book? Not every book on sale, or every important book, or even every book in English, but simply every book-a key part of our planet’s cultural legacy.” — The OpenLibrary Project

This is what I wanted to participate in.

The Open Library is a project that wants to archive information about every book (probably writings) created by mankind. Such a strong vision is naturally closely related to the semantic web.

I contacted Aaron Swartz about this project. I wanted to know what their plans were for making all this data available on the semantic web, and how they planned to describe these books in RDF.

I wanted to participate in the project by describing their information in RDF using the Bibliographic Ontology.

So that is what I started to do. Aaron sent me some snapshots of data using their current database schema (this schema should be updated soon). Then I described one of them using BIBO. As you will see below, the ontology neatly describes the Open Library data and enables us to query, at the same time, the Open Library’s data, the data about the articles I wrote, eventually the Zotero citations if they choose to use BIBO, etc.

So, below is my proposition to Aaron and to the Open Library Project. From this post, we will be able to discuss the implications, how this could be done, how the data could be made available for querying and browsing, etc.

How to Cook Revised Edition described using RDF and BIBO

The current use case is a book by Raymond Sokolov: “How to Cook Revised Edition”. It has been straightforward to map this data into BIBO using the current proposition.

The RDF/N3 example is available here: How to Cook Revised Edition in RDF/N3

Describing this data using BIBO led me to find out how to describe the topical subjects of documents. It is a discussion we (the BIBO development community) already had, and here I think I found a solution.

Describing topical subjects for a bibo:Document

The goal is to relate a document resource with the concepts describing its topics. There are many ways to describe the subjects of documents: it could be with a literal, a class, an individual, etc.

What I am proposing here is to re-use the dcterms:subject property (as we already do) to relate a bibo:Document with the concept of a taxonomy that will act as the topical subject of the document.

The Open Library is using the BISAC subject standard to relate books with their topics. What I have done is to describe the BISAC standard as a taxonomy in RDF using SKOS. The resulting RDF is: BISAC taxonomy snapshot.

As you can see, the BISAC taxonomy structure is well described using SKOS concepts. The relations between these concepts are described as well. Also, the dcterms:identifier property is used to link a concept with its BISAC identifier.
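As a rough sketch of what such a SKOS description might look like: the two "Cooking" concept URIs are the ones used in this post, while the parent concept bisac:Cooking, the labels and the identifier value are assumptions made for illustration and are not copied from the actual snapshot.

```turtle
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix bisac:   <http://purl.org/ontology/bibo/bisac#> .

# Hypothetical parent concept, used here only to show the hierarchy.
bisac:Cooking
    a skos:Concept ;
    skos:prefLabel "Cooking"@en ;
    skos:narrower bisac:Cooking_General ,
                  bisac:Cooking_Regional_and_Ethnic_American_General .

bisac:Cooking_General
    a skos:Concept ;
    skos:prefLabel "Cooking / General"@en ;
    skos:broader bisac:Cooking ;
    # Illustrative identifier; check it against the official BISAC list.
    dcterms:identifier "CKB000000" .
```

The skos:broader and skos:narrower links carry the taxonomy structure, and dcterms:identifier keeps the original BISAC code next to the concept URI.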

From there, we only have to use the BISAC URIs to link a bibo:Document to its subjects like:

dcterms:subject <http://purl.org/ontology/bibo/bisac#Cooking_Regional_and_Ethnic_American_General> ;
dcterms:subject <http://purl.org/ontology/bibo/bisac#Cooking_General> ;

This is simple and effective. Also, we are not limited to the BISAC taxonomy; anyone can use the taxonomy they want to describe the subjects of their documents.

Some SPARQL queries

Nothing is better than SPARQL queries to “feel” the power of these RDF descriptions.

Queries related to contributions

The following query displays the documents’ titles and the contribution role of Raymond Sokolov. So, if Raymond contributed to some documents as an author and to others as an editor, all these documents will be returned in the result set:

Finding documents where Raymond Sokolov contributed

The following query is a variation of the above. It returns the titles of all the documents where Raymond is an author.

Finding documents where Raymond Sokolov contributed as an author

Eventually we could also use bibo:position to find all the documents written by Raymond where his author position is less than 2 (so, where he is a primary or secondary author of a document).

Queries related to documents and their subjects

If a user only has the BISAC identification number of a concept and needs to find books about this topic, then he only has to run this query to get the titles with that topic:

Finding documents related to a BISAC identifier

However, this is not really handy. What if I only want books about “cooking”? Here is one way to do it:

Finding documents about “cooking”

That way, you will get all the “cooking” related concepts from the taxonomy, and you will find all the related books.

Note that there are many other ways to go such as browsing the graph of concepts using the skos:narrower and skos:broader properties from a given skos:Concept. However, the query above is simple and effective.

Other queries

Otherwise you can create a full set of other simple and effective queries by searching all the published books, all the published books by a given author or editor, etc.

There is no limit when all that information is available in RDF and BIBO.

More descriptions of the Open Library using BIBO

If you take a closer look at the current database schemas of the Open Library Project, you will notice that they have data about “series”, “notes”, and other things. I don’t have such an example at hand at the moment, but we have to keep in mind that we can easily describe them using BIBO as well.

Conclusion

I described how RDF and The Bibliographic Ontology could be used to describe data from The Open Library Project. Doing this would enable them to easily and effectively publish their data so that other people and applications could take advantage of it.

We also found that it is a powerful method that we can easily use to search complex graphs of relations created by such data described in RDF and BIBO.

Finally, having all this data available in BIBO will enable us to easily merge it with other document data sources such as Zotero or any other writings described using RDF and BIBO. As a final example, we could find all the documents that Raymond Sokolov contributed to, as an author, an editor, or in any other role. With a single query, one could find out that he wrote some published books and that he authored some posts on his blog. All that thanks to RDF, BIBO, SPARQL and all the data sources exporting their data using RDF and BIBO.

Describing Documents, Articles, Series, Volumes and Conferences using the Bibliographic Ontology

The Bibliographic Ontology lets you describe all these things, and much more, in RDF. In the last months the community developing BIBO has been quite fruitful. Many questions have been asked, many have been answered, and things are slowly taking shape.

It is for that reason that I started to create some more examples using the ontology, trying to see how people will use it. I created some examples to see if I could easily describe two articles I wrote in the past few years: (1) an accepted article in a proceeding and (2) a rejected article submitted to a conference. I was wondering if the current state of the ontology could easily cope with some weird cases. As you will see below, it nicely described the weird cases that I encountered while describing these articles.

First example: Describing a Series, with volumes and articles

I wanted to describe an article I wrote with Uldis Bojars, Alexandre Passant and John Breslin. This article is part of a proceeding that is published in a series, as a volume (248). The series has an ISSN; however, it is only published online (no paper version is available).

Here is how BIBO describes such a case:

A Complex series + proceeding + article use case in RDF/XML

The series is a bibo:Series. This series has a title, a short title and an ISSN. Also, it is related to its publisher and has a status (published). Finally, this series is related to its volume and to a web document (a web page) that is a manifestation of the series.

This is something to keep in mind for the remainder of this blog post: in BIBO, a web page is a document, like any other document. The only difference between a paper book and a web page is their identifier (locator): a published paper book will have an ISBN, and a web page will have a URL. This said, we can easily relate the different formats of a document using dcterms:relation. That way, we make explicit the relation between two different documents (even if their only difference is their format: printed on paper, HTML, PDF, etc.).
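A minimal sketch of that idea, assuming two hypothetical documents in an example.org namespace (the URIs and the title are placeholders, not data from the real use case):

```turtle
@prefix bibo:    <http://purl.org/ontology/bibo/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex:      <http://example.org/docs/> .

# The printed edition of a (hypothetical) book.
ex:book-paper
    a bibo:Document ;
    dcterms:title "An Example Book" ;
    dcterms:relation ex:book-web .

# The same content published as a web page.
ex:book-web
    a bibo:Document ;
    dcterms:title "An Example Book" ;
    dcterms:relation ex:book-paper .
```

Both resources are plain bibo:Document instances; only the dcterms:relation statements tell a consumer that they are two formats of the same writing.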

Next, I described the proceeding that has been published. It is a bibo:Proceeding that has some properties, in particular a bibo:volume property that describes its location in the series. Finally, the editors of the proceeding are described and are related to the proceeding they edited via a bibo:Contribution.

Contributions are at the core of the ontology; they are defined as:

“The contribution a person, group or organization makes to the creation or realization of a work.”

So, an editor and an author are contributors to the creation or realization of a work (a document).

Finally I described the article that is a bibo:Article. I described its properties, its authors, and the relation between the authors and the article. I also described its status: it has been peer-reviewed and has been published.

The links between the series, the proceeding and the article have been made by re-using the properties dcterms:hasPart and dcterms:isPartOf.
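The chain of links could be sketched roughly as follows. The class and property names and the volume number come from this post; the URIs and titles are placeholders, and the real RDF/XML example carries many more properties.

```turtle
@prefix bibo:    <http://purl.org/ontology/bibo/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex:      <http://example.org/docs/> .

ex:series
    a bibo:Series ;
    dcterms:title "An Example Workshop Series" ;
    dcterms:hasPart ex:proceeding .

ex:proceeding
    a bibo:Proceeding ;
    dcterms:title "Proceedings of an Example Workshop" ;
    bibo:volume "248" ;              # location of the proceeding in the series
    dcterms:isPartOf ex:series ;
    dcterms:hasPart ex:article .

ex:article
    a bibo:Article ;
    dcterms:title "An Example Article" ;
    dcterms:isPartOf ex:proceeding .
```

Stating both directions (dcterms:hasPart and dcterms:isPartOf) is redundant, but it lets a consumer walk the graph from the series down to the article or from the article up to the series without any inference.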

Second example: a rejected article submitted to a conference

For that second example, I wanted to describe an article I wrote a couple of years ago, that I submitted to a conference and that has been rejected. So, I had to describe the article, the conference, and the fact that it has been rejected after peer-reviewing.

Here is how BIBO describes this use case:

Rejected article submitted to a conference in RDF/XML

This is basically the same thing as the above: describing a document with its authors.

However, in that case, I had to describe a conference. The Bibliographic Ontology uses The Event Ontology to describe such things. The conference event has been described using the event:Event class, along with event:agent, which relates the event to the organization that created it, and event:place, which locates the event in the world.
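As an illustration only, a conference described this way might look like the sketch below. The event: terms are the ones named above; the URIs, labels and the use of rdfs:label are assumptions, and the link between the article and the conference is left out because this post does not name the property used for it.

```turtle
@prefix event: <http://purl.org/NET/c4dm/event.owl#> .
@prefix foaf:  <http://xmlns.com/foaf/0.1/> .
@prefix geo:   <http://www.w3.org/2003/01/geo/wgs84_pos#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:    <http://example.org/events/> .

ex:conference
    a event:Event ;
    rdfs:label "An Example Conference" ;
    event:agent ex:organizing-body ;   # who created the event
    event:place ex:venue .             # where it takes place

ex:organizing-body
    a foaf:Organization ;
    foaf:name "Example Organizing Body" .

ex:venue
    a geo:SpatialThing ;
    rdfs:label "Example City" .
```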

However, the description of conference events will change in the next few weeks, since Yves Raimond and I will create an extension module to this ontology to specifically describe conference events (so, we will talk about event:Conference, event:organizer, event:sponsors, etc.).

Finally, I had something to say about this article I wrote. To say it, I created another type of document called a bibo:Note to annotate this document with some comments. A bibo:Note is a document of its own, like a bibo:Article. However, I relate the two documents (the bibo:Note and the bibo:Article) using the bibo:annotates property. That way, I describe the fact that a document is an annotation of another document.
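A small sketch of such an annotation, with placeholder URIs and text (using dcterms:description to hold the comment is an assumption, not necessarily what the linked example does):

```turtle
@prefix bibo:    <http://purl.org/ontology/bibo/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex:      <http://example.org/docs/> .

ex:rejected-article
    a bibo:Article ;
    dcterms:title "An Example Rejected Article" .

# The note is a document of its own that points at the article.
ex:note
    a bibo:Note ;
    dcterms:description "Some comments about the article." ;
    bibo:annotates ex:rejected-article .
```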

Conclusion

These two examples show how The Bibliographic Ontology can be used to describe some complex bibliographic use cases. It is just a start, and many questions are yet to be answered by the bibliographic ontology. However, many things are moving forward, and if this demonstration interested you, I can only suggest that you join the community supporting BIBO’s development and help it evolve.

News at Zitgist: the Browser, PTSW, the Bibliographic Ontology and the Query Service

The issues we had with the Zitgist Browser’s server did not stop things at Zitgist. In fact, many projects evolved at the same time and I outline some of these evolutions below.

New version of the Zitgist Browser

A new version of the browser is already on the way. In fact, the pre-release version of the browser was a use case; a prototype. Now that we know that it works and that we have faced most of the issues that have to be taken into account to develop such a service, we hired Christopher Stewart to work on the next version of the browser. He is already well into the problem, so you can expect a release of this new version sooner than you might think. At first, there won’t be many modifications at the user interface level; however, many things will be introduced in this new version that will help us push the service to another level in the future.

New version of Ping the Semantic Web

Version 3.0 of the PingtheSemanticWeb.com web service should be put online next week. It will be a totally new version of the service. It won’t use MySQL anymore; Virtuoso has replaced it. The service will now fully validate RDF files before including them in the index. More stats will be available too. It is much faster (as long as remote servers are fast too) and I estimate that this single server could handle between 5 and 10 million pings per day (enough for the next year’s expansion). This said, the service will be pushed to another level and be ready for more serious traffic. After its release, a daily dump of all links will be produced as well.

The first draft of the Bibliographic Ontology

The Bibliographic Ontology Specification Group is on fire. We now have 55 members and generated 264 posts in July alone. Many things are going on here and the ontology is well underway. We expect to release a first draft of the ontology sometime in August. If you are interested in bibliographic things, I think it’s a good place to be.

The Zitgist Semantic Web Query Service

Finally, Zitgist’s Semantic Web Query Service should be available for alpha subscribed users sometime in September. You can register to get your account here. Also, take a look at what I wrote vis-à-vis this search module (many things have evolved since, but it’s a good introduction to the service).

Conclusion

So, many things are going on at Zitgist and many exciting things should happen this autumn, so stay tuned!

The Bibliographic Ontology: a first proposition

This Document is about the creation of The Bibliographic Ontology. It is the first proposition from Bruce D’Arcus and me that should lead to the writing of the first draft of the ontology. Some things have been developed, many questions have been raised, and the discussion that will arise from this first proposition will set the basis for the first draft of the ontology.

The goal of this ontology is simple: creating a bibliographic ontology that will set the basis to describe a document, that is, a writing that provides information. If done well, it will enable other people or organizations to create extension modules that make it expressive enough to describe more specialized sub-domains such as law documents, etc. It also re-uses existing ontologies that already define some properties of documents.

Related materials

1. The proposed OWL/N3 file describing The Bibliographic Ontology (note: read the comment, FG are from me, and BD are from Bruce)
2. An enhanced version of the Zotero RDF dump of the book “Spinning the Semantic Web”, which shows the expressive power of the ontology by extending its content using the bibo:Part class and the locator properties (RDF/XML)
3. Other examples that show other possible descriptions such as the description of events, places, etc. (RDF/N3)

Main concept of the ontology: a Document

The main concept of the ontology is bibo:Document. This class is described as “Writing that provides information” (from Wordnet). So, basically, any writing is a Document. It is equivalent to a foaf:Document and a dcterms:BibliographicResource. These two links are quite important since they will enable us to re-use these two widely used ontologies: FOAF and DCTERMS.

Second main concept: Contributions to these Documents

The second main concept of the ontology is bibo:Contribution. This class is described as “A part played by a person in bringing about a resulting Document”. The goal of this concept is to relate people, by their contributions, to documents they wrote, or helped to write. For now, contributions are defined by three properties:

  1. bibo:role, that defines the role of the contributor: author, translator, publisher, distributor, etc.
  2. bibo:contributor, that links a contribution to its contributor
  3. bibo:position, that losslessly associates a “contribution” level with each contributor. This property is mainly used to sort multiple authors that worked on the writing of a document. More about that in the examples document.

With these two concepts, you can describe any Document and any Contribution to any document. So you can relate any piece of writing to its contributors.
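To make the three properties concrete, here is a minimal sketch of one document with two contributions. The bibo: terms are the ones listed above; everything in the ex: namespace is a placeholder, including ex:hasContribution (this post does not say which property links a document to its contributions) and the role individual ex:author.

```turtle
@prefix bibo:    <http://purl.org/ontology/bibo/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix foaf:    <http://xmlns.com/foaf/0.1/> .
@prefix ex:      <http://example.org/docs/> .

ex:book
    a bibo:Document ;
    dcterms:title "An Example Book" ;
    # Placeholder linking property; the proposal may use a different one.
    ex:hasContribution ex:contribution-1 , ex:contribution-2 .

ex:contribution-1
    a bibo:Contribution ;
    bibo:role ex:author ;            # placeholder role individual
    bibo:contributor ex:alice ;
    bibo:position 1 .                # first author

ex:contribution-2
    a bibo:Contribution ;
    bibo:role ex:author ;
    bibo:contributor ex:bob ;
    bibo:position 2 .                # second author

ex:alice a foaf:Person ; foaf:name "Alice Example" .
ex:bob   a foaf:Person ; foaf:name "Bob Example" .
```

Because each contribution is a resource of its own, the role and the position are attached to the contribution and not to the person, so the same person can be first author of one document and editor of another.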

What is really interesting with this concept (in my opinion) is that it opens the door to much more. In fact, by using this concept, we can now extend the idea and describe many more things about how people contributed to the writing of a document.

From these two concepts, we extended the idea to be able to cope with a larger range of use-cases.

Extensions of bibo:Document

The document class has been specialized into a series of more specialized types of documents, with restrictions of their own:

  • Article
  • LegalCase
  • Manuscript
  • Book
  • Manual
  • Legislation
  • Patent
  • Report
  • Thesis
  • Transcript
  • Note
  • Law

Classes or individuals?

The development of this proposition has been made with Lee W. Lacy’s OWL book quote in mind:

Individuals often mirror “real world” objects. When you stop having different property attributes (and just have different values) you have often identified an object (individual)

This means that if a subclass of a class didn’t have specific restrictions, or if no properties were restricted by using this class in their domain, then the class was dropped and replaced by an individual of the super-class.

One example of this is the type bibo_types:dissertation. A dissertation is a kind of thesis, but since it doesn’t differ from bibo:Thesis in anything other than its meaning, we created an individual of the class bibo:Thesis instead of a subclass. Check the examples document to see what it means concretely.

Collections of documents

Another main concept of the ontology is bibo:Collection. Aggregation is inherent to this concept: its only purpose is to aggregate bibo:Document(s). Entities of this class act as hubs in the RDF graph (network) created out of bibliographic relations (properties).

Other types of collections, with some restrictions of their own, have also been created. These other collections, such as bibo:CourtReporter are intended to be anchor points that can be extended by Bibliographic Ontology Extension Modules of particular specialized sub-domains such as Law documents.

Here is the current list of specialized collections:

  • InternetSite
  • Series
  • Periodical
    • Journal
    • Magazine
    • CourtReporter

Part of Documents

Another important concept is bibo:Part. This concept, along with locators (more about them in the next section), enables us to specify the components of a Document. In fact, sometimes documents are aggregated to create collections, such as journals, magazines or court reporters. However, sometimes documents are embedded within a document (embedded versus aggregated). This is the purpose of bibo:Part: a bibo:Part is a document, but it is also a part of another document. The special property of a bibo:Part is dcterms:hasPart, which relates a bibo:Part to the document it is part of. Check the examples document to see how bibo:Part can be used.

Locating Parts

To support the concept of Parts, a set of properties called “locators” has been created. These locator properties help to describe the relation between a Part and its related Document.

Three of these locators are bibo:volume, bibo:chapter and bibo:page. These properties locate Parts inside documents. For example: a chapter within a book, or a volume within a document that is a set of volumes.

Check the example about the document “The Art of Computer Programming” by Donald Knuth for a good example of how locators can be used.

This said, we could now think of describing a document by its parts, recursively from its volumes to its pages.
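A rough sketch of such a recursive description, assuming a hypothetical multi-volume work (URIs, titles and locator values are placeholders, and using dcterms:hasPart from the containing document to each part is one reading of the proposal):

```turtle
@prefix bibo:    <http://purl.org/ontology/bibo/> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex:      <http://example.org/docs/> .

ex:work
    a bibo:Document ;
    dcterms:title "An Example Multi-Volume Work" ;
    dcterms:hasPart ex:volume-1 .

# A volume is a Part of the work, located by its volume number.
ex:volume-1
    a bibo:Part ;
    bibo:volume "1" ;
    dcterms:hasPart ex:chapter-3 .

# A chapter is a Part of the volume, located by chapter and starting page.
ex:chapter-3
    a bibo:Part ;
    dcterms:title "An Example Chapter" ;
    bibo:chapter "3" ;
    bibo:page "45" .
```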

Open questions

  1. Should we develop the ontology such that we can describe the entire workflow that leads to the creation and (possibly) the publication of a document? All this workflow would be supported by the FRBR principles. At the moment, the ontology only describes the manifestation of a work, and not the work itself or its expression. Take a look at The Music Ontology (its workflow) to see how it could be done for the bibliographic ontology.
  2. Is the creation of classes and individuals of classes the right way to describe types of documents?
  3. Is this the right way, or are there other ways, to describe the contributions of people to the elaboration of documents?

Re-used ontologies

  • DCTERMS: re-used to describe the main properties of documents.
  • FOAF: re-used to describe people and organizations.
  • EVENT: re-used to describe events (example: conferences)
  • TIME: re-used to describe temporal properties
  • wgs84_pos: re-used to describe geographical entities

Conclusion

Please send any feedback, suggestions or comments directly to the mailing list of the group that develops this ontology. This group is intended to create an ontology that would build some type of consensus between people and organizations working with bibliographical data.

Note: I disabled comments on this post only, to make sure that people comment on the mailing list.

Why another Bibliographic Ontology?

This very good question was asked by Peter Mika on the Bibliographic Ontology Specification Group yesterday.

So, why? Peter said:

I’ve read Frederick Giasson’s call for this group on PlanetRDF.com. But before getting started on the actual topic of developing an ontology for bibliographies, my question is: why develop a new ontology? What is lacking in SWRC/BuRST or PRISM that this new ontology would add? I’m asking this, because I’m concerned by (even) more fragmentation in this space.

I am not a domain expert in citations and bibliographic references. In fact, my knowledge in the domain is somewhat limited. However, my recent blog posts about the integration of Zotero into the semantic web raised a lot of questions related to citations and bibliographic ontologies. Bruce D’Arcus appeared from the Zotero web forum, unsatisfied with current ontologies. Bruce knows a lot about all that stuff: he is a domain expert. So I asked Bruce if he would be willing to start the development of a new Bibliographic Ontology project that would answer his needs. In fact, as I noted on my blog and on the wiki, his needs come from real problems: OpenOffice and Zotero.

From there, I put in place the current communication infrastructure to start talking about these problems. In less than 1 day, 17 people subscribed to the mailing list, 11 comments have been posted on my latest blog post, etc.

This tells me that there is a real interest in the question. Why? Possibly because current ontologies don’t work well for everybody.

In fact, it wasn’t working well for me neither. When I tried to see what was the bibliographic ontologies landscape when I worked on that problem for Zitgist, I found that it was the jungle. There was so many possible ways to describe them, to describe what was a document, etc. There were no best practice guides, no examples, etc; people were doing anything they wanted. This was rendering the data useless for Zitgist. This is for that exact reason that I am putting time in that initiative right now.

An example to illustrate the problem

I will illustrate the current problem with bibliographic ontologies with the following example:

I went to the BuRST home page and clicked on one of its examples. I then checked the code and saw some SWRC things. Then I tried to dereference the URI of this ontology to get the schema explaining what these properties were, and looked for the properties and classes: they were not there.

I think this simple example explains many of the problems out there. There is no consistency, no good documentation (I can’t find the good SWRC specification document at the moment), no examples, etc.
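
To make the dereferencing step concrete, here is a minimal Python sketch of the request a client would prepare when it wants the schema behind a term URI. The term URI (http://example.org/ontology#Article) and the function name are hypothetical placeholders, not the actual SWRC or BuRST URIs, and the request is only built, not sent.

```python
from urllib.request import Request

# Media types we would accept for a schema document (an assumption for this sketch).
RDF_ACCEPT = "application/rdf+xml, text/turtle;q=0.8"


def schema_request(term_uri: str) -> Request:
    """Build (but do not send) an HTTP request for the schema behind a term URI."""
    # The fragment part (#Article) is never sent to the server,
    # so the document to fetch is the URI without it.
    document_uri = term_uri.split("#", 1)[0]
    # The Accept header asks the server for RDF rather than an HTML page.
    return Request(document_uri, headers={"Accept": RDF_ACCEPT})


# Hypothetical term URI used only for illustration.
req = schema_request("http://example.org/ontology#Article")
print(req.full_url)
print(req.get_header("Accept"))
```

If the server behind such a URI answers with the schema in RDF, and the schema actually defines the term, a developer can discover what a property means. In the case described above, that lookup came back empty.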

Next wave of users

The next wave of users for these ontologies isn’t computer science students working on academic projects. It is Web developers who have only a basic knowledge of all that stuff. What these people need is good documentation, consistent concepts and methods, good examples and a community backing the development of these projects.

This is not what I find right now.

Community driven ontology development

In answer to Peter’s mail, Bruce said:

The first corresponds to a narrow range of academic users (last I looked it wouldn’t work for the humanities or law), and the second is just a series of properties, mostly already covered by DC and maintained by a fairly closed industry group not very interested in RDF.

Later Chris Bizer wrote on my blog:

yes, it would really be nice to have a community-backed ontology for describing publications which is a bit more Semantic-Webby than Dublin Core. So developing a best practice for mixing DC, FOAF, SIOC and the event ontology would really useful.

Once you guys have developed this best practice, we are happy to change the D2R mapping of our DBLP server (http://www4.wiwiss.fu-berlin.de/dblp/) and the RDF book mashup(http://sites.wiwiss.fu-berlin.de/suhl/bizer/bookmashup/index.html) , so that they export RDF according to your best practice.

I think that these two examples describe what is happening. People are now asking for open communities (could we talk about open-source communities?) to develop these ontologies.

So why this ontology?

The idea here is to develop yet-another-bibliographic-ontology. But the goal isn’t to re-invent the wheel one more time. The goal is to fill in the blanks, to develop a sort of ontology framework designed in such a way that we can easily plug in future extension modules, and to make it interact easily with already existing ontologies. Yes, in RDF you can “theoretically” plug everything into everything, but in reality it is not that simple or effective. This new ontology initiative should also act as a “best practices” guide for describing citations and bibliographic references on the Semantic Web, for developers who have little knowledge of the semantic web.

This is a question of adoption of the semantic web by Web developers. These people just don’t have the time to check all these little “fragmented” ontologies written in OWL, RDFS or whatever, without explicit comments, documentation, examples, etc. This is why microformats are doing so well: because there is clear documentation, good examples, etc. Like microformats or not, they got the attention of developers because there is support, docs, examples and a strong community developing them.

Conclusion

So all these projects (the Music Ontology, the Bibliographic Ontology, the Linked-Open-Data community, etc.) make me wonder: now, as I write this, are the challenges that the Semantic Web has to face more social than technical?

I think now is the time to show the world that these things work, and work quite well. Unfortunately for some people, we will have to ask these questions and create communities supervising such ontology developments. Entrepreneurs will tell you that the client is always right. The clients of ontologies are developers, and they won’t spend their precious time on bric-a-brac projects.

Finally, what I am proposing here is to create an open community to supervise the development of an ontology describing citations and bibliographic references. This community will be composed of domain experts; companies and organizations that want to use the ontology; and developers and hobbyists who have an interest in it. And as I said above: “The goal is to fill in the blanks, to develop a sort of ontology framework designed in such a way that we can easily plug in future extension modules, and to make it interact easily with already existing ontologies. Yes, in RDF you can ‘theoretically’ plug everything into everything, but in reality it is not that simple or effective. This new ontology initiative should also act as a ‘best practices’ guide for describing citations and bibliographic references on the Semantic Web, for developers who have little knowledge of the semantic web.”

The Bibliographic Ontology

Zitgist, Bruce D’Arcus, the Zotero team and Michael K. Bergman have started a new initiative to develop a new ontology for citations and bibliographic references. The idea for the project came up a couple of days ago when we tried to find out how Zotero could be integrated into a semantic web environment. This brainstorming led us to start a new ontology development project: The Bibliographic Ontology.

References

Some things are already in place to start the collaborative development of the ontology:

Starting the development of this ontology

As a starting point for the development of this ontology, we will take the “Citation Oriented Bibliographic Vocabulary” developed by Bruce D’Arcus. It is a start, but as he pointed out in the brainstorming, there is much work to do on it to create a better citation and bibliographic ontology. Bruce also wrote an introductory mail about what he has in mind to make it a better ontology, what he thinks we should work on, etc. Keep in mind that Bruce has a strong background and much experience in the domain of citations and bibliographic references.

Goals

The development of this ontology should be driven by its goals. Bruce outlined some goals for this ontology, and more could be added depending on how people are expecting to use it.

  1. Should be a superset of legacy formats like BibTeX, RIS, and so forth
  2. Must support the most demanding needs in the social sciences, humanities, and law, and those who deal with non-Western languages
  3. The class system must be able to map to the type system in the citation style language I [Bruce] designed. In short, it is not enough to just encode the data: it needs to be able to be formatted according to the often archaic details of citation styles
  4. Should be developer-friendly; I consider examples like DOAP and SKOS to be models here
  5. Behind all of these goals is a more concrete goal: it should be perfect for use in OpenDocument/OpenOffice citation support and should handle Zotero’s needs.

In fact, for point 5, these systems will be the test cases for the development of this new ontology, just as Musicbrainz, Magnatune and other musical needs were the test cases for the development of the Music Ontology.

Users

Users can be many kinds of people or systems. Just to list a few of them:

  • OpenDocument/OpenOffice citation system
  • Zotero
  • Zitgist
  • Students or professors in a social science or law department
  • Book selling systems such as Amazon.com, Alibris.com or Abebooks.com
  • Publishers of books, journals, etc.
  • Authors

As you can see, many things [people or systems] are potential users of this ontology: from people without a computer background to heavy and complex systems such as Amazon.com, Zotero and OpenOffice.

Constraints

Users and goals define the development constraints of that ontology. However, we will try to take the same path that Yves Raimond and I took for the development of the Music Ontology: creating several levels of expressiveness for the ontology. The level used will depend on the user. Does the user only need to describe a simple bibliographic reference? Then he will use level one. Does the user need to describe a collaborative work aggregating many media sources, such as writings, speeches and conferences, in many languages and within a specific timeframe? Then he will use level three. This has been quite a successful approach in the Music Ontology, so we should try it in the Bibliographic Ontology too.

Reuse of existing ontologies

This ontology will probably reuse many existing ontologies. Some of them could be:

  • FRBR: as the foundation of the ontology
  • FOAF: as the way to describe authors
  • SIOC: as a way to describe everything related to the social software world: wiki pages, blog posts, mailing list threads, etc.
  • MO: as a way to describe everything related to musical things
  • DC: do I have to say why?
  • Event: as a way to describe some events like workshops, conferences, etc.
  • Timeline: as a way to describe complex temporal frameworks
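
As a rough illustration of what reusing existing vocabularies could look like, the following Python sketch describes one document with a Dublin Core title and creator, and describes the creator as a FOAF person. The example.org URIs, the title, the author name and the helper functions are all hypothetical; this is not the Bibliographic Ontology's actual design, only a sketch of mixing DC and FOAF terms in the same description.

```python
# Namespaces of the reused vocabularies.
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def uri(value: str) -> str:
    """Format a URI as an N-Triples term."""
    return f"<{value}>"


def literal(value: str) -> str:
    """Format a plain string as an N-Triples literal, escaping \\ and "."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# Hypothetical resources, invented for this example.
doc = "http://example.org/doc/1"
author = "http://example.org/person/1"

triples = [
    (uri(doc), uri(DCTERMS + "title"), literal("An Example Article")),
    (uri(doc), uri(DCTERMS + "creator"), uri(author)),
    (uri(author), uri(RDF_TYPE), uri(FOAF + "Person")),
    (uri(author), uri(FOAF + "name"), literal("Jane Doe")),
]


def to_ntriples(rows) -> str:
    """Serialize (subject, predicate, object) rows, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in rows)


print(to_ntriples(triples))
```

The point of the sketch is that the document and its author are described with terms that already exist, so a new ontology only has to add what is missing.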

Conclusion

If you are interested in this new ontology development project, I suggest that you subscribe to the mailing list, create a user account on the Wiki, and start contributing your ideas and expertise to the development of the Bibliographic Ontology. What is great about this project is that it is already motivated by external projects, such as its integration into the OpenDocument/OpenOffice citation support and its use by Zotero for its integration with Ping the Semantic Web and Zitgist.

When Zotero meets the Semantic Web

Yesterday I wrote a blog post about how Zotero could be integrated into the semantic web. Today more and more people seem interested in the idea and have started to dream about the possibilities. In fact, such an initiative could have a deeper impact than only integrating a tool into an environment. It could certainly have an impact on how people describe documents, citations and works. It could probably help people manage documents, create and manage document portfolios, automatically generate bibliographic references, etc. It could possibly even help people and scientists in their daily work. Am I dreaming? The world has been built by dreams.

This morning I started the discussion on Zotero’s web forum and it continued all day long.

The Biblio Ontology

First of all, the discussion was started by Bruce D’Arcus when he raised some possible issues related to document URIs and document descriptions.

Snippets of Bruce’s Biblio Ontology are currently used by Zotero to describe citation data in RDF. He points out that much work would have to be done on that ontology for it to handle more issues related to document description on the semantic web.

The Zotero team got interested in the project

Dan Cohen, one of the presidents of Zotero, is quite interested in the project. A few considerations have been raised, but all in all, the project seems feasible to everybody.

Next development phase

Now that I know that some people are interested in the project, I think it would be good to start planning the next development phases of this initiative.

I am thinking about proceeding the same way I proceeded for the Music Ontology and Musicbrainz. We should start by creating a community around the development of an ontology. If Bruce is willing, I would suggest taking his ontology as the foundation of the project. From there, we could start thinking about what to change in it and how to upgrade it to meet Zotero’s needs as well as the scientific community’s.

In parallel, I could work with the Zotero team to integrate it into the Zitgist/PingtheSemanticWeb environment.

Also in parallel, another group would develop the Virtuoso Sponger Metadata Cartridges (the equivalent of Zotero’s Translators) to enable Virtuoso server instances to process the same citation data as Zotero does.

Finally, we would work with Zotero to create another Translator that would dereference URIs to get citation data in RDF. This new Translator will be used, as I said in my previous post, to let Zotero be fed by Zitgist’s search results and browsing pages.

Conclusion

This is what is new with that idea. Now we should move on to consolidate the initial phase of the project: the creation of the community’s nucleus. Please send me an email or leave a comment on this blog post if you would be interested in participating in this emerging project.