Bibliographic Ontology, Semantic Web, Zitgist

The Bibliographic Ontology

Zitgist, Bruce D’Arcus, the Zotero team and Michael K. Bergman have started an initiative to develop a new ontology for citations and bibliographic references. The idea came up a couple of days ago when we were trying to figure out how Zotero could be integrated into a semantic web environment. That brainstorming led us to start a new ontology development project: The Bibliographic Ontology.


Some things are already in place to start the collaborative development of the ontology:

Starting the development of this ontology

As a starting point, we will take the “Citation Oriented Bibliographic Vocabulary” developed by Bruce D’Arcus. It is a start, but as he pointed out during the brainstorming, there is much work to do to turn it into a better citation and bibliographic ontology. Bruce also wrote an introductory mail about what he has in mind to improve it, what he thinks we should work on, etc. Keep in mind that Bruce has a strong background and much experience in the domain of citations and bibliographic references.


The development of this ontology should be driven by its goals. Bruce outlined some goals for this ontology, and more could be added depending on how people are expecting to use it.

  1. Should be a superset of legacy formats like BibTeX, RIS, and so forth
  2. Must support the most demanding needs in the social sciences, humanities, and law, and those who deal with non-Western languages
  3. The class system must be able to map to the type system in the citation style language I [Bruce] designed. In short, it is not enough to just encode the data: it needs to be able to be formatted according to the often archaic details of citation styles
  4. Should be developer-friendly; I consider examples like DOAP and SKOS to be models here
  5. Behind all of these goals is a more concrete goal: it should be perfect for use in OpenDocument/OpenOffice citation support and should handle Zotero’s needs.

In fact, the systems named in point 5 will be the test cases for the development of this new ontology. They play the same role that Musicbrainz, Magnatune and other musical needs played as test cases for the development of the Music Ontology.
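To make goal 1 a bit more concrete, here is a minimal sketch in Python of what "superset of BibTeX" implies: every flat BibTeX field needs some property to land on. The mapping table is my own assumption for illustration, not something the project has decided. It uses Dublin Core terms (`dcterms:title`, `dcterms:creator`, `dcterms:date`, `dcterms:publisher`); the entry and the `http://example.org/` URI are made-up sample data.

```python
# Illustrative only: this mapping is an assumption, not the ontology's design.
DCTERMS = "http://purl.org/dc/terms/"

BIBTEX_TO_PROPERTY = {
    "title": DCTERMS + "title",
    "author": DCTERMS + "creator",
    "year": DCTERMS + "date",
    "publisher": DCTERMS + "publisher",
}


def bibtex_to_triples(subject, entry):
    """Turn a flat BibTeX-like dict into (subject, property, value) triples.

    Fields without a mapping are returned separately instead of being
    dropped: a true superset would need a property for each of them.
    """
    triples = []
    unmapped = {}
    for field, value in entry.items():
        prop = BIBTEX_TO_PROPERTY.get(field.lower())
        if prop is None:
            unmapped[field] = value
        else:
            triples.append((subject, prop, value))
    return triples, unmapped


# Made-up sample entry (hypothetical data).
entry = {
    "title": "An Example Book",
    "author": "Jane Doe",
    "year": "2007",
    "howpublished": "online",
}
triples, unmapped = bibtex_to_triples("http://example.org/book/1", entry)

for triple in triples:
    print(triple)
print("not yet covered:", unmapped)
```

Whatever ends up in the "not yet covered" bucket is exactly the kind of gap the ontology would have to close before it could claim to be a superset of the legacy format.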


Users can be many kinds of people or systems. Just to list a few of them:

  • OpenDocument/OpenOffice citation system
  • Zotero
  • Zitgist
  • Students or professors in a social science or law department
  • Book selling systems such as, or
  • Publishers of books, journals, etc.
  • Authors

As you can see, many people and systems are potential users of this ontology: from people without a computer background to large and complex systems such as Zotero and OpenOffice.


Users and goals define the development constraints of this ontology. However, we will try to take the same path that Yves Raimond and I took for the development of the Music Ontology: creating several levels of expressiveness for the ontology. The level used will depend on the user. Does the user only need to describe a simple bibliographic reference? Then he will use level one. Does the user need to describe a collaborative work that aggregates many kinds of sources, such as writings, speeches, and conferences, in many languages and within a particular timeframe? Then he will use level three. This approach has been quite successful in the Music Ontology, so we should try it in the Bibliographic Ontology too.

Reuse of existing ontologies

This ontology will probably reuse many existing ontologies. Some of them could be:

  • FRBR: as the foundation of the ontology
  • FOAF: as the way to describe authors
  • SIOC: as a way to describe everything related to the social software world: wiki pages, blog posts, mailing list threads, etc.
  • MO: as a way to describe everything related to musical things
  • DC: do I have to say why?
  • Event: as a way to describe some events like workshops, conferences, etc.
  • Timeline: as a way to describe complex temporal frameworks
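As a rough sketch of what such reuse could look like, the Python snippet below describes a book with Dublin Core terms and its author with FOAF, then prints the result as N-Triples. The author is a resource of its own rather than a plain string, which is the main difference from a flat record. The `http://example.org/` namespace, the function names and the sample data are placeholders I made up for the illustration; only the DC terms, FOAF and RDF namespace URIs are real.

```python
# Real vocabulary namespaces.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCTERMS = "http://purl.org/dc/terms/"
FOAF = "http://xmlns.com/foaf/0.1/"
# Placeholder namespace for the sample resources (an assumption).
EX = "http://example.org/"


def uri(value):
    """Format a URI as an N-Triples term."""
    return "<" + value + ">"


def literal(value):
    """Format a plain string as an N-Triples literal with basic escaping."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def describe_book(book_id, title, author_id, author_name):
    """Describe a book (DC terms) whose creator is a foaf:Person resource."""
    book = uri(EX + "book/" + book_id)
    author = uri(EX + "person/" + author_id)
    return [
        (book, uri(DCTERMS + "title"), literal(title)),
        (book, uri(DCTERMS + "creator"), author),
        (author, uri(RDF_TYPE), uri(FOAF + "Person")),
        (author, uri(FOAF + "name"), literal(author_name)),
    ]


def to_ntriples(triples):
    """Serialize triples as one N-Triples statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Made-up sample data (hypothetical).
triples = describe_book("1", "An Example Book", "jane-doe", "Jane Doe")
print(to_ntriples(triples))
```

Because the author has its own URI, other vocabularies from the list above (SIOC posts, events, and so on) could point at the same person without duplicating the name.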


If you are interested in this new ontology development project, I suggest you subscribe to the mailing list, create a user account on the Wiki, and start contributing your ideas and expertise to the development of the Bibliographic Ontology. What is great about this project is that it is already motivated by external projects, such as its integration into the OpenDocument/OpenOffice citation support and its use by Zotero for integration with Ping the Semantic Web and Zitgist.

11 thoughts on “The Bibliographic Ontology”

  1. There is JeromeDL – a Social Semantic Digital Library developed at Digital Enterprise Research Institute (DERI) in Galway, Ireland.

    It allows extracting metadata in RDF for bibliographic resources using the MarcOnt and JeromeDL ontologies. Besides the features of a digital library, JeromeDL gives users an opportunity to collaborate with others: annotate, evaluate resources, set trust levels for others, bookmark online resources…

    I’m just curious if you have heard of it 🙂

  2. Hi Jarek,

    This is the first time I have heard of all that; this is part of the brainstorming process 🙂

    However, I would like to have more information about the MarcOnt ontology. In fact, I can’t find any documentation about it, example snippets of how it could be used, etc. Is something available?

    Also, where is the JeromeDL ontology they talk about in the MarcOnt web pages?




  3. I’m not familiar with JeromeDL.

    On MarcOnt, one of the non-goals I suggested was worrying about library standards too much. I was thinking of MARC in particular, which is a quite old flat data model not well-suited to contemporary user needs.

    I think to the degree that we want to borrow from the library world here, it would be FRBR, which has a nice RDF ontology (done by Ian Davis). Nevertheless, I’d want that to be quite subtle and preferably behind-the-scenes, as FRBR is complex!

  4. Hi All,

    Speaking about bibliographic ontologies – we are currently working on one – the MarcOnt Ontology (Jarek mentioned it). The MarcOnt project aims to deliver tools for collaborative ontology development and mapping between ontologies. Our proof of concept will be a bibliographic ontology which will be used in JeromeDL (some old version is used there at the moment 🙂 ).

    The reason why we take MARC21 into account is that it is the most popular format used in “real” libraries at the moment, and we have to make it possible to port resources from these libraries to JeromeDL without losing any (or losing only some) data. I agree that it means some complexity. But this is the point where the community should come in. That’s why we would like to release a relatively simple ontology (simple means capturing bibtex and DC :)) which will be the subject of further development by the community.

    The only thing is that we still have some coding to do with respect to MarcOnt Portal (the tool for ontology development I’ve mentioned) :). Anyway, it would be nice to stay in touch as a lot of your goals overlap with ours 🙂

  5. The problem for me, though, is that both MARC and bibtex (and even DC depending on how you use it) are flat data models. They might be fine for library data (which is sourced from MARC of course) or for some scientific users, but they are really not very semweb-ish ways to model data.

    Practically speaking, they don’t work well for my data needs for example, which are the same data needs as lots of scholarly users in the humanities, social sciences, and law.

    I guess we should keep in mind different requirements and users: the ontology I was working on was not designed for libraries. It was designed for scholars–from a wide range of fields–who are the primary users for both Zotero and the OpenOffice bibliographic project.

  6. Maciej,

    “The only thing is that we still have some coding to do with respect to MarcOnt Portal (the tool for ontology development I’ve mentioned) :). Anyway, it would be nice to stay in touch as a lot of your goals overlap with ours :)”

    Yes, we will certainly stay in touch. I suggest you subscribe to the project’s mailing list to be part of the brainstorming process. The exact essence of the ontology is yet to be defined, but Bruce has a good idea of where to go with it.

    In fact, the development process of this ontology may be different from others. It is the same one I used for the development of the Music Ontology. This project will be bound to systems that are already in place, such as Zotero, OpenOffice and Zitgist. As Bruce said, Zotero and OpenOffice didn’t use the ontologies that are already available for some reason (he knows, since he is involved in their development), and for Zitgist there were problems too. So this ontology will reuse classes and properties from as many ontologies as possible, but the core should be different to meet different needs.



  7. Another possible intersection that might be worth bearing in mind is Open Archives Initiative Protocol for Metadata Harvesting. Much of the data it exposes is in DC, but I think OAI-PMH also exposes information to link up resources with the digital repositories in which they sit.

    On the subject of repositories, there’s also the close alliance between the DSPACE repository and the SIMILE project that might be worth peeking at, along with the various ontologies SIMILE has created–their JSTOR, OCLC, and ARTSTOR ontologies might be good places to look for reuse.

    For the ontology to be able to capture the information about and from digital repositories holding a resource would be outstanding, and provide a rich connection between the bibliographies and the online sources.

    Thanks! This is exciting!

  8. Hi Fred,

    yes, it would really be nice to have a community-backed ontology for describing publications which is a bit more Semantic-Webby than Dublin Core. So developing a best practice for mixing DC, FOAF, SIOC and the event ontology would be really useful.

    Once you guys have developed this best practice, we are happy to change the D2R mapping of our DBLP server and the RDF book mashup, so that they export RDF according to your best practice.


