Semantic Web, Bibliographic Ontology

The Bibliographic Ontology: a first proposition

This Document is about the creation of The Bibliographic Ontology. It is the first proposition from Bruce D’Arcus and me that should lead to the writing of the first draft of the ontology. Some things have been developed, many questions have been raised, and the discussion that will arise from this first proposition will set the basis for the first draft of the ontology.

The goal of this ontology is simple: creating a bibliographic ontology that will set the basis to describes a document: so describing a writing that provides information. If well done, it will enable other people or organizations to create extension modules that will enable it to be expressive enough to describe more specialized sub-domains such as law documents, etc. It also re-use existing ontologies that already define some properties of documents.

Related materials

1. The proposed OWL/N3 file describing The Bibliographic Ontology (note: read the comment, FG are from me, and BD are from Bruce)
2. An enhanced version of the Zotero RDF dump of the book “Spinning the Semantic Web”, that shows the expressiveness power of the ontology by extending its content using the bibo:Part class and the locators properties (RDF/XML)
3. Other examples that shows other possible descriptions such as the description of events, places, etc.(RDF/N3)

Main concept of the ontology: a Document

The main concept of the ontology is bibo:Document. This class is described as “Writing that provides information” (from Wordnet). So, basically, any writing is a Document. It is equivalent to a foaf:Document and a dcterms:BibliographicResource. These two links are quite important since it will enable us to re-use these two widely used ontologies: FOAF and DCTERMS.

Second main concept: Contributions to these Documents

The second main concept of the ontology is bibo:Contribution. This class is described as “A part played by a person in bringing about a resulting Document”. The goal of this concept is to relate people, by their contributions, to documents they wrote, or helped to write. For now, contributions are defined by three properties:

  1. bibo:role, that defines the role of the contributor: author, translator, publisher, distributor, etc.
  2. bibo:contributor, that links a contribution to its contributor
  3. bibo:position, that loselessly associates a “contribution” level for each contributors. This property is mainly used to sort multiple authors that worked on the writing of a document. More about that in the examples document.

With these two concepts, you can describe any Document and any Contribution to any document. So you can relate any piece of writing to its contributors.

What is really interesting with the concept (in my opinion) is that it opens the door the much more. In fact, by using this concept, we can now extend the idea and describe many more things about how people contributed to the writing of a document.

From these two concepts, we extended the idea to be able to cope with a larger range of use-cases.

Extensions of bibo:Document

The document class has been specialized in a series of more specialized type of documents, with restrictions of their own:

  • Article
  • LegalCase
  • Manuscript
  • Book
  • Manual
  • Legistlation
  • Patent
  • Report
  • Thesis
  • Transcript
  • Note
  • Law

Classes or individuals?

The development of this proposition has been made with Lee W. Lacy’s OWL book quote in mind:

Individuals often mirror “real world” objects. When you stop having different property attributes (and just have different values) you have often identified an object (individual)

This mean that if a subclass of a class didn’t have specific restrictions, or if no properties were restricted by using this class in their domain, then the class was dropped and an individuals of the super-class.

One example of this is the type bibo_types:dissertation. It is an individual of the class bibo:Thesis, but since it doesn’t have anything different other than its meaning, then we created an individual of the class bibo:Thesis. Check the examples document to see what it means concretely.

Collections of documents

Another main concept of the ontology is bibo:Collection. This concept has an aggregation inherent property. Its only purpose is to aggregate bibo:Document(s). An entity of this class will have a role of hubs into the RDF graph (network) created out of bibliographic relations (properties).

Other types of collections, with some restrictions of their own, have also been created. These other collections, such as bibo:CourtReporter are intended to be anchor points that can be extended by Bibliographic Ontology Extension Modules of particular specialized sub-domains such as Law documents.

There is the current list of specialized collections:

  • InternetSite
  • Series
  • Periodical
    • Journal
    • Magazine
    • CourtReporter

Part of Documents

Another important concept is bibo:Part. This concept, along with locators (more about them in the next section), enables us to specify the components of Document. In fact, sometimes documents are aggregated to create collections, such as journals, magazines or court reporters. However, sometimes, documents are embedded within a document (embedded versus aggregated). This is the utility of bibo:Part; a bibo:Part is a document, but in fact, it’s a part of a document. The special property of a bibo:Part is dcterms:hasPart. So, a bibo:Part has use this property to relate it to the document it is part of. Check the examples document to know how bibo:Part can be used.

Locating Parts

To support the concept of Parts, a set of properties, called “locators” have been created. These locator properties will help to describe the relation between a Part and its related Document.

Three of these locators are bibo:volume, bibo:chapter and bibo:page. So, these properties will locate Parts inside documents. For example: a chapter within a book, or a volumne within a document that is a set of volumes.

Check the example about the document “The Art of Computer Programming” by Donald Knuth for a good example of how locators can be used.

This said, we could now think to describe a document by its parts, recursively from its volumes to its pages.

Open questions

  1. Should we develop the ontology such that we can describe the entire workflow that lead to the creation and publication (possibly) of a document? All this workflow would be supported by the FRBR principles. At the moment, all the ontology describes the manifestation of a work, and not the work itself or its expression. Take a look at The Music Ontology (its workflow) to see how it could be done for the bibliographic ontology.
  2. If the creation of classes and individuals of classes the good way to describe type of documents?
  3. Is it the good way, or is there other ways, to describe contributions of people to the elaboration of documents?

Re-used ontologies

  • DCTERMS: re-used to describe main properties of document.
  • FOAF: re-used to describe people and organizations.
  • EVENT: re-used to describe events (example: conferences)
  • TIME: re-used to describe temporal properties
  • wgs84_pos: re-used to describe geographical entities

Conclusion

Please give any feedbacks, suggestions or comments directly on the mailing list of the group that develop this ontology. This group is intended to create an ontology that would create some type of consensus between people and organization working with bibliographical data.

Note: I disabled comment on this post only, to make sure that people comment on the mailing list.