3.5 Million DBpedia Entities in Drupal 7

In the previous article Loading DBpedia into the Open Semantic Framework, we explained how we could load the 3.5 million DBpedia entities into a Open Semantic Framework instance. In this article, we will show how these million of entities can be used in Drupal for searching, browsing, mapping and templating these DBpedia entities.

Installing and Configuring OSF for Drupal

This article doesn’t cover how OSF for Drupal can be installed and configured. If you want to properly install and configure OSF for Drupal, you should install it using the OSF Installer by running this command:

  ./osf-installer --install-osf-drupal

Then you should configure it using the first section of the OSF for Drupal user manual.

Once this is done, the only thing you will have to do is to register the OSF instance that hosts the DBpedia dataset. Then to register the DBpedia data into the Drupal instance. The only thing you will have to do is to make sure that the Drupal’s administator role has access to the DBpedia dataset. It can be done by using the PMT (Permissions Management Tool) by running the following command:

  pmt --create-access --access-dataset="http://dbpedia.org" --access-group="http://YOU-DRUPAL-DOMAIN/role/3/administrator" --access-perm-create="true" --access-perm-read="true" --access-perm-delete="true" --access-perm-update="true" --access-all-ws

Searching Entities using the Search API

All the DBpedia entities are searchable via the SearchAPI. This is possible because of the OSF SearchAPI connector module that interface the SearchAPI with OSF.

Here is an example of such a SearchAPI search query. Each of these result come from the OSF Search endpoint. Each of the result is templated using the generic search result template, or other entity type search templates.

What is interesting is that depending on the type of the entity to display in the results, its display can be different. So instead of having a endless list of results with titles and descriptions, we can have different displays depending on the type of the record, and the information we have about that record.

dbpedia_search_3

In this example, only the generic search template got used to display these results. Here is the generic search results template code:

Manipulating Entities using the Entity API

The Entity API is a powerful Drupal API that let developers and designers loading and manipulating entities that are indexed in the data store (in this case, OSF). The full Entity API is operational on the DBpedia entities because of the OSF Entities connector module.

As you can see in the template above (and in the other templates to follow), we can easily use the Entity API to load DBpedia entities. In these templates examples, what we are doing is to use this API to load the entities referenced by an entity. In this case, we do this to get their labels. Once we loaded the entity, we end-up with an Entity object that we can use like any other Drupal entities:

Mapping Entities using the sWebMap OSF Widget

Because a big number of DBpedia entities does have geolocation data, we wanted to test the sWebMap OSF Widget to be able to search, browse and locate all the geolocalized entities. What we did is to create a new Content Type. Then we created a new template for that content type that implements the sWebMap widget. The simple template we created for this purpose is available here:

Then, once we load a page of that Content Type, we can see the sWebMap widget populated with the geolocalized DBpedia entities. In the example below, we see the top 20 records in that region (USA):

dbpeida_swebmap_2

Then what we do is to filter these entities by type and attribute/values. In the following example, we filtered by RadioStation, and then we are selecting a filter to define the type of radio station we are looking for:

dbpeida_swebmap_3

Finally we add even more filtering options to drill-down the geolocalized information we are looking for.

dbpeida_swebmap_4

We end-up with all the classical radio station that broadcast in the region of Pittsburgh.

dbpeida_swebmap_5

Templating Entities using Drupal’s Templating Engine

Another thing we get out of the box with Drupal and OSF for Drupal, is the possibility to template the entities view pages and the search resultsets. In any case, the selection of the template is done depending on the type of the entity to display.

With OSF for Drupal, we created a template selection mechanism that uses the ontologies’ structure to select the proper templates. For example, if we have a Broadcaster template, then it could be used to template information about a RadioStation or a TelevisionStation, even if these templates are not existing.

Here is an example of a search resultset that displays information about different type of entities:

dbpedia_search_2

The first entity is an organization that has an image. It uses the generic template. The second one is a person which also use the generic template, but it has no image. Both are using the generic template because none of the Organization nor the Person templates have been created. However, the third result uses a different template. The third result is a RadioStation. However, it uses the Broadcaster template since the RadioStation class is a sub-class-of Broadcaster and because the Broadcaster template exists in the Drupal instance.

Here is the code of the Broadcaster search result template:

Now let’s take a look at the template that displays information about a specific Entity type:

dbpedia_entity_view

This minimal records displays some information about this radio station. The code of this template is:

Building Complex Search Queries using the OSF Query Builder

A system administrator can also use the OSF Query Builder to create more complex search queries. In the following query, we are doing a search for the keyword “radio“, we are filtering by type RadioStation, and we are boosting the scoring value of all the results that have the word “life” in their slogan.

dbpeida_querybuilder_1

The top result is a radio station of Moscow that has “Life in Motion!” as its slogan. We can also see the impact of the scoring booster on the score of that result.

Conclusion

As we can see with these two articles, it is relatively easy and fast to import the DBpedia dataset into a OSF instance. By doing so, we end-up with a series of tools to access, manage and publish this information. Then we can leverage the OSF platform to create all kind of web portals or other web services. All the tools are there, out-of-the-box.

This being said, this is not where lies the challenge. The thing is that there is more than 500 classes and 2000 properties that describes all the content present in the DBpedia Ontology. This means that more than 2000 filters may exists for the Search API, the sWebMap widget, etc. This also means that more than 500 Drupal bundles can be created with hundred of fields, etc.

All this need to be properly configured and managed by the Drupal site developer. However, there are mechanisms that have been developed to help them managing this amount of information such as the entity template selection mechanism that uses the ontologies’ structure to select the display templates to use. For example, you could focus on the entity Broadcaster, and create a single template for it. Automatically, this template could be used by sub-classes such as BroadcastNetwork, RadioStation, TelevisionStation and many others.

The Open Semantic Framework is really flexible and powerful as you may have noticed with this series of two articles. However, the challenge and most of the work lies into creating and configuring the portal that will use this information. The work lies into creating the search and entities templates. To properly define and manage the bundles and fields, etc.

0 Responses to “3.5 Million DBpedia Entities in Drupal 7”


  1. No Comments

Leave a Reply






This blog is a regularly updated collection of my thoughts, tips, tricks and ideas about data mining, data integration, data publishing, the semantic Web, my researches and other related software development.


RSS Twitter LinkedIN


Follow

Get every new post on this blog delivered to your Inbox.

Join 71 other followers:

Or subscribe to the RSS feed by clicking on the counter:




RSS Twitter LinkedIN