With this screencast, you will see how you can update all the pieces that compose the Open Semantic Framework (OSF) stack. You will discover all the tools that are available to you to update the different programs. You will be able to update the following applications with the latest release or development code:
I am really proud to announce the release of the Open Semantic Framework version 3.0. This is a major milestone for the OSF platform and it includes important new features and improvements.
The updated platform has just emerged from more than a year and a half of full-time development sponsored by one of Structured Dynamics’ clients: Healthdirect Australia. OSF’s development has been highly influenced by the big enterprise requirements of the HDA sponsor, resulting in two portals that are fully operated by OSF: healthinsite and Pregnancy, Birth and Baby. OSF 3.0 is already in production with these two portals, and it will continue to evolve in the coming months and years.
The OSF release is major in a number of ways. The first thing you will notice is that we re-branded the entire project, which includes all of its moving parts, around the OSF name. The OSF for Drupal (previously known as conStruct) was migrated to Drupal 7 and about 80% of its code was re-written. Seven new OSF Web Services (previously known as structWSF) were created. The old IP-based security layer was completely replaced by a new key-based security layer. A new revisioning system has been put in place to revision every record as it changes. A new caching layer has been added to the OSF Web Services to improve their performance and decrease the load on the other pieces of the OSF stack (about 80% of the non-search queries will hit the cache). A set of command line tools has been developed to help system administrators manage and automate tasks on OSF instances. A set of system integration tests, composed of 746 tests and 4139 assertions, tests all of the functionality of the system to make sure it is properly deployed on a server. The OSF Wiki has been completely rewritten and re-organized to help users and developers find answers to their questions.
The OSF Web Services have changed drastically since version 1.1. Most of the code was re-written, a new structure has been put in place, and new features and new web service endpoints were created. In this section, we will cover what changed in the OSF Web Services and what these new features are.
New Security Layer
Initially, we created a simple and effective security layer for the OSF Web Services. It was based on the IP of the requester, nothing more, nothing less. That was five years ago. This simple security layer was quite effective, but it was a nightmare to manage.
For OSF 3.0, we ditched this old security layer and replaced it with something secure and much easier to manage.
To validate a web service call, the new security system uses secret-key authentication. Every HTTP query that is sent to any web service endpoint needs to comply with the security protocol. If it doesn’t, the request will be refused.
Then, if the call is authenticated, the web service endpoint will make sure that the requesting user has proper access to the datasets that are being queried. This second step, an authorization check, makes sure that users can only access the data to which they have access rights.
The real improvement of the new security layer is how the users are managed. In the past, we were managing individual IP addresses. Now, we are managing groups of users. All dataset access permissions to records are related to a group. Each group is composed of one or multiple users. Then, when a web service endpoint checks if a requesting user has access to the content of a certain dataset, it checks if the requesting user belongs to a group that has access to the content of that dataset.
It is now much easier to manage groups of users at the level of the dataset than individual IP addresses.
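To make the group-based check concrete, here is a minimal sketch in Python (used for illustration only, this is not the OSF code). The group names, user URIs and dataset URIs are all hypothetical, and the real security layer also authenticates the request with its keys before reaching this step.

```python
# Hypothetical data: which users belong to which group, and which groups
# may access which dataset. None of these names come from OSF itself.
GROUP_MEMBERS = {
    "editors": {"http://localhost/users/foo", "http://localhost/users/bar"},
    "public": {"http://localhost/users/guest"},
}
DATASET_GROUPS = {
    "http://localhost/datasets/events/": {"editors", "public"},
    "http://localhost/datasets/drafts/": {"editors"},
}


def user_can_access(user, dataset):
    """Return True if the user belongs to at least one group allowed on the dataset."""
    allowed_groups = DATASET_GROUPS.get(dataset, set())
    return any(user in GROUP_MEMBERS.get(group, set()) for group in allowed_groups)


# A guest can read the events dataset but not the drafts dataset.
guest_on_events = user_can_access("http://localhost/users/guest", "http://localhost/datasets/events/")
guest_on_drafts = user_can_access("http://localhost/users/guest", "http://localhost/datasets/drafts/")
```

Granting a new user access to several datasets then only means adding that user to a group, instead of touching every dataset permission.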
New Revisioning System
A new records revisioning system is now available in OSF. If required, every change to a record can be revisioned. This means that if someone makes an error when editing a record, all changes can be rolled back at any time using the new revisioning system.
A new set of web service endpoints has been created to manage the revisions. You can list, read, update, delete, and compare revisions with these new endpoints.
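The idea behind these operations can be sketched with a small in-memory store. This is only an illustration in Python under my own assumptions: the class, its method names and the record shape are hypothetical and do not mirror the actual endpoint parameters.

```python
import copy


class RevisionStore:
    """Keeps every saved version of a record, oldest first (illustrative only)."""

    def __init__(self):
        self._revisions = {}  # record URI -> list of snapshots

    def save(self, uri, record):
        """Store a new revision and return its index."""
        self._revisions.setdefault(uri, []).append(copy.deepcopy(record))
        return len(self._revisions[uri]) - 1

    def list_revisions(self, uri):
        """Return the indexes of all the revisions of a record."""
        return list(range(len(self._revisions.get(uri, []))))

    def read(self, uri, revision):
        """Return a copy of one revision of a record."""
        return copy.deepcopy(self._revisions[uri][revision])

    def compare(self, uri, old, new):
        """Return {attribute: (old value, new value)} for the attributes that differ."""
        a, b = self._revisions[uri][old], self._revisions[uri][new]
        return {k: (a.get(k), b.get(k)) for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}

    def rollback(self, uri, revision):
        """Roll back by saving an old revision again as the newest one."""
        return self.save(uri, self.read(uri, revision))


store = RevisionStore()
uri = "http://localhost/datasets/topics/cancer"  # hypothetical record URI
store.save(uri, {"label": "Cancer"})
store.save(uri, {"label": "Cancr"})               # an editing error
changes = store.compare(uri, 0, 1)
restored = store.rollback(uri, 0)
```

Rolling back by appending a copy of the old revision, rather than deleting the bad one, keeps the full history available for later comparison.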
New Web Service Endpoints
A series of new web service endpoints have been created:
All of the web services that create, update or read data from OSF now have multi-lingual capabilities. If you are creating data, the only thing you have to do is to specify the language for each literal you are defining in the RDF documents you are indexing in OSF. If you are reading or searching data, you only have to specify the language you want to use for each web service query you are creating.
New Caching Layer
OSF is a stack that includes a multitude of underlying systems such as Virtuoso, Solr, OWLAPI, GATE, etc. Depending on the web service endpoints that are used, and depending on how they are used, the same query can be requested again and again, and each of these background services may be queried again and again too.
To improve the performance of each of the OSF Web Services, and to minimize the usage of these underlying systems as much as possible, we added a caching layer at the level of the web service endpoints. The result is that every OSF Web Services query is cached in the caching layer. This means that when the same query is requested a second time, the results will come from the caching layer.
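As a rough illustration of caching at the level of the endpoint, the following Python sketch keys the cache on the endpoint name plus the query parameters. The class, the endpoint name and the backend function are hypothetical, and cache invalidation (which a real caching layer needs when records change) is deliberately left out.

```python
import json


class CachedEndpoint:
    """Serves repeated identical queries from memory instead of the backend."""

    def __init__(self, backend):
        self._backend = backend   # function that really runs the query
        self._cache = {}
        self.backend_calls = 0    # counts how often the backend was hit

    def query(self, endpoint, params):
        # Sorting the keys makes the cache key independent of parameter order.
        key = (endpoint, json.dumps(params, sort_keys=True))
        if key not in self._cache:
            self.backend_calls += 1
            self._cache[key] = self._backend(endpoint, params)
        return self._cache[key]


def slow_backend(endpoint, params):
    """Stand-in for the underlying systems (triple store, Solr, etc.)."""
    return {"endpoint": endpoint, "uri": params["uri"]}


ws = CachedEndpoint(slow_backend)
first = ws.query("crud/read", {"uri": "http://localhost/records/1", "lang": "en"})
second = ws.query("crud/read", {"lang": "en", "uri": "http://localhost/records/1"})
```

Here the second call never reaches the backend, even though its parameters were given in a different order.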
The Search web service endpoint, which is by far the most used OSF web service endpoint, also improved quite a lot in developing this new version.
First, the Search endpoint is now using the eDisMax query parser. In itself, this changes everything in the endpoint and leads to the creation of multiple new search functionalities.
It is now possible to change the ranking of the search results by boosting the scoring of the results based on different things such as their dataset provenance, their types or any of their attribute/values. This makes it possible to improve the quality of the results returned on an OSF web portal depending on the context of a search and the semantics of the records being searched.
It is also now possible to add restrictions to the search queries. This means that search keywords will be restricted to a set of attributes. Then it is also possible to boost the scoring of the returned results depending on where the search keywords appeared.
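To give an idea of what restrictions and boosts look like at the Solr level, here is a small Python sketch that builds eDisMax request parameters. `defType`, `q`, `qf` and `bq` are standard Solr parameters, but the field names and boost values are hypothetical, and the OSF Search endpoint exposes these features through its own parameters rather than these raw Solr ones.

```python
from urllib.parse import urlencode


def build_edismax_params(keywords, field_boosts, boost_queries):
    """Build Solr eDisMax request parameters as a list of (name, value) pairs."""
    params = [
        ("defType", "edismax"),  # select the eDisMax query parser
        ("q", keywords),
        # qf restricts matching to these fields and weights each of them
        ("qf", " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())),
    ]
    # each bq clause raises the score of the documents that match it
    params += [("bq", clause) for clause in boost_queries]
    return params


params = build_edismax_params(
    "cancer",
    {"prefLabel": 3, "description": 1},  # hypothetical field names and weights
    ["type:Event^2"],                    # hypothetical boost on a record type
)
query_string = urlencode(params)
```

With these values, a keyword found in the label counts three times as much as one found in the description, and records typed as events are pushed up in the ranking.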
There is a new spell-checker function for the search queries. This means that if no results are returned for a specific search query, then the system will return a series of possible keywords that the user may want to use to re-initiate the search query.
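The behavior described above can be sketched as follows. This Python example swaps in the standard library’s `difflib.get_close_matches` as a stand-in for the real spell-checker, and the vocabulary, the records and the shape of the returned structure are all hypothetical.

```python
from difflib import get_close_matches

INDEXED_TERMS = ["cancer", "cardiology", "pregnancy"]  # hypothetical vocabulary


def search(keyword, records):
    """Return matching records, or keyword suggestions when nothing matches."""
    results = [r for r in records if keyword.lower() in r.lower()]
    if results:
        return {"results": results, "suggestions": []}
    # No results: propose close keywords the user may want to try instead.
    suggestions = get_close_matches(keyword.lower(), INDEXED_TERMS, n=3, cutoff=0.6)
    return {"results": [], "suggestions": suggestions}


records = ["Cancer support groups", "Pregnancy checklist"]
misspelled = search("cancr", records)
correct = search("cancer", records)
```

Suggestions are only computed when the result list is empty, which matches the rule that the spell-checker kicks in when a query returns nothing.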
Finally, an extended search query syntax is now supported by the Search endpoint. This enables more complex search queries to be sent to the Search endpoint, opening the door to the creation of more complex contextual search profile queries.
New Interfacing Mechanism
A new interface mechanism has been put in place for the OSF Web Services. An interface is the code that is run by the web service endpoint for a given query.
An interface corresponds to a specific version of a web service endpoint. Two different interfaces, for the same endpoint, may comply with different versions of its API. However, these two interfaces can work side-by-side using the same data.
If two interfaces comply with the same endpoint API, it means that their processing of the query will be different (like querying Solr 4.0 instead of 3.6). If two interfaces don’t comply with the same endpoint API version, then it means that each interface supports a different version of the endpoint.
This new interfacing mechanism becomes handy to support more than one triple store, or when the same OSF instance needs to use different Solr query parsers, or when some of the endpoints have to be backward compatible for some portals/users that still need to be supported by the OSF instance, etc.
The new interfacing mechanism gives the flexibility to run different code or support different web service API versions on the same OSF instance.
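One way to picture this mechanism is a registry that maps an endpoint and an interface name to the code to run. The Python sketch below is only an illustration: the interface names, the version numbers and the handler bodies are hypothetical and are not how the OSF Web Services are actually implemented.

```python
INTERFACES = {}  # (endpoint, interface name) -> (API version, handler)


def register(endpoint, name, api_version):
    """Register a handler as one interface of an endpoint."""
    def decorator(handler):
        INTERFACES[(endpoint, name)] = (api_version, handler)
        return handler
    return decorator


@register("search", "default", "3.0")
def search_default(params):
    return {"parser": "edismax", "query": params["query"]}


@register("search", "legacy", "1.1")
def search_legacy(params):
    return {"parser": "standard", "query": params["query"]}


def run(endpoint, params, interface="default"):
    """Run the code of the requested interface for the given query."""
    api_version, handler = INTERFACES[(endpoint, interface)]
    result = handler(params)
    result["api_version"] = api_version
    return result


current = run("search", {"query": "cancer"})
backward_compatible = run("search", {"query": "cancer"}, interface="legacy")
```

Both interfaces live on the same instance and answer the same query, so an older portal can keep requesting the legacy one while new portals use the default.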
OSF for Drupal
OSF for Drupal now runs on Drupal 7. About 80% of the Drupal-related code got rewritten and we can now state that OSF is fully integrated into Drupal.
A series of OSF connectors have been developed in the last year and a half that basically let Drupal’s core features use OSF instead of MySQL: Entity & Entity API, FieldAPI & FieldStorage and the SearchAPI. These connectors mean that if OSF for Drupal is installed and configured on a Drupal 7 instance, developers will be able to use these core APIs to query registered decentralized OSF instances instead of local MySQL/Solr instances.
The OSF Entities connector module implements the Drupal Entity API. This means that if OSF for Drupal is properly installed and configured on a Drupal instance, the Entity API can be used to read, create, update and delete content from registered external OSF Web Services networks. Under this scenario, no information about these Drupal entities will be local to the Drupal instance. All of the content will be hosted externally on a dedicated OSF instance. All of the data manipulated by the Entity API is RDF data. What that means is that the Entity API can now interface with an RDF data management system, communicating with it via web service endpoint queries.
In short, this connector makes OSF records visible to Drupal via the Entity API.
The OSF FieldStorage connector module creates a new FieldStorage type that enables Drupal users to save Drupal content into an OSF instance instead of saving the content in the default storage system (namely MySQL). This means that if someone starts using OSF as the backend of a Drupal portal, then all the Drupal content that will be created will be available via the OSF web service endpoints. Other external applications that know how to talk to OSF web service endpoints are then able to leverage the content that has been created from the Drupal instance. Also, all of the content will be available as RDF.
In the end, what this connector does is save Drupal entities into OSF instead of the default storage system (MySQL).
The OSF SearchAPI connector module creates a new service for the SearchAPI module. It enables the SearchAPI to send search queries to an OSF Search web service endpoint instead of the default search service. This means that the Drupal search engine is now fully powered by the OSF Search endpoint, and gives access to all the datasets hosted on one, or multiple, remote OSF instances.
Better Configuration & Management
Registering, configuring and managing OSF instances and datasets in Drupal has never been easier. The new OSF Configure module centralizes all of the features and options that are required to register, configure and maintain OSF instances and datasets.
QueryBuilder & Search Profiles
A new kind of tool has been developed in OSF for Drupal 3.x: OSF Search Profiles. A search profile is a predefined search query whose search results are displayed in a block positioned on some Drupal pages. These search profiles are normally used to display lists of information that match a search query. Search profiles are also to some extent aware of their context. For example, if the main topic of a page is cancer and we have a search profile that displays a list of events, then when the search profile is used in the context of that page, cancer-related events should be displayed. That is one of the core purposes of the search profiles.
The search profiles’ underlying search queries are being created using the new OSF Query Builder module. This powerful user interface enables site administrators to create complex search queries that will be used within a search profile.
OSF Web Services PHP API
In prior versions, knowing how to query the OSF web services was not an easy task. This is why the OSF Web Services PHP API was developed: to help developers easily query OSF web service endpoints. This PHP API is a set of classes, each with a series of methods that can be used to query a particular web service endpoint. Let’s take this example of some OSF WS PHP API code that sends a query to the OSF Search web service endpoint:
[cc lang='php' line_numbers='false']
// Step #1: Instantiate the class of the web service you want to query
// Create the SearchQuery object
$search = new SearchQuery('http://localhost/ws/', 'some-app-id', 'some-api-key', 'http://localhost/users/foo');
// Step #2: Define all the parameters/features/behaviors of the web service by invoking different methods of the class
// Print the PHP array serialization for that resultset
[/cc]
OSF Management Tools
A new set of command line tools has been developed for OSF version 3.0. The focus of these tools is to help OSF instance administrators by giving them command line tools that they can use in their scripts, Cron jobs, or any other middleware tooling that may perform different tasks on an OSF instance.
Datasets Management Tool
The Datasets Management Tool (DMT) is a command line tool used to manage datasets of an OSF instance. With this tool, you may create, delete, update, import and export datasets directly from the command line.
Ontologies Management Tool
The Ontologies Management Tool (OMT) is a command line tool used to manage ontologies of an OSF Web Services network instance. It can be used to list the ontologies of an OSF Web Services instance, to manage those ontologies, to create/import new ones, to delete existing ones, and to generate underlying ontological structures.
The Data Validator Tool (DVT) is a command line tool used to perform a series of post-indexation data validation tests. This tool runs a series of pre-configured tests, and returns validation errors if any are found.
All the OSF Widgets (formerly the Semantic Components) have been updated to work with OSF 3.0. The big difference with this update is that all of the OSF Widgets now have access to an OSF for Drupal proxy. This proxy enables them to communicate with an OSF Web Services instance without having to authenticate themselves to the endpoints.
The OSF Wiki has been completely rewritten and re-organized. It is the go-to place to find more information about the Open Semantic Framework project, and all pieces of the stack.
Installing and Configuring OSF
Installing and configuring OSF has never been easier to do. The OSF Installer utility has been improved to ease the deployment of OSF on a new Ubuntu 12.10 server. The installation tool will install and configure all the pieces required by the OSF stack. Once everything is installed and configured, it will run the OSF Tests Suites to make sure that all the OSF functionalities are fully operational on the new server.
Once the OSF stack is installed, the user can use the OSF Installer tool to install, deploy and configure Drupal 7 with OSF for Drupal.
Additionally, we created a new public Amazon AWS EC2 image that includes the full OSF stack version 3.0. This new public image is available in all the zones:
A complete suite of integration tests has been created for OSF 3.0. The test suites are composed of 746 tests and 4139 assertions. These integration tests make sure that all of the functionality of an OSF instance is working. These tests are run every time an OSF instance is deployed using the OSF Installer script. Then, they can be re-run at any time thereafter. Normally, every time an update is made on an OSF instance, the tests should be run as well to make sure that the update didn’t break anything.
These tests are testing:
All of the input parameters of each endpoint
All of the combinations of all the input parameters of each endpoint
All of the mime types supported by each endpoint
All of the expected errors returned by each endpoint.
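Covering all the combinations of input parameters usually means enumerating a Cartesian product of the values each parameter can take. The Python sketch below shows that idea with `itertools.product`. The parameter names and values are hypothetical, and this is not how the OSF test suites are actually written.

```python
from itertools import product


def parameter_combinations(parameter_values):
    """Yield one dict per combination of the possible values of each parameter."""
    names = sorted(parameter_values)
    for values in product(*(parameter_values[name] for name in names)):
        yield dict(zip(names, values))


# Hypothetical input parameters of an endpoint, with the values to test.
combinations = list(parameter_combinations({
    "lang": ["en", "fr"],
    "mime": ["application/json", "text/xml"],
    "include_aggregates": [True, False],
}))
```

Each generated dict can then be sent to the endpoint as one test case. The number of cases grows multiplicatively with each new parameter, which helps explain how a suite reaches thousands of assertions.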
We have been working on this new Open Semantic Framework version 3.0 for almost two full years now. We have been quiet during that time since we had no time for anything other than coding, documenting, testing and deploying the code that we are releasing today.
This new version is a major leap forward for the Open Semantic Framework open source project. Five years ago, Mike and I set as a goal to have a complete OSF stack in place that could be leveraged by anybody to fulfill the requirements of any kind of project. I think that with this OSF version 3.0, we have reached the medium-term goal that we set for ourselves five years ago.
I am proud to announce the new NOW (Neighbourhoods Of Winnipeg) semantic web portal! This new and innovative semantic web portal was publicly announced by the Mayor of Winnipeg City last week.
The NOW (Neighbourhoods of Winnipeg) portal is “a new Web portal (the “Portal”) produced by the City of Winnipeg to provide broad, dynamic and interactive access to local and neighbourhood information. Designed for easy access and use by all citizens, businesses, community organizations and Governments, the information on the site includes municipal data, census and demographic information, economic development information, historical data, much spatial and mapping information, and facilities for including and sharing data by external groups and constituencies.”
This project has been the springboard that led to the Open Semantic Framework version 1.1. Multiple pieces of the framework have been developed in relation to this project, and more particularly pieces like the sWebMap semantic component and several improvements to the structWSF web services endpoints and conStruct modules for Drupal 6.
Development of the Portal
The development plan of this portal is composed of four major areas:
1. Development of the data structure of the municipal domain by creating a series of ontologies
2. Conversion of existing data assets using this new data structure
3. Creation of the web portal by creating its design and by developing all the display templates
4. Creation of new tools to let users interact with the data available on the portal
Structured Dynamics has been involved in #1, #2 and #4 by providing design and development resources, technology transfer sessions and material and supporting internal teams to create, maintain and deploy their 57 publicly available datasets.
The Data Structure
This technology stack does not have any meaning without the proper data and data structures (ontologies) in place. This gold mine of information is what drives the functionality of the portal.
The portal is driven by 12 ontologies: 2 internal and 10 external. The content of the 57 publicly available datasets is defined by the classes and properties defined in one of these ontologies.
The two internal ontologies have been created jointly by Structured Dynamics and the City of Winnipeg, but they are extended and maintained by the city only.
These ontologies are maintained using two different kinds of tools:
Protege is used for the big development tasks, such as creating a large number of classes and properties or doing a major reorganization of the class structure.
structOntology is used for quick ontological changes that have an immediate impact on the behavior of the portal, such as label changes or SCO ontology property assignments that change the behavior of some of the tools that exist in the portal.
structOntology can also be used by portal users to understand the underlying data structure used to define the data available on the portal. All users have access to the reading mode of the tool, which lets them browse, search and export the ontologies loaded on the portal.
Except for rare exceptions such as the historical photos, no new data has been created by the City of Winnipeg to populate this NOW portal. Most of its content comes from existing internal sources of data such as:
Conventional relational databases
GIS (Geographic Information System) on-top of relational databases
All of the conventional relational databases and legacy data from the GIS systems have been converted into RDF using the FME Workbench ETL system. All of the FME Workbench templates map the relational data into RDF using the ontologies loaded into the portal. All of the geolocated records that exist in the portal come from this ETL process and have been converted using FME.
Some smaller datasets come from internal spreadsheets that got modified to comply with the commON spreadsheet format that is used to convert spreadsheet (CSV/TSV) data files into RDF.
All of the dataset creation and maintenance is managed internally by the City of Winnipeg using one of these two data conversion and importation processes.
Here are some internal statistics of the content that is currently accessible on the NOW portal.
These are statistics related to different functionalities of the portal.
Number of neighbourhoods: 236
Number of community areas: 14
Number of wards: 15
Number of neighbourhood clusters: 23
Number of major site sections: 7
Total number of site pages: 428,019
Static pages: 2,245
Record-oriented pages: 425,874
Dynamic (search-based) pages: infinite
Number of documents: 1,017
Number of images: 2,683
Number of search facets: 1,392
Number of display templates: 54
Number of links: 1,067
External links: 784
Internal links: 283
These statistics show what is available via the portal: the types of things, their properties, and the quantity of data that is searchable, manipulable and exportable from the portal.
Number of datasets: 57
Number of records: 425,874
Number of geolocational records: 418,869
Point of interest (POI) records: 193,272
Polygon records: 218,602
Path (route) records: 6,995
Number of classes (types): 84
Number of properties: 1,308
Number of triple assertions: 8,683,103
An important aspect of this portal is that all of the content is contextually available, in different formats, to all of the users of the portal. Whether you are browsing content within datasets, searching for specific pieces of content, or looking at a specific record page, you always have the possibility to get your hands on the content that is being displayed to you, the user, with a choice of five different data formats:
All content pages can be exported in one of the formats outlined above. In the bottom right corner of these pages you will see an Export button that you can click to get the content of that page in one of these formats.
Export Search Content
Every time you do a search on the portal, you can export the results of that search in one of the formats outlined above. You can do that by selecting the Export tab, and by selecting one of the formats you want to use for exporting the data.
Users also have the possibility to export census data, from the census section of the portal, in spreadsheets. They only have to select the Tables tab, and then to click the Export Spreadsheet button.
The export functionality would not be complete without the ability to consult and export the ontologies that are used to describe the content exposed by the portal. These ontologies can be read from the ontologies reader user interface, or can be exported from the portal to be read by external ontologies management tools such as Protege.
The portal is using Drupal 6 as its CMS (Content Management System). The Drupal 6 instance communicates with structWSF using the conStruct module, which acts as a bridge between a Drupal portal and a structWSF web service network.
Here are the main design phases that have been required to create the portal:
Creation of the portal’s design, and the Drupal 6 theme that implements it
Creation of the Search and Browse results templates
Creation of the individual records’ page design and templates based on their type
Creation of the sWebMap search results templates.
The portal’s design has been created internally by the City of Winnipeg and by Tactica based on the Citizen DAN demo. Tactica also worked on another Citizen DAN-like portal called MyPeg.ca.
Since the NOW portal team wanted to re-use as much as possible to lower the development cost of the portal, they chose to use the complete OSF stack, which includes these Semantic Components.
The new NOW semantic web portal’s main asset is its data: how it can be searched (with traditional search engines or using a semantic component to search, browse, filter and localize results), displayed and exported. This portal has been developed using a completely free and open source semantic platform that has been developed from previous projects that open sourced their code.
I consider this portal a pioneer in the way municipal organizations will provide new online services to their citizens and to commercial enterprises, based on the quality of the data that will be exposed via such Web portals.
The sWebMap is a rich mapping tool that can easily be integrated on any webpage, and that can be extensively customized. The sWebMap does support these features:
Full text search for searching and displaying results on a map
Extensive filtering capabilities
Filtering by dataset source
Filtering by type
Filtering by attribute/value
Filtering of records that belong to a specific geographic region
Display of records on the map using:
Different markers depending on the type of record to display (determined by the ontologies)
Polygon shapes for records that refer to a geographic region
Polyline shapes for records that refer to a geographically-located path
Templating of records in a resultset depending on their type
Templating of records’ preview, displayed in an overlay window, depending on their type
Persist records on the map across searches and filtering operations
Supports map sessions
Save map sessions
Load saved map sessions
Delete saved map sessions
Share saved map sessions
Supports a multiple-maps mode
Three focus maps are available under the main map
Each map focuses on a particular region of the main map
Users can switch between focus maps to see different records in different regions
Depending on the options you specified when you created the sWebMap control, each time you move (option), zoom (option) or change the filtering criteria, a query is sent to the Search endpoint. The sWebMap control then requests a JSON-formatted resultset and displays the results to the user.
This means that to implement the sWebMap component on your website, you will need to have:
You can immediately download the entire source code from this GitHub repository:
Additionally, you can initialize the sWebMap component with one of the multiple options available.
Ontologies are to the Open Semantic Framework what humans were to the Mechanical Turk. The hidden human in the Mechanical Turk was orchestrating each and every chess move. However, to observers, the automated chess machine looked like just that: a new kind of intelligent machine. That was in 1770.
Ontologies play exactly the same role for the Open Semantic Framework (OSF): they orchestrate each and every move of all the pieces within OSF. They are what instructs structWSF, the Semantic Components, conStruct, and all other derivative pieces of user interfaces how to behave.
In this (lengthy) blog post, I will present the main ontologies that have an impact on different parts of OSF. We will see how different ontology classes and properties, and how the description of the records indexed in the system, can impact the behaviors of OSF.
In addition to this post, Mike has also published a blog post today that overviews the overall OSF ontology modularization and architecture.