Volkswagen UK’s Search Engine Powered by structWSF

It is now official: Volkswagen UK’s search engine is powered by structWSF. Their new contextual search engine was released last Friday. I covered the underlying architecture in one of my recent blog posts: Volkswagen’s RDF Data Management Workflow.

 

 

John Streit, head of technology at Tribal DDB, described the two key advantages of using structWSF, part of the Open Semantic Framework (OSF), for their website in an interview with Wired UK:

The first is that it gives you a single place to access data. Streit explains: “Applications often need to retrieve data from multiple sources which adds complexity and development time. By using this technology we can get everything we need from a single place which drastically lowers development time and running costs.” Furthermore the exposure of data improves search and means that it can be repurposed in new and imaginative ways.

The Open Semantic Framework Installer

We are excited to introduce the first Open Semantic Framework installation script. This new installer application will install and configure the entire Open Semantic Framework stack for you. It will take about 10 minutes of your time, and will then run in the background for a few hours while everything necessary to build the OSF stack is downloaded and compiled.

To run the OSF Installer, you only have to issue the few commands outlined below and answer a few questions along the way. Most of the questions come with standard default values, so this is pretty easy.

The OSF Installer is a major addition to the Open Semantic Framework since it now enables a greater number of people (mere mortals) to install and use the stack, and it enables much faster deployment of the system.

The full installation manual, where each of the steps performed by the installer is explained in detail, is available as a reference here.

Requirements

The current version of the Open Semantic Framework Installer is fully operational with the following setup:

  1. Ubuntu 10.04 (Lucid)
  2. A 32-bit operating system
  3. Internet access from the server
  4. 5 GB of disk space on the partition where you are installing OSF

Eventually this installer will be upgraded for 64-bit operating systems and for other Linux distributions. The current installer should also work on newer versions of Ubuntu, but to date it has only been tested on the latest LTS version.
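
The requirements listed above can be checked before starting the installation. The following is only a minimal sketch, not part of the OSF Installer: the function names, the /mnt default and the use of /etc/lsb-release are assumptions made for illustration, it presumes a Python 3 interpreter is available on the server, and it does not test Internet access.

```python
import platform
import shutil

REQUIRED_RELEASE = "10.04"          # Ubuntu 10.04 (Lucid)
REQUIRED_BITS = 32                  # 32-bit operating system
REQUIRED_FREE_BYTES = 5 * 1024**3   # 5 GB on the install partition


def check_requirements(release, bits, free_bytes):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if release != REQUIRED_RELEASE:
        problems.append("untested Ubuntu release: %s" % release)
    if bits != REQUIRED_BITS:
        problems.append("unsupported architecture: %d-bit" % bits)
    if free_bytes < REQUIRED_FREE_BYTES:
        problems.append("not enough free disk space")
    return problems


def detect(install_dir="/mnt"):
    """Best-effort detection of the values on the current machine."""
    release = ""
    try:
        # Ubuntu records its release number in this file.
        with open("/etc/lsb-release") as handle:
            for line in handle:
                if line.startswith("DISTRIB_RELEASE="):
                    release = line.split("=", 1)[1].strip()
    except OSError:
        pass
    bits = 64 if platform.machine().endswith("64") else 32
    free_bytes = shutil.disk_usage(install_dir).free
    return release, bits, free_bytes
```

Calling `check_requirements(*detect())` on the target server returns the list of unmet requirements, if any. A newer Ubuntu release is reported as untested rather than as a hard failure, since the installer may still work there.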

Installing the Open Semantic Framework

The only manual steps you need to perform to install the Open Semantic Framework are to:

  1. Create a folder on your server in which to install OSF
  2. Download the OSF Installer package (osf-installer-v1.0a4.zip in the commands below) and unzip it
  3. Make the osf-install.sh installation script executable
  4. Run the osf-install.sh installation script
  5. Answer the questions asked by the installer

Here are the commands you have to run:

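[cc lang='bash' line_numbers='true']

# Work from the folder where OSF will be installed
cd /mnt/
# Download and unpack the installer package
sudo wget https://github.com/downloads/structureddynamics/Open-Semantic-Framework-Installer/osf-installer-v1.0a4.zip
sudo unzip osf-installer-v1.0a4.zip
# Enter the unpacked folder (its name starts with "structureddynamics")
cd `ls -d structureddynamics*/`
# Make the installation script executable, then run it
sudo chmod 755 osf-install.sh
./osf-install.sh

[/cc]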

conStruct and structWSF Upgrades

In the process, both conStruct and structWSF have been enhanced to enable automatic upgrading in the future. Starting with structWSF version 1.0a92 and conStruct version 6.x-1.0-beta9, future upgrades should be handled by the automatic upgrade procedures.

However, to enable this, existing users will have to upgrade their current versions manually to establish the new automatic upgrades baseline.

Next Steps

Once you have installed the OSF stack, you can query the structWSF web service endpoints and import datasets using conStruct. Here are a few things you can do to start exploring the Open Semantic Framework:

  1. Start exploring structWSF
  2. Start exploring conStruct
  3. Start exploring Ontologies usage in OSF
  4. Start importing and manipulating datasets
  5. Start exploring the Open Semantic Framework architecture
  6. Start playing with the structWSF web service endpoints

Since everything is installed on your server, all you have to do now is play with the stack. If you break something, just ping us on the mailing list, or re-install it without worrying about each installation step!

Help

You may experience some issues with this new OSF Installer. If that is the case, I suggest you reach out on the Open Semantic Web Mailing List so that we can fix the problem in the Git repository.

Just write an email that includes the specifications of the server on which you are trying to install OSF. Then tell us where in the installation process the issue happens. Also add any logs that could be helpful in debugging the issue.

Conclusion

This is the first version of the OSF installer, but it is already a real balm for installing OSF. As noted, this installer will eventually be upgraded to support 64-bit servers and other Linux distributions. Any help from Bash wizards in improving this installer would be most welcome.

Open Source Projects as a Pool of Resources

In a previous blog post, I wrote about how Open Source may be unnatural, and even counterintuitive, to many people. However, that raises some obvious questions about my current company’s strategy.

Why have Mike Bergman and I chosen to develop no fewer than three major open source projects (structWSF, conStruct and the Semantic Components), encompassing more than 100,000 lines of new code and leveraging between 30 and 50 other open source software packages and libraries? Why have we open sourced all our software? Why has open source formed the core business strategy of Structured Dynamics in the last three years? How have we been able to profitably sustain the company, even in the midst of the global economic crisis that began in 2008?

I will try to answer these questions in this blog post, perhaps even providing some guidance for newer startups that may follow behind us.

Why Open Sourcing?

Why did Structured Dynamics choose to open source all of its software? There are multiple reasons why people and businesses choose to go open source. For some, it is because they think that it is where the marketplace is moving. For others, it is because they think that a community will emerge around their effort and contribute free resources that improve the software. Some think that their software will promptly be reviewed by professional programmers. Others may think that their system will become more secure. And so on.

For Structured Dynamics, the reason why we chose to go open source is somewhat different:

We perceived that by open sourcing our complete software stack we could bootstrap the company without any external investment.

Making a Living out of Open Source Projects

There are multiple ways to make a living from an open source project:

  • Doing consultancy work related to the project
  • Implementing the software(s) into clients’ computer environment(s)
  • Selling training classes
  • Selling support contracts
  • Selling maintenance contracts
  • Selling hosted instances of the software (the SaaS model for one)
  • Selling development time to improve some part(s) of the software
  • Creating conferences around the open source project
  • Selling proprietary extensions

I am probably missing a few, so please add them in the comment section below, and I will make sure to add them to this list.

Depending on the software you are developing, and depending on the business plan of your company, you may be doing one or several of these things to generate some money from your open source projects.

At Structured Dynamics we are doing some of them: we do get consultancy contracts related to the Open Semantic Framework and we do implement OSF in our clients’ computer environments.

But, more importantly, we are also doing development contracts related to the framework. In fact, each project we are working on is quite different. Our major projects involve companies that reside in totally different domains, have different needs and need to accommodate different kinds of users. However, most of the projects share the same core needs, and all of them advance the core technology in ways meaningful to our vision. We choose our customers (and, of course, vice versa) based on a true sense of partnership in which both parties have their objectives furthered.

Let’s see how we use these relationships to drive the development of the Open Semantic Framework.

Open Source Project as a Pool of Resources

In the last three years, Structured Dynamics has attracted multiple companies and organizations that share our vision, and which are willing to invest in the Open Semantic Framework open source project. (See Mike’s recent post on business development for a bit more on that aspect of things.) Each of these clients wanted to use the OSF framework for their own needs. However, each of them also wanted to do something special that was not yet implemented in the framework.

What we created in these three years is a pool of resources that we used to develop the framework such that it accommodates the needs of each of our clients. Each of our clients then becomes a participant in the shared pool of innovation. Our clients have been willing to invest in the open source framework because they need their own features and because they know that they will benefit from what other participants in the pool will invest down the road.

In that scenario, we are the managers of a pool of resources. We have the vision of where we want the framework to go, we know the roadmap of the project and we know the needs of each participant (our clients). We try to optimize the resources we get from each of our clients by developing the framework such that it can accommodate as broad a spectrum of participants as possible. Then, we seek new participants whose needs will help us continue to develop the next steps of the roadmap. In this manner, we Jacob’s Ladder our existing work to increase the capabilities for later clients, but earlier clients still benefit because they can upgrade to the later improvements. This is a self-sustaining model to continue to move the development of the framework forward.

By finding new clients, we give a return on investment to the other pool participants. Most of the new features that we develop for these new clients will benefit the other participants in the pool and will create new possibilities for them without any additional investment. All of our first clients have implemented what other participants later invested in the pool, thus crystallizing and augmenting their return on investment by using these new features.

Open Source is Not Just About Software

Open Source is not just about pieces of code, and this is quite important to understand. What we have open sourced with the Open Semantic Framework is much more than a body of source code. We open sourced the entire framework:

  1. The source code
  2. The documentation
  3. The processes
  4. The methodologies

We term this comprehensive approach our total open solution.

This distinction from other open source projects is an essential differentiator of our approach. We chose to open source all of the pieces related to the framework. What drove this decision is a simple sentence that sums up the philosophy behind it:

“We’re Successful When We’re Not Needed”

If the APIs, processes and methodologies were not properly documented, our clients would certainly need us, which would mean that we had failed to open source our solution. But since we are working to open source our code, our processes and our methodologies, we are on the way to successfully open sourcing the Open Semantic Framework, to the point where our clients will not need us.

This business approach is not as crazy as it sounds. We are free to work on new and important innovations, and are not basing our company culture on dependency and a constant drain on our customers. I know, it does not sound like Larry Ellison, but it sounds good to us and our clients. It is certainly not a maximum revenue objective built on the backs of individual clients.

Our life is more fun and our clients trust us with new stuff. Further, each step of the way, we are able to leverage our own framework for unbelievable productivity in what we deliver for the money. But that is a topic for another day.

We think Structured Dynamics’ business approach is a contemporary winning strategy. Our customers get good and advanced capabilities at low cost and risk, while we get to work on innovative extensions that are raising the semantic baseline for the marketplace. Who knows if we will always continue on this path, but for now it is leading to sustained development and market growth for open semantic frameworks, including our own OSF.

 

 

Volkswagen’s RDF Data Management Workflow

TribalDDB UK’s team just published a new case study to the W3C: Case Study: Contextual Search for Volkswagen and the Automotive Industry. They discuss the benefits of some of the semantic web technologies, techniques and concepts that they use to help them manage their data. They describe their approach and outline their design. The case study covers the technical aspects of their new Semantic Web Platform that I wrote about a few weeks ago.

In this blog post, I want to further explain their data management workflow, and how their data get exposed to different kinds of users.

Two Classes of Users

Let’s take a look at their data ingest/management/publishing workflow:

As you can see, all their data get collected, transformed and imported into structWSF. As I explained in my previous blog post, they are using structWSF to manage all their RDF data and to access all the functionalities through the different web service endpoints.

However, how the data get exposed to the users is not that clear. In fact, it depends on the class of user. A user can be many different things: a person, a piece of software, an organization, etc. However, there are two general classes of users:

  1. Public users, and
  2. Private users

Public users are users that have no direct relation with Volkswagen and that have no access to their internal network. Private users are generally internal departments or some internal software applications that have direct access to the structWSF instance.

Private Users

Private users generally have access to all structWSF web service endpoints. This means that all structWSF functionalities are accessible to them by querying the endpoints.

Two different kinds of private users are specified in the use case’s schema:

  1. Volkswagen Site Search
  2. Other / External Applications

The Volkswagen site search is a software application that uses the structWSF Search endpoint to search, filter and expose their data to their users (the people who perform searches on the Volkswagen UK website).

The other/external applications are software applications that have access to the structWSF instance. These are generally internal applications that run in the same network. One of these applications is an internal piece of software that exports all the RDF data from the structWSF SPARQL endpoint and imports it into Kasabi.

These are two examples of software applications that Volkswagen created around the structWSF web services to re-purpose, re-contextualize and re-publish their RDF data.
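
To make the site search case more concrete, here is a minimal sketch of how an internal application might build a request for the structWSF Search endpoint. The endpoint path (/ws/search/), the parameter names (query, datasets, items, page) and the semicolon separator are assumptions made for illustration and should be checked against the structWSF documentation of the installed version; the host name is hypothetical.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, datasets=(), items=10, page=0):
    """Build a GET URL for a search query against a structWSF instance.

    The path and parameter names are assumed, not taken from the post.
    """
    params = {"query": query, "items": items, "page": page}
    if datasets:
        # Assumed convention: several dataset URIs joined with ";".
        params["datasets"] = ";".join(datasets)
    # Sort the parameters so that the generated URL is deterministic.
    return base_url.rstrip("/") + "/ws/search/?" + urlencode(sorted(params.items()))


# Hypothetical internal host running the structWSF instance.
url = build_search_url("http://localhost", "golf tdi", items=5)
```

The resulting URL can then be fetched with any HTTP client, and the result set templated for display by the site search application.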

Public Users

There are currently two kinds of public users of this new Volkswagen Semantic Platform:

  • People, and
  • Software applications

Two interfaces have been made publicly available, one for each of these kinds of users:

  • A website search engine page for people, and
  • A SPARQL endpoint for software applications

When a person reaches the website’s search page, the search query gets sent to the structWSF Search web service endpoint. The result is then returned to the engine, templated and displayed to that person.

A SPARQL endpoint is accessible to the software applications. This endpoint is hosted by the Kasabi information marketplace. Volkswagen chose to export everything from their structWSF instance into Kasabi to outsource the maintenance of their public SPARQL endpoint.
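
As an illustration of how a software application could use such a public endpoint, the sketch below builds a request that follows the standard SPARQL Protocol (a GET request with a `query` parameter) and reads the standard SPARQL JSON results format. The endpoint URL is hypothetical, and a hosted service may require additional credentials (such as an API key) that are not shown here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint URL; replace it with the real public endpoint.
ENDPOINT = "http://example.org/sparql"


def build_sparql_request(endpoint, query):
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def rows(results_json):
    """Turn a SPARQL JSON result document into a list of value tuples."""
    data = json.loads(results_json)
    variables = data["head"]["vars"]
    return [
        tuple(binding[v]["value"] if v in binding else None for v in variables)
        for binding in data["results"]["bindings"]
    ]


request = build_sparql_request(ENDPOINT, "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5")
```

Passing `request` to `urllib.request.urlopen` and feeding the decoded response body to `rows` would yield one tuple per result row. Unbound variables are returned as `None`.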

Unlock the Power

As we saw in this blog post and in the W3C use case, all Volkswagen UK data is internally managed by structWSF; however, they are not locked into that system. They can easily communicate with external services to add new functionalities to their stack, or to make business decisions such as outsourcing the management of some publicly accessible data endpoints.

This is an important characteristic of their design:

By choosing semantic web technologies (such as structWSF), techniques and concepts (such as their Vehicles OWL Ontology and RDF), they are not locking themselves into a specific framework. They can easily communicate with external systems and applications. This means that they can quickly adapt their system to their constantly changing needs.

Conclusion

I wrote this blog post to further explain Volkswagen’s data management workflow. I wanted to make sure that people understand the role that structWSF plays in this use case, and the ecosystem it operates in.

Volkswagen’s Use of structWSF in their Semantic Web Platform

TribalDDB London, Volkswagen UK’s partner, mentioned earlier this week that Volkswagen are using some parts of the Open Semantic Framework to develop the next generation of their online platform.

This story was published by Jennifer Zaino in her article: Volkswagen: Das Auto Company is Das Semantic Web Company!

I can now talk about this project that uses some pieces of the framework that we have been developing for more than 3 years now.

The Objective

Volkswagen’s main objective behind the development of the next version of their Web platform started with improving their online search engine, but as William Greenly mentioned, it quickly became a strategic decision:

“So the objectives were about site search and improving it, but in the long-run it was always the idea to contextualize content, to facet content, to promote it in different contexts.”

The objective is to create a platform that gives them the flexibility to leverage all the data assets they own. This flexibility will help them not only to improve their search engine, but also to contextualize that data in different parts of their websites and their partners’ websites, and to promote and publish the same information on different communication channels or devices.

The Flexibility

What is a flexible platform in that context? A flexible platform is one that can integrate any kind of information source. In the context of Volkswagen, such information sources can be a series of relational dataset schemas spread around the world, Excel spreadsheets, CSV files, old plain text technical documents about past car models, semi-structured documents such as webpages, etc.

A flexible platform is also one that minimally impacts (if at all) the data consumers when the data structure changes in the system. This is really important because the world we live in constantly changes, and we have to reflect these changes in the data we own and maintain. Data structure changes will happen all the time, so we want to minimize their impact.

Having the flexibility to constantly adapt your data, while minimally impacting the data consumers of the system, enables you to make quick decisions to adapt your strategy in a highly competitive world. This flexibility gives you a clear business advantage.

A flexible platform is also one that lets you publish your data the way you want, in the format that is needed. Such a platform has to provide an interface that gives you access to all of its functionalities without having to care about what happens under the hood.

A flexible system is one that can communicate your information on any kind of communication channel, and to any device that has access to the Web.
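
The point about minimal impact on data consumers can be illustrated with a toy example. In the sketch below, data is stored as subject/predicate/object statements, in the spirit of RDF, and a consumer only asks for the properties it knows about. The identifiers (`vw:golf`, `vw:name`, `vw:fuelType`) are hypothetical and are not taken from the Volkswagen ontology.

```python
def values(triples, subject, predicate):
    """Return all the objects stated for a subject/predicate pair."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def car_name(triples, car):
    """A data consumer that only knows about the (hypothetical) vw:name property."""
    names = values(triples, car, "vw:name")
    return names[0] if names else None


triples = [
    ("vw:golf", "vw:name", "Golf"),
    ("vw:golf", "vw:doors", "5"),
]
name_before = car_name(triples, "vw:golf")

# The data structure evolves: a new property is added to the description.
triples.append(("vw:golf", "vw:fuelType", "diesel"))
name_after = car_name(triples, "vw:golf")
```

The consumer returns the same answer before and after the change, because adding a new kind of statement does not alter the statements it already relies on. A consumer written against a fixed table layout would more likely need to be updated.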

Under the Hood

The next generation platform that Volkswagen is currently developing is partly based on a few of the main pieces of the Open Semantic Framework. These pieces help them reach their goal by giving their platform the flexibility it needs.

The first step they went through was to create their Volkswagen Vehicles Ontology, which is used to describe all the entities they want to index into their platform. The Web Ontology Language (OWL), along with the Resource Description Framework (RDF), is what gives them complete flexibility in how they integrate all the pieces of information they want, in a canonical format.
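
To give an idea of what integrating a source into a canonical RDF format can look like, here is a minimal sketch that converts rows of a CSV file into N-Triples statements. The namespace URIs, the `Vehicle` class and the column names are hypothetical; the real Volkswagen Vehicles Ontology defines its own terms, and a real conversion would map each column to a specific ontology property and validate that the generated URIs are well formed.

```python
import csv
import io

# Hypothetical namespaces, used for illustration only.
VEHICLE_NS = "http://example.org/ontology/vehicles#"
BASE = "http://example.org/vehicles/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def rows_to_ntriples(csv_text):
    """Describe each CSV row as a Vehicle, one N-Triples line per value."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = "<%s%s>" % (BASE, row["id"])
        lines.append("%s <%s> <%sVehicle> ." % (subject, RDF_TYPE, VEHICLE_NS))
        for column, value in row.items():
            if column == "id" or not value:
                continue
            # Escape backslashes and double quotes inside the literal.
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append('%s <%s%s> "%s" .' % (subject, VEHICLE_NS, column, literal))
    return lines


statements = rows_to_ntriples("id,name,doors\ngolf,Golf,5\n")
```

Each row becomes a small set of statements about one entity, so sources with different columns can be merged without first agreeing on a shared table layout.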

Then they chose to use structWSF (the structured data web services framework). This piece gives them a series of web interfaces (web service endpoints) to create, update, manage and query their data. This web service layer enables them to do anything they want with their data, from anywhere on the Web. This is possible because all the functionalities of the framework are exposed as web service endpoints. structWSF also gives them the possibility to communicate their data in multiple different formats. This makes it the perfect flexible system to feed their information into different contexts, through different communication channels or onto different devices.

At Volkswagen, structWSF is used to populate, and keep in sync, their Solr and Triple Store instances. This frees up their time for the more important aspects of their platform, instead of having to care about how the data should be synced between the various specialized data management systems.
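
The idea of communicating the same data in multiple formats can be sketched as follows. This is not how structWSF implements its converters; it is only a toy function, with a hypothetical record layout, that renders the same records either as JSON or as CSV depending on the requested MIME type.

```python
import csv
import io
import json


def serialize(records, mime):
    """Render the same list of records in the requested format."""
    if mime == "application/json":
        return json.dumps(records, sort_keys=True)
    if mime == "text/csv":
        # Use the union of all keys as the column set, in sorted order.
        fields = sorted({key for record in records for key in record})
        out = io.StringIO()
        writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
        writer.writeheader()
        writer.writerows(records)
        return out.getvalue()
    raise ValueError("unsupported format: %s" % mime)


records = [{"name": "Golf", "doors": 5}]
as_json = serialize(records, "application/json")
as_csv = serialize(records, "text/csv")
```

A web service layer would typically pick the `mime` argument from the `Accept` header of the incoming request, so that each consumer gets the format that suits its context.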
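
The benefit of having a single layer keep several specialized stores in sync can be sketched with two in-memory stand-ins: a dictionary playing the role of the triple store, and an inverted index playing the role of Solr. The class and method names are hypothetical, and this is not the structWSF implementation; it only shows why callers no longer have to think about synchronization when every write goes through one entry point.

```python
class DataManager:
    """Single write path that updates both stores together."""

    def __init__(self):
        self.triple_store = {}   # record id -> full description
        self.search_index = {}   # lowercased token -> set of record ids

    def create(self, record_id, description):
        """Create or replace a record in both stores."""
        self.delete(record_id)
        self.triple_store[record_id] = dict(description)
        for value in description.values():
            for token in str(value).lower().split():
                self.search_index.setdefault(token, set()).add(record_id)

    def delete(self, record_id):
        """Remove a record from both stores."""
        self.triple_store.pop(record_id, None)
        for ids in self.search_index.values():
            ids.discard(record_id)

    def search(self, word):
        """Return the ids of the records that contain the word."""
        return sorted(self.search_index.get(word.lower(), set()))


manager = DataManager()
manager.create("golf", {"name": "Golf TDI"})
manager.create("golf", {"name": "Golf GTI"})  # an update replaces the old entry
```

After the update, a search for the old term no longer finds the record, while the description and the index both reflect the new value.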

By using structWSF to manage their data, they are able to reach some objectives to make their platform as flexible as possible:

  • To be able to minimize the impact of data changes to the data consumers
    • Because structWSF uses OWL & RDF to describe all the data it indexes
  • To be able to manipulate their data from anywhere
    • Because all the functionalities of structWSF are exposed as web service endpoints
  • To be able to communicate the information in different contexts, communication channels and devices
    • Because structWSF is, at its core, designed to transform all the data it indexes into any other kind of format

The Next Step

One of their longer-term goals is to analyze their unstructured and semi-structured textual documents to extract some structure out of them, and to index them into their semantic platform. To do this, they are looking at using Scones, which is the structWSF semantic tagger web service endpoint. Scones will use subject reference structures such as UMBEL to semantically tag the textual documents. Once a document has been processed by Scones and indexed in structWSF, it can be re-published in different contexts based on the reference concepts that have been tagged to it. This gives them the flexibility to leverage non-structured sources of data and to re-purpose them in different ways by publishing them in different contexts and in different systems.

This second system will enable them to leverage the investment they made in the past by writing all these textual documents, and to re-purpose and re-contextualize them in all kinds of different contexts.
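
The general idea of tagging a document with reference concepts can be illustrated with a very small dictionary-based tagger. This is not the algorithm used by Scones, and the concept labels and URIs below are hypothetical stand-ins for a reference structure such as UMBEL; the sketch only shows how matched concepts can become the handles used to re-publish a document in different contexts.

```python
import re

# Hypothetical reference concepts: preferred label -> concept URI.
REFERENCE_CONCEPTS = {
    "diesel engine": "http://example.org/concept/DieselEngine",
    "engine": "http://example.org/concept/Engine",
    "gearbox": "http://example.org/concept/Gearbox",
}


def tag(text, concepts=REFERENCE_CONCEPTS):
    """Return the sorted URIs of the concepts whose label occurs in the text."""
    found = set()
    lowered = text.lower()
    for label, uri in concepts.items():
        # Match whole words only, so "engine" does not match "engineer".
        if re.search(r"\b" + re.escape(label) + r"\b", lowered):
            found.add(uri)
    return sorted(found)


tags = tag("The diesel engine is paired with a six-speed gearbox.")
```

Once the concept URIs are attached to the document, it can be listed wherever one of those concepts is relevant, for example on a page about gearboxes or in a search facet about engines.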

Conclusion

I think that TribalDDB and Volkswagen made the right decision for their future. Deciding to develop and maintain a completely new kind of information system is not an easy business decision to make. I am not merely saying that they made a good choice in using our pieces of the stack; the decision goes far beyond this. Such a Semantic Platform challenges everything in an organization: the people who make the decisions, the people who create and manage the data, the people who develop the system, the people who maintain that system, the consumers of the system, the customers, the partners, etc. This is a big decision, whatever technology stack you plan to use. I congratulate them for the decision they made.

I strongly believe that this was the right decision to make considering the future opportunities they are creating for themselves.