<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Frederick Giasson's Weblog &#187; conStruct</title>
	<atom:link href="http://fgiasson.com/blog/index.php/category/structured-dynamics/construct/feed/" rel="self" type="application/rss+xml" />
	<link>http://fgiasson.com/blog</link>
	<description></description>
	<lastBuildDate>Tue, 06 Jul 2010 01:14:18 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Global structWSF Statistics Report</title>
		<link>http://fgiasson.com/blog/index.php/2010/04/09/global-structwsf-statistics-report/</link>
		<comments>http://fgiasson.com/blog/index.php/2010/04/09/global-structwsf-statistics-report/#comments</comments>
		<pubDate>Fri, 09 Apr 2010 15:53:18 +0000</pubDate>
		<dc:creator>Fred</dc:creator>
				<category><![CDATA[Structured Dynamics]]></category>
		<category><![CDATA[conStruct]]></category>
		<category><![CDATA[structWSF]]></category>

		<guid isPermaLink="false">http://fgiasson.com/blog/?p=1048</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Global structWSF Statistics Report&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2010-04-09&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2010/04/09/global-structwsf-statistics-report/&amp;rft.language=English"></span>
Today we released a simple structWSF nodes statistics report. It aggregates different statistics from all known (and accessible) structWSF nodes on the Web. It is still at an early stage, but the statistics aggregated so far are quite interesting.
This global statistics report has two aims:

Monitoring      the evolution of the usage of [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Global structWSF Statistics Report&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2010-04-09&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2010/04/09/global-structwsf-statistics-report/&amp;rft.language=English"></span>
<p><img class="alignright size-full wp-image-941" title="triple_120" src="http://fgiasson.com/blog/wp-content/uploads/2009/06/triple_120.png" alt="triple_120" width="120" height="120" />Today we released a simple structWSF nodes statistics report. It aggregates different statistics from all known (and accessible) structWSF nodes on the Web. It is still at an early stage, but the statistics aggregated so far are quite interesting.</p>
<p>This global statistics report has two aims:</p>
<ol>
<li>Monitoring the evolution of structWSF usage, and</li>
<li>Monitoring the overall performance of structWSF web services in different setups and for different usages</li>
</ol>
<p><a href="http://openstructs.org/structwsf/stats/">The report is accessible here at all times</a>. The report is updated hourly.</p>
<h3>Overall Statistics</h3>
<p>The main statistics of the report are:</p>
<ul>
<li>The number of structWSF nodes participating in the report</li>
<li>The total number of HTTP queries processed by the structWSF nodes</li>
<li>The total number of datasets created on the nodes</li>
<li>The total number of records indexed, and</li>
<li>The total number of triples indexed</li>
</ul>
<p>These statistics give a general overview of the size of the “global structWSF network of nodes”.</p>
<h3>Web Service Statistics</h3>
<p>Each Web service endpoint has its own statistics, which are:</p>
<ul>
<li>The number of queries processed by the web service</li>
<li>The average time it took to process a query (excluding the network latency between the requester and the web service endpoint server)</li>
<li>All the requested MIME types, and the number of times each MIME type has been requested, and</li>
<li>All the HTTP response codes returned by the endpoint</li>
</ul>
<p>These statistics, specific to each Web service, help give a general understanding of each web service endpoint.</p>
<p>The average time per query tells a developer what kind of performance to expect when using this web service endpoint.</p>
<p>The list of requested MIME types gives an overall picture of how the web service endpoint is used: are users mostly requesting XML data, JSON data, RDF+XML data, etc.? Such usage statistics are helpful for prioritizing future development tasks.</p>
<p>The list of all HTTP response codes helps reveal possible issues with a web service endpoint. If error codes are returned often, this could pinpoint a possible bug in the web service endpoint, an issue with its usage that could lead to a fix in the documentation, etc.</p>
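<p>As a rough illustration of the per-endpoint statistics described above, here is a minimal Python sketch. It is not the code of statisticsBroker.php: the <code>EndpointStats</code> class, its field names and the sample values are hypothetical, and it only shows how query counts, average processing time, requested MIME types and HTTP response codes could be tallied.</p>

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class EndpointStats:
    """Running statistics for one web service endpoint (hypothetical structure)."""

    queries: int = 0
    total_time: float = 0.0
    mime_types: Counter = field(default_factory=Counter)
    status_codes: Counter = field(default_factory=Counter)

    def record(self, seconds, mime_type, status):
        """Register one processed query."""
        self.queries += 1
        self.total_time += seconds
        self.mime_types[mime_type] += 1
        self.status_codes[status] += 1

    @property
    def average_time(self):
        """Average processing time per query, in seconds."""
        return self.total_time / self.queries if self.queries else 0.0

    def error_rate(self):
        """Share of queries answered with a 4xx or 5xx response code."""
        errors = sum(n for code, n in self.status_codes.items() if code >= 400)
        return errors / self.queries if self.queries else 0.0


# Made-up sample values, for illustration only.
stats = EndpointStats()
stats.record(0.25, "text/xml", 200)
stats.record(0.75, "application/json", 200)
stats.record(0.50, "text/xml", 400)
```

<p>A sudden rise in <code>error_rate()</code> for one endpoint is the kind of signal that could point to a bug or to a documentation problem.</p>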
<h3>Participating in the Global structWSF Statistics Report</h3>
<p>If you are operating a structWSF instance and want to participate in the Global structWSF Statistics Report, you first have to download the new <a href="http://code.google.com/p/structwsf/source/browse/branches/dev/statisticsBroker.php">statisticsBroker.php script</a> and install it on your structWSF node.</p>
<p>The statistics broker script is what calculates the statistics of a structWSF node, and what is used to aggregate statistics from all nodes to generate the consolidated report.</p>
<p>The first thing to do is to edit the file and change the value of the $enableStatisticsBroadcast variable from FALSE to TRUE on line 46. This will enable the script.</p>
<p>Normally you should install the script in the root folder of your structWSF node, but you can install it anywhere on your server, as long as it is accessible on the Web.</p>
<p><a href="http://openstructs.org/structwsf/stats/subscribe/">The final step is to register your node with the reporting system</a>. It is just a matter of registering the URL where the statisticsBroker.php script is accessible. It should be added to the global report within 24 hours, once I have validated it.</p>
<h3>Other Usage of the Statistics Broker</h3>
<p>It is nice to participate in such a global statistics report, but much more can be done with such a statistics broker.</p>
<p>A structWSF developer or a structWSF node maintainer could use it to get statistics about the local node. As described above, such statistics can be used to pinpoint possible performance issues, bottlenecks and possible bugs in web service endpoints. They could also be used to plan future extensions of the network in order to scale some highly used web service endpoints.</p>
<p>Additionally, the statistics broker could be used in a broader server maintenance architecture. For example, it could be used in conjunction with another script to be part of a <a href="http://ganglia.sourceforge.net/">Ganglia</a> monitoring system. Ganglia could then monitor performance, the rate of requests per hour, or a rise in the number of particular HTTP response codes returned by some web services. Each of these statistics could also be bound to alert notifications that would warn the structWSF system maintainers and developers of possible issues with the network.</p>
<h3>Next Step</h3>
<p>The next step with the statistics broker will be to create a structWSF web service out of it. That way, structWSF node maintainers will easily be able to define access and usage permissions for such statistics.</p>
]]></content:encoded>
			<wfw:commentRss>http://fgiasson.com/blog/index.php/2010/04/09/global-structwsf-statistics-report/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>structWSF Web Services Tutorial</title>
		<link>http://fgiasson.com/blog/index.php/2010/02/18/structwsf-web-services-tutorial/</link>
		<comments>http://fgiasson.com/blog/index.php/2010/02/18/structwsf-web-services-tutorial/#comments</comments>
		<pubDate>Thu, 18 Feb 2010 21:45:40 +0000</pubDate>
		<dc:creator>Fred</dc:creator>
				<category><![CDATA[Structured Dynamics]]></category>
		<category><![CDATA[conStruct]]></category>
		<category><![CDATA[irON]]></category>
		<category><![CDATA[structWSF]]></category>

		<guid isPermaLink="false">http://fgiasson.com/blog/?p=1044</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=structWSF Web Services Tutorial&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=irON&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2010-02-18&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2010/02/18/structwsf-web-services-tutorial/&amp;rft.language=English"></span>
One thing that was hard to do with structWSF was explaining what structWSF is, and how users can interact with it. For most people, structWSF was abstracted behind conStruct and they didn’t know that every single functionality of conStruct was bound to one or multiple queries to one or multiple structWSF instances.
It is the reason [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=structWSF Web Services Tutorial&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=irON&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2010-02-18&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2010/02/18/structwsf-web-services-tutorial/&amp;rft.language=English"></span>
<p>One thing that was hard to do with <a href="http://openstructs.org/structwsf/">structWSF</a> was explaining what structWSF is, and how users can interact with it. For most people, structWSF was abstracted behind <a href="http://constructscs.com/">conStruct</a> and they didn’t know that every single functionality of conStruct was bound to one or multiple queries to one or multiple structWSF instances.</p>
<p>This is the reason why we took the time to write a complete structWSF interaction tutorial. This tutorial explains what the general structWSF architecture is, and it describes a series of general interaction use cases. We hope that this tutorial will help developers and system implementers understand the capabilities of structWSF and how they can use it.</p>
<p><a href="http://openstructs.org/structwsf/web-services-tutorial">You can read the complete structWSF Web Services Tutorial here.</a></p>
<p>Additionally, we released new versions of <a href="http://openstructs.org/blog/2010/2/fgiasson/structwsf-10a5-released">structWSF</a>, <a href="http://constructscs.com/blog/fgiasson/2010/2/construct-6x-1x-dev-5-released">conStruct</a> and the <a href="http://openstructs.org/blog/2010/2/fgiasson/irjson-parser-10a2-released">irJSON Parser</a>, which are products of this tutorial.</p>
]]></content:encoded>
			<wfw:commentRss>http://fgiasson.com/blog/index.php/2010/02/18/structwsf-web-services-tutorial/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>New versions of structWSF and conStruct</title>
		<link>http://fgiasson.com/blog/index.php/2010/01/20/new-versions-of-structwsf-and-construct/</link>
		<comments>http://fgiasson.com/blog/index.php/2010/01/20/new-versions-of-structwsf-and-construct/#comments</comments>
		<pubDate>Wed, 20 Jan 2010 22:42:34 +0000</pubDate>
		<dc:creator>Fred</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Structured Dynamics]]></category>
		<category><![CDATA[conStruct]]></category>
		<category><![CDATA[structWSF]]></category>

		<guid isPermaLink="false">http://fgiasson.com/blog/?p=1024</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=New versions of structWSF and conStruct&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Semantic Web&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2010-01-20&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2010/01/20/new-versions-of-structwsf-and-construct/&amp;rft.language=English"></span>

We just released a new (major) version of both structWSF and conStruct. Though some months have passed since we last released this software, we finally got the time and opportunity to make these important upgrades. Many things have changed in both packages. I don’t want to enumerate all the changes in this blog post, so I [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=New versions of structWSF and conStruct&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Semantic Web&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2010-01-20&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2010/01/20/new-versions-of-structwsf-and-construct/&amp;rft.language=English"></span>
<p><img class="size-full wp-image-941 alignright" title="triple_120" src="http://fgiasson.com/blog/wp-content/uploads/2009/06/triple_120.png" alt="triple_120" width="120" height="120" /><img class="alignright size-full wp-image-942" title="construct_logo_120" src="http://fgiasson.com/blog/wp-content/uploads/2009/06/construct_logo_120.png" alt="construct_logo_120" width="120" height="120" /></p>
<p>We just released a new (major) version of both structWSF and conStruct. Though some months have passed since we last released this software, we finally got the time and opportunity to make these important upgrades. Many things have changed in both packages. I don’t want to enumerate all the changes in this blog post, so I suggest you read the change logs here:</p>
<ul>
<li><a href="http://community.openstructs.org/content/structwsf-10a4">structWSF change log</a></li>
<li><a href="http://community.openstructs.org/content/construct-6x-1x-dev-4">conStruct change log</a></li>
</ul>
<p>These new versions have been greatly shaped by the needs of our clients. We also started to introduce some new concepts we wrote about over the last few months.</p>
<p>A really good addition to this release is <a href="http://openstructs.org/structwsf/installation-guide">a brand new Installation Manual</a>. Hopefully people will be able to “easily” and properly install and set up a Web server to host these two packages.</p>
<p>All documentation files have been updated:</p>
<ul>
<li><a href="http://openstructs.org/structwsf/individual-ws-documentation">structWSF      Web Service Endpoints documentation</a></li>
<li><a href="http://openstructs.org/doc/code/structwsf/index.html">structWSF code      documentation</a></li>
<li><a href="http://constructscs.com/doc/code/construct/index.html">conStruct      code documentation</a></li>
</ul>
<p>You can download both software packages from here:</p>
<ul>
<li><a href="http://structwsf.googlecode.com/files/structwsf-1.0a4.zip">structWSF      version 1.0a4</a></li>
<li><a href="http://drupal.org/project/construct">conStruct version 6.x-1.x-dev-4</a> (Drupal should create the new package within 1 day)</li>
</ul>
<h2>An Amazon EC2/EBS Architecture</h2>
<p>Some of the changes in these new versions have been made to help create, set up and maintain Web servers that host structWSF and conStruct instances.</p>
<p>At Structured Dynamics, we have developed and use a server architecture that leverages Amazon cloud computing services such as EC2, EBS and Elastic IP. Such an architecture gives us the flexibility to easily maintain and upgrade server instances, to instantly create new <strong>structWSF</strong> instances in one click (without performing all the installation steps every time), etc.</p>
<p>You can contact us for more information about these EC2 AMIs and EBS Volumes that we developed for this purpose. Here is an overview of the architecture that is now in place:</p>
<p><img class="aligncenter size-full wp-image-1025" title="structwsf_amazon" src="http://fgiasson.com/blog/wp-content/uploads/2010/01/structwsf_amazon.png" alt="structwsf_amazon" width="501" height="446" /></p>
<p>There is a clear separation of concerns between three major things:</p>
<ul>
<li>Software &amp; libraries</li>
<li>Configuration files</li>
<li>Data files.</li>
</ul>
<p>We chose to put all software and libraries needed to create a stand-alone <strong>structWSF</strong> instance in an EC2 AMI. This means that all the software needed to run a <strong>structWSF</strong> instance is present on the Virtuoso server running Ubuntu server.</p>
<p>Then we chose to put all configuration and data files on an EBS volume that we attach to, and mount on, the EC2 instance. You can think of an EBS volume as a physical hard drive: it can be mounted on a server instance, but it can&#8217;t be shared between multiple instances.</p>
<p>By splitting the software &amp; libraries, configuration and data files, we make sure that we can easily upgrade a <strong>structWSF</strong> server in production with the latest version of <strong>structWSF</strong> (its code base and all related software such as Virtuoso, Solr, etc). Since the configuration and data files are not on the EC2 instance, we can easily create a new EC2 instance by using the latest <strong>structWSF</strong> AMI we produced, and then mount the EBS volume holding the configuration and data files on the new (and upgraded) <strong>structWSF</strong> instance. That way, in a few clicks, we can fully upgrade a server in production without fear of disturbing the configuration or data files.</p>
<p>Additionally, we can easily create backups of configuration and data files at different intervals by using Amazon&#8217;s Snapshot technology.</p>
<p>Finally, we chose to put all related software and configuration files needed to run a <strong>conStruct</strong> instance in another, separate, EBS volume. That way, we have a clean <strong>structWSF</strong> AMI instance that can be upgraded at any time, and we can <em>plug</em> (mount) a <strong>conStruct</strong> instance (EBS volume) into a <strong>structWSF</strong> server at any time. This means that we can easily have <strong>structWSF</strong> instances with or without a <strong>conStruct</strong> instance. The same strategy can easily be used to create <em>plugin packages</em> that can be mounted on and unmounted from any <strong>structWSF</strong> instance at any time, depending on the needs.</p>
<p>All this makes the maintenance of <strong>structWSF</strong> server instances easier, simpler and faster.</p>
]]></content:encoded>
			<wfw:commentRss>http://fgiasson.com/blog/index.php/2010/01/20/new-versions-of-structwsf-and-construct/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>conStruct: a skin for structWSF</title>
		<link>http://fgiasson.com/blog/index.php/2009/08/12/construct-a-skin-for-structwsf/</link>
		<comments>http://fgiasson.com/blog/index.php/2009/08/12/construct-a-skin-for-structwsf/#comments</comments>
		<pubDate>Wed, 12 Aug 2009 21:10:51 +0000</pubDate>
		<dc:creator>Fred</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Structured Dynamics]]></category>
		<category><![CDATA[conStruct]]></category>
		<category><![CDATA[structWSF]]></category>

		<guid isPermaLink="false">http://fgiasson.com/blog/?p=946</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=conStruct: a skin for structWSF&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Semantic Web&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2009-08-12&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2009/08/12/construct-a-skin-for-structwsf/&amp;rft.language=English"></span>
As I said in my previous blog post, a conStruct instance is nothing more than a skin for one or multiple structWSF instances. conStruct is a user of a structWSF network.
But&#8230; what does that mean?
That means that each conStruct tool communicates with one or multiple structWSF instances. Each feature of conStruct comes from structWSF. The [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=conStruct: a skin for structWSF&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Semantic Web&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2009-08-12&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2009/08/12/construct-a-skin-for-structwsf/&amp;rft.language=English"></span>
<p>As I said <a href="http://fgiasson.com/blog/index.php/2009/08/10/re-introduction/">in my previous blog post</a>, a <a href="http://constructscs.com">conStruct</a> instance is nothing more than a skin for one or multiple <a href="http://openstructs.org/structwsf/">structWSF</a> instances. conStruct is a <em>user</em> of a structWSF network.</p>
<p>But&#8230; what does that mean?</p>
<p>That means that each conStruct tool communicates with one or multiple structWSF instances. Each feature of conStruct comes from structWSF. The only thing conStruct does is present information to users and give them some tools to manipulate the data.</p>
<h3>A network of structWSF instances</h3>
<p><a href="http://openstructs.org/structwsf/individual-ws-documentation">A structWSF instance is a set of web service endpoints</a>. Each endpoint gets registered in a network. Each query sent to any of the web service endpoints of the network gets authenticated (and possibly rejected) by the network.</p>
<p>All structWSF instances share the same basic web service endpoints. However, some specialized structWSF instances can add new functionality to the framework by developing new endpoints that do special things. Others can un-register services that have nothing to do with the mission of the instance, etc.</p>
<p>Not all structWSF instances are the same, but all of them share the same interface.</p>
<p>Individual people or organizations can choose to create structWSF nodes. The purposes can be quite different. Some organizations could choose to create structWSF nodes for internal purposes only: to help their departments share different kinds of data, for example. Some people could want to set up a structWSF node where they can archive and share all data specific to their hobbies. Whatever the use case, they want a platform to ingest, manage, interact with and publish data, publicly or privately.</p>
<p style="text-align: center;"><a href="http://fgiasson.com/blog/wp-content/uploads/2009/08/structwsf_networks.png"><img class="alignnone size-medium wp-image-947 aligncenter" title="structwsf_networks" src="http://fgiasson.com/blog/wp-content/uploads/2009/08/structwsf_networks-300x158.png" alt="" width="300" height="158" /></a></p>
<p>In the diagram above, we can see that different structWSF instances have been created and are maintained by different organizations, for different purposes. Some of the clients will communicate with these structWSF instances as public users of the datasets published on the node(s), and other users will access datasets that only they have access to.</p>
<p>As you can see, some users communicate with multiple structWSF instances. This means that these users care about data from different datasets, maintained by different organizations. Why and what for? We don&#8217;t know. It can be for any reason. It can be a web portal that aggregates all the information about a specific domain that is shared amongst multiple nodes, or it can be because the user gets information from his clients&#8217; networks to get things done.</p>
<p>What is important to keep in mind with the diagram above is that any kind of person, organization or system can leverage the <em>structured</em> data they have access to, even when that data is hosted by different organizations that make different datasets and different web service endpoints available. (Some organizations could even create a web service endpoint that works with their own dataset to expose special algorithms they use to disambiguate or tag entities, etc.)</p>
<h3>A network in action</h3>
<p>You are probably telling yourself: well, the grand vision is good&#8230; but where is the meat on the bone?</p>
<p>Let&#8217;s take a look at the <a href="http://constructscs.com/demos">conStructSCS sandbox demo</a>. You have <a href="http://constructscs.com/conStruct/dataset/">two datasets in there: (1) the Sweet Tools and (2) RePEc</a>. There is one thing that you probably won&#8217;t notice: the two datasets live on two different structWSF instances (each structWSF instance is hosted on a different web server). This means that if you perform a <a href="http://constructscs.com/conStruct/search/?query=rdf&amp;type=all&amp;dataset=all">search</a>, or a <a href="http://constructscs.com/conStruct/browse/">browse</a> query, all results you get in the conStruct user interface come from two totally different servers, with different data maintainers, hosted by different organizations, etc. Still, all results are displayed in the same user interface, which is the conStructSCS demo sandbox.</p>
<h3>Under the curtain</h3>
<p>Let&#8217;s take a look at what is happening. First, run this <a href="http://constructscs.com/conStruct/search/?query=rdf&amp;type=all&amp;dataset=all&amp;wsf_debug=2">search query for &#8220;rdf&#8221;</a>. Do you see what appears in the yellow box? This is a list of the queries exchanged between conStruct and two structWSF instances. You want more? Try this other <a href="http://constructscs.com/conStruct/search/?query=rdf&amp;type=all&amp;dataset=all&amp;wsf_debug=1">search query for &#8220;rdf&#8221;</a>. Now you also have access to the body of the messages.</p>
<p>For this demo sandbox, we enabled the &#8220;wsf_debug&#8221; parameter so that users of the sandbox can see how a conStruct node interacts with structWSF instances. If the value of this URL parameter is &#8220;1&#8221;, then the header and body of each query are displayed to the users. If the value is &#8220;2&#8221;, only the header is displayed.</p>
<p>This means that you can append the &#8220;&amp;wsf_debug=1&#8221; parameter to any URL of the demo sandbox and you will be able to see the messages exchanged between the systems. Why? Because <strong>all</strong> conStruct tools communicate with one or multiple web service endpoint(s) and one or multiple structWSF instances.</p>
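<p>If you want to add the parameter from a script rather than by hand, here is a minimal Python sketch using only the standard library. The <code>with_wsf_debug</code> helper is a hypothetical name of my own, not part of conStruct; it simply sets &#8220;wsf_debug&#8221; to 1 (header and body) or 2 (header only) on a sandbox URL, replacing any value already present.</p>

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_wsf_debug(url, level=1):
    """Return `url` with wsf_debug set to `level` (1: header and body, 2: header only)."""
    if level not in (1, 2):
        raise ValueError("wsf_debug level must be 1 or 2")
    parts = urlsplit(url)
    # Keep every existing parameter except a previous wsf_debug value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "wsf_debug"
    ]
    query.append(("wsf_debug", str(level)))
    return urlunsplit(parts._replace(query=urlencode(query)))


debug_url = with_wsf_debug("http://constructscs.com/conStruct/browse/", 2)
```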
<p>Now, let&#8217;s take a look at the output of the search query above.</p>
<ul type="disc">
<li>Web service query: [[url: <strong>http://localhost/ws/search/</strong>] [method: post] [mime: text/xml] [parameters: ] [execution time: <strong>0.279745101929</strong>]] (status: 200) OK &#8211; .</li>
<li>Web service query: [[url: <strong>http://bknetwork.org/ws/search/</strong>] [method: post] [mime: text/xml] [parameters: query=rdf&amp;types=all&amp;datasets=http%3A%2F%2Fbknetwork.org%2Fwsf%2Fdatasets%2F283%2F%3Bhttp%3A%2F%2Fconstructscs.com%2Fwsf%2Fdatasets%2F160%2F&amp;items=10&amp;page=0&amp;inference=on&amp;include_aggregates=true&amp;registered_ip=self%3A%3A0] [execution time: <strong>0.289397001266</strong>]] (status: 200) OK &#8211; .</li>
<li>Web service query: [[url: <strong>http://localhost/ws/dataset/read/</strong>] [method: get] [mime: text/xml] [parameters: uri=all&amp;registered_ip=self%3A%3A0] [execution time: <strong>0.123399972916</strong>]] (status: 200) OK &#8211; .</li>
<li>Web service query: [[url: <strong>/ws/dataset/read/</strong>] [method: get] [mime: text/xml] [parameters: uri=all&amp;registered_ip=self%3A%3A0] [execution time: <strong>0.18315911293</strong>]] (status: 200) OK &#8211; .</li>
</ul>
<p>Each bullet is a query sent to a specific structWSF instance. For each query, you have this information:</p>
<ul type="disc">
<li>URL of the web service endpoint to which the query was sent</li>
<li>HTTP method used to send the query</li>
<li>MIME type requested (the Accept HTTP header)</li>
<li>Parameters of the query</li>
<li>Time it took to execute the query (including network latency &amp; query processing)</li>
<li>Status of the query from the web service endpoint</li>
</ul>
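<p>If you copy these debug lines out of the yellow box, they are regular enough to be read by a script. The following Python sketch is only an illustration: the <code>parse_debug_line</code> function and its regular expression are hypothetical, written against the plain-text form of the lines listed above, and the debug output format is not a documented interface.</p>

```python
import re

# One bracketed field per piece of information listed above.
DEBUG_LINE = re.compile(
    r"\[\[url:\s*(?P<url>[^\]]*)\]\s*"
    r"\[method:\s*(?P<method>[^\]]*)\]\s*"
    r"\[mime:\s*(?P<mime>[^\]]*)\]\s*"
    r"\[parameters:\s*(?P<parameters>[^\]]*)\]\s*"
    r"\[execution time:\s*(?P<seconds>[0-9.]+)\]\]\s*"
    r"\(status:\s*(?P<status>\d+)\)"
)


def parse_debug_line(line):
    """Return the fields of one debug line as a dict, or None if it does not match."""
    match = DEBUG_LINE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    record["parameters"] = record["parameters"].strip()
    record["seconds"] = float(record["seconds"])
    record["status"] = int(record["status"])
    return record


sample = (
    "Web service query: [[url: http://localhost/ws/dataset/read/] [method: get] "
    "[mime: text/xml] [parameters: uri=all&registered_ip=self%3A%3A0] "
    "[execution time: 0.123399972916]] (status: 200) OK"
)
record = parse_debug_line(sample)
```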
<p>Since this conStruct instance is linked to two different structWSF instances, the search tool sends a search query to two different search web service endpoints. Additionally, it queries these structWSF instances to get the description of the searched datasets (to display the proper name of the datasets in the user interface).</p>
<p>Each query is validated by the structWSF instances to make sure that it is legitimate. If it is, then results are returned. Once these queries are sent and the answers received, the structSearch tool can generate the page and display it to the user.</p>
<p>Do you want more? Here is a list of queries sent by different conStruct tools to different web service endpoints:</p>
<ul type="disc">
<li><a href="http://constructscs.com/conStruct/browse/?wsf_debug=2">Browse      tool: listing datasets to browse</a></li>
<li><a href="http://constructscs.com/conStruct/browse/?browse=true&amp;attribute=all&amp;type=all&amp;dataset=http%3A%2F%2Fconstructscs.com%2Fwsf%2Fdatasets%2F122%2F&amp;page=0&amp;wsf_debug=1">Browse      tool: browsing a specific dataset</a></li>
<li><a href="http://constructscs.com/conStruct/dataset/?wsf_debug=2">Dataset      tool</a></li>
<li><a href="http://constructscs.com/conStruct/view/?uri=http%3A%2F%2Fconstructscs.com%2FconStruct%2Fdatasets%2F122%2Fresource%2FCerebra_Server&amp;dataset=http%3A%2F%2Fconstructscs.com%2Fwsf%2Fdatasets%2F122%2F&amp;wsf_debug=2">View      page</a></li>
</ul>
<p><strong>(Note: this debug information tab has been added so that people can see what is happening under the hood. However, this information is only accessible to the registered conStruct instance and the administrator of that instance.)</strong></p>
<h3>Do it yourself, from your desktop computer</h3>
<p>I said that people or organizations that create data on these structWSF instances can manage and manipulate it from anywhere, not only from within conStruct. Let&#8217;s test this.</p>
<p>I changed the permissions on the Sweet Tools List dataset so that it is publicly available for reading. That way, anyone can send <a href="http://curl.haxx.se/">Curl</a> queries against the dataset on that structWSF instance.</p>
<p>Now, let&#8217;s try a couple of queries to different web service endpoints. Let&#8217;s start with a query for the keyword &#8220;rdf&#8221; on the Sweet Tools dataset:</p>
<p style="padding-left: 30px;"><em>curl -H "Accept: text/xml" "http://constructscs.com/ws/search/" -d "query=rdf&amp;types=all&amp;datasets=http%3A%2F%2Fconstructscs.com%2Fwsf%2Fdatasets%2F122%2F&amp;items=10&amp;inference=on"</em></p>
<p>What you will get for this query is a list of 10 instance records that match this query. You don&#8217;t like the internal XML representation of the system? Then try the internal JSON representation by running this query:</p>
<p><a name="OLE_LINK17"></a></p>
<p>Maybe this is not good enough for you? Then let&#8217;s try RDF+XML:</p>
<p style="padding-left: 30px;"><em>curl -H "Accept: application/rdf+xml" "http://constructscs.com/ws/search/" -d "query=rdf&amp;types=all&amp;datasets=http%3A%2F%2Fconstructscs.com%2Fwsf%2Fdatasets%2F122%2F&amp;items=10&amp;inference=on"</em></p>
<p>I think you get the point, so I won&#8217;t continue.</p>
<p>Now, let&#8217;s send a query to get all the datasets accessible to you:</p>
<p style="padding-left: 30px;"><em>curl -H "Accept: application/rdf+xml" "http://constructscs.com/ws/auth/lister/" -d "mode=adataset"</em></p>
<p>If you can query all these things with Curl, this means that anything can query these services. Standalone software, as well as other online applications, can be developed to leverage these content nodes.</p>
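<p>To illustrate that point, here is a minimal Python sketch that builds the same search request as the Curl examples above, using only the standard library. The function name <em>build_search_request</em> is made up for this illustration; the endpoint, dataset URI and parameters are copied from the Curl commands. The request is only built, not sent, since the availability of the endpoint is not guaranteed.</p>

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint and dataset URI copied from the Curl examples above.
SEARCH_ENDPOINT = "http://constructscs.com/ws/search/"
SWEET_TOOLS_DATASET = "http://constructscs.com/wsf/datasets/122/"


def build_search_request(keyword, mime_type="text/xml", items=10):
    """Build (but do not send) the POST request issued by the Curl examples.

    The function name and defaults are assumptions made for this sketch.
    """
    params = {
        "query": keyword,
        "types": "all",
        "datasets": SWEET_TOOLS_DATASET,
        "items": str(items),
        "inference": "on",
    }
    # urlencode percent-encodes the dataset URI, as in the Curl command.
    body = urlencode(params).encode("ascii")
    # Supplying a body makes urllib use POST, like curl -d does.
    return Request(SEARCH_ENDPOINT, data=body, headers={"Accept": mime_type})


if __name__ == "__main__":
    request = build_search_request("rdf", mime_type="application/rdf+xml")
    print(request.get_method(), request.full_url)
    print(request.get_header("Accept"))
    print(request.data.decode("ascii"))
```

<p>Passing the resulting object to <em>urllib.request.urlopen</em> would send the query. Changing only the <em>mime_type</em> argument is what switches the representation, exactly as changing the Accept header does with Curl.</p>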
<h3>Conclusion</h3>
<p>As you probably learned from this blog post, one of the strengths of structWSF is that it creates networks of structured content nodes that can be accessed by anything, from anywhere, publicly or privately.</p>
<p>As you noticed, all this is not only about integrating any kind of data, but also about publishing it in a flexible way.</p>
]]></content:encoded>
			<wfw:commentRss>http://fgiasson.com/blog/index.php/2009/08/12/construct-a-skin-for-structwsf/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Re-Introduction</title>
		<link>http://fgiasson.com/blog/index.php/2009/08/10/re-introduction/</link>
		<comments>http://fgiasson.com/blog/index.php/2009/08/10/re-introduction/#comments</comments>
		<pubDate>Mon, 10 Aug 2009 21:46:42 +0000</pubDate>
		<dc:creator>Fred</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Structured Dynamics]]></category>
		<category><![CDATA[conStruct]]></category>
		<category><![CDATA[structWSF]]></category>

		<guid isPermaLink="false">http://fgiasson.com/blog/?p=945</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Re-Introduction&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Semantic Web&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2009-08-10&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2009/08/10/re-introduction/&amp;rft.language=English"></span>
I haven&#8217;t been active on this blog for more than half a year now. I was telling myself that I was too busy coding to write anything meaningful to my readers. I did write a couple of things, but nothing of importance related to all the things I was working on. I did publish announcements [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Re-Introduction&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Semantic Web&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2009-08-10&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2009/08/10/re-introduction/&amp;rft.language=English"></span>
<p>I haven&#8217;t been active on this blog for more than half a year now. I was telling myself that I was too busy coding to write anything meaningful for my readers. I did write a couple of things, but nothing of importance about what I was working on. I did publish announcements and such, but didn&#8217;t really take the time to write about these things. A lot has been done and published recently, but little has been said. So, let&#8217;s correct course: I will share more about what I am currently working on, the concepts I am playing with, the systems I am releasing, etc. Let&#8217;s get back to writing about the things that I really believe in, that are valuable to me, and that I put all my time, effort and energy into.</p>
<p>As you probably know, my company <a href="http://structureddynamics.com">Structured Dynamics</a> released two products: <a href="http://openstructs.org/structwsf/">structWSF</a> and <a href="http://constructscs.com">conStruct</a>. I spent the last six months developing them. However, what are they? Why did I spend all my time working on these products? Why do they matter? Why do I think that they are valuable?</p>
<p>Let me outline what they are, what they do and what they are useful for. Then consider whether they could be of value to you, your organization, your enterprise, etc.</p>
<h3>StructWSF</h3>
<p><a href="http://fgiasson.com/blog/wp-content/uploads/2009/06/triple_120.png"><img class="alignleft size-full wp-image-941" title="triple_120" src="http://fgiasson.com/blog/wp-content/uploads/2009/06/triple_120.png" alt="" width="120" height="120" /></a><a href="http://openstructs.org/structwsf/">StructWSF</a> is a web services framework (WSF) that basically does four things: it ingests, manages, interacts with and publishes data. What kind of data? Any kind of data.</p>
<p><strong>Ingesting</strong>: the aim is to be able to ingest data from any data source (that is, data formatted in any language, or described using any vocabulary or schema). The framework has to be able to ingest any data, from any data source, with a single conversion step.</p>
<p><strong>Managing</strong>: the aim is to be able to manage the data. Managing the data means being able to collectively (with permissions and authentication) manage the datasets available in a framework instance. It means being able to create, modify, delete or update data. It also means being able to browse and search the data. It means making the data publicly available, or restricting access to a user or group of users. It means merging datasets together too.</p>
<p><strong>Interacting</strong>: but there is another facet to data management. We don&#8217;t only want to be able to manage data in a locked system. What we want is to be able to manage our data from anywhere. It can be from my browser, from my website, from some other application on my desktop, from my home, from my office: from anywhere. All functions of a structWSF instance are accessible as web service endpoints. This means that you can perform any action, on your data, from anywhere you want: from a conStruct node or from a local Curl query. This is, I think, how people and organizations want to be able to manage the data they create and curate.</p>
<p><strong>Publishing</strong>: as with ingesting, we want to be able to publish, to communicate the data we create to other people, other organizations or other entities. We want to do this in such a way that these external entities don&#8217;t have to reinvent themselves. We want to be able to communicate data the way they understand it: using any format and any vocabulary/schema.</p>
<p>The mindset behind structWSF is the following: we can ingest any kind of data, we can manage that data in multiple ways, we can interact with that data from anywhere and we can publish this data back in any way. structWSF is frictionless in the sense of data communication between systems, users and entities.</p>
<h3>conStruct</h3>
<p><a href="http://fgiasson.com/blog/wp-content/uploads/2009/06/construct_logo_120.png"><img class="alignright size-full wp-image-942" title="construct_logo_120" src="http://fgiasson.com/blog/wp-content/uploads/2009/06/construct_logo_120.png" alt="" width="120" height="120" /></a><a href="http://constructscs.com">conStruct</a> is just a skin over one, or multiple, structWSF instances. The conStruct software is an example of how a system can interact with a structWSF data provider. conStruct is a suite of generic tools that can be used to search, browse, visualize (template), import, export, create, delete and update data. All these tools interact with one or multiple structWSF functions by using their web service endpoints.</p>
<p>Since conStruct can interact with a single structWSF instance, it can also interact with multiple structWSF instances. That means that conStruct can be a user interface that communicates with multiple data providers (structWSF instances) and displays all the results, from all these providers, in a canonical user interface.</p>
<p>But as I said, conStruct is <em>one</em> skin over structWSF instances. We could think about the integration of structWSF into other CMSs. We could even think about having different CMSs integrate with the same structWSF instance(s) so that if one user creates, updates or deletes some data, the change appears in the other CMSs as well.</p>
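<p>The multi-provider idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how conStruct is implemented: <em>search_all_instances</em> and the <em>fetch</em> callable are hypothetical names, and <em>fetch</em> stands in for the real HTTP call and the parsing of the response.</p>

```python
def search_all_instances(endpoints, keyword, fetch):
    """Send the same keyword to every endpoint and merge the answers.

    `fetch(endpoint, keyword)` is a hypothetical callable returning a list
    of record dicts for one structWSF instance; it stands in for the real
    HTTP call and response parsing.
    """
    merged = []
    for endpoint in endpoints:
        for record in fetch(endpoint, keyword):
            # Remember which provider each record came from, so that a
            # user interface could display it next to the result.
            merged.append({**record, "provider": endpoint})
    return merged


if __name__ == "__main__":
    # Canned, made-up data standing in for two structWSF instances.
    canned = {
        "http://example.org/ws/search/": [{"label": "RDF parser"}],
        "http://example.com/ws/search/": [{"label": "RDF browser"}],
    }

    def fake_fetch(endpoint, keyword):
        return [r for r in canned[endpoint]
                if keyword.lower() in r["label"].lower()]

    for row in search_all_instances(list(canned), "rdf", fake_fetch):
        print(row["provider"], row["label"])
```

<p>Keeping the fetching step as a parameter means the merging logic does not care where a provider lives or how it is reached, which is the property described above.</p>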
<h3>The Magic Twist</h3>
<p>However, all this is done with a twist: everything is structured. This means that everything in the system has a structure: it is described using some vocabulary (full-blown ontologies or naive vocabularies). This enables all kinds of valuable functionality: inferencing in search and browse activities, filtering on types and attributes, and help with integrating different datasets from different systems and organizations.</p>
<p>This is the magic twist that makes this system different: everything in there is structured in such a way that everything can be ingested and published in any format, and in such a way that basic inferencing or more complex reasoning is possible. It integrates data and lets users use it the way they want, from where they are. The capabilities are there; use them if you need them.</p>
<h3>Next steps</h3>
<p>The next steps for me will be to describe the features of the system: how the data is managed, how permissions work, what granularity of permissions is available, etc. These will be more technical blog posts, but they will show you the full potential of the systems and concepts I have been talking about in this blog post.</p>
]]></content:encoded>
			<wfw:commentRss>http://fgiasson.com/blog/index.php/2009/08/10/re-introduction/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Release of structWSF, conStruct and the Community Web Site</title>
		<link>http://fgiasson.com/blog/index.php/2009/07/02/release-of-structwsf-construct-and-the-community-web-site/</link>
		<comments>http://fgiasson.com/blog/index.php/2009/07/02/release-of-structwsf-construct-and-the-community-web-site/#comments</comments>
		<pubDate>Thu, 02 Jul 2009 19:59:47 +0000</pubDate>
		<dc:creator>Fred</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Structured Dynamics]]></category>
		<category><![CDATA[conStruct]]></category>
		<category><![CDATA[structWSF]]></category>

		<guid isPermaLink="false">http://fgiasson.com/blog/?p=943</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Release of structWSF, conStruct and the Community Web Site&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Semantic Web&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2009-07-02&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2009/07/02/release-of-structwsf-construct-and-the-community-web-site/&amp;rft.language=English"></span>

The last few months have been challenging in term of amount of work to get done, in focusing on deliverables and in getting ready for the release of conStruct and structWSF sources codes, documentations, tutorials, web sites and demos.
I am now really happy to be able to finally announce the release of both software code [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=Release of structWSF, conStruct and the Community Web Site&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Semantic Web&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2009-07-02&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2009/07/02/release-of-structwsf-construct-and-the-community-web-site/&amp;rft.language=English"></span>
<p class="MsoNormal">The last few months have been challenging in terms of the amount of work to get done, of focusing on deliverables and of getting ready for the release of the <a href="http://constructscs.com">conStruct</a> and <a href="http://openstructs.org/structwsf">structWSF</a> source code, documentation, tutorials, web sites and demos.</p>
<p class="MsoNormal">I am now really happy to finally announce the release of the source code of both projects, along with a new <a name="OLE_LINK2"></a><a href="http://community.openstructs.org/"><span>development community website</span></a><span> where users and developers can exchange ideas about these two new projects.</span></p>
<p class="MsoNormal">The biggest milestone of the last months is now behind us. However, this is just the beginning of everything!</p>
<p class="MsoNormal">I think that many things have been written about these two projects already. I don&#8217;t want to write any tutorial at this point. So the only thing I will do right now is point you to the most relevant documentation, web sites, blog posts and demos for each project. The next step will be to write about specific use cases, features, etc.</p>
<p class="MsoNormal">
<h3>Community Web Site</h3>
<p class="MsoNormal">The <a href="http://community.openstructs.org">community Web site</a> is a place where developers and users of structWSF and conStruct can meet to talk about both projects, to report bugs and issues, to submit new enhancements, to find tips and tricks, etc.</p>
<p class="MsoNormal">I suggest that you <a href="http://community.openstructs.org/user/register">create a new user profile on the community Web site</a> if you are interested in communicating with other members.</p>
<ul type="disc">
<li class="MsoNormal"><a href="http://community.openstructs.org/">Community Web site</a>
<ul type="circle">
<li class="MsoNormal"><a href="http://community.openstructs.org/forum">Discussion Forum</a></li>
<li class="MsoNormal"><a href="http://wiki.openstructs.org/wiki/Welcome">Wiki</a></li>
<li class="MsoNormal"><a href="http://community.openstructs.org/issues">Issues tracker</a></li>
<li class="MsoNormal"><a href="http://community.openstructs.org/source-code/code-repository">Core       source repositories</a></li>
<li class="MsoNormal"><a href="http://community.openstructs.org/source-code/documentation">Code       documentation</a></li>
</ul>
</li>
</ul>
<h3>structWSF</h3>
<p class="MsoNormal"><a href="http://openstructs.org/structwsf">structWSF</a> is a platform-independent Web services framework for accessing and exposing structured RDF data. Its central organizing perspective is that of the dataset. These datasets contain instance records, with the structural relationships amongst the data and their attributes and concepts defined via ontologies (schema with accompanying vocabularies).</p>
<p class="MsoNormal">The structWSF middleware framework is generally RESTful in design and is based on HTTP and Web protocols and open standards. The initial structWSF framework comes packaged with a baseline set of about a dozen Web services in CRUD, browse, search and export and import. All Web services are exposed via APIs and SPARQL endpoints. Each request to an individual Web service returns an HTTP status and optionally a document of resultsets. Each results document can be serialized in many ways, and may be expressed as either RDF or pure XML.</p>
<ul type="disc">
<li class="MsoNormal"><a name="OLE_LINK7"></a><a href="http://openstructs.org/structwsf"><span>Main Web site</span></a>
<ul type="circle">
<li class="MsoNormal"><a href="http://openstructs.org/downloads"><span>Download</span></a></li>
<li class="MsoNormal"><a href="http://openstructs.org/structwsf/architecture"><span>Architecture</span></a></li>
<li class="MsoNormal"><a href="http://openstructs.org/structwsf/individual-ws-documentation"><span>RESTful endpoints documentation</span></a></li>
<li class="MsoNormal"><a href="http://openstructs.org/doc/code/structwsf/index.html"><span>Source code documentation</span></a></li>
<li class="MsoNormal"><span><a name="OLE_LINK1"></a></span><a href="http://wiki.openstructs.org/wiki/Blog_Posts"><span><span>Interesting       blog posts</span></span></a></li>
<li class="MsoNormal"><a href="http://wiki.openstructs.org/wiki/StructWSF_Installation"><span>Installation manual (early draft)</span></a></li>
</ul>
</li>
</ul>
<p class="MsoNormal">
<h3>conStruct</h3>
<p class="MsoNormal"><a href="http://constructscs.com">conStruct</a> is a distro of the Drupal framework that aims to set a new standard in data integration and as a structured content system (SCS). With conStruct, you can let your data and its structure drive your applications. You can easily interoperate your diverse internal information with public content on the Web. And you can leverage a platform designed from the ground up for knowledge management and collaboration.</p>
<ul type="disc">
<li class="MsoNormal"><a name="OLE_LINK3"></a><a name="OLE_LINK4"></a><a href="http://constructscs.com/"><span><span>Main Web site</span></span></a>
<ul type="circle">
<li class="MsoNormal"><a href="http://constructscs.com/downloads"><span><span>Download</span></span></a></li>
<li class="MsoNormal"><a href="http://constructscs.com/features/design-overview"><span><span>Design       overview</span></span></a></li>
<li class="MsoNormal"><a href="http://constructscs.com/doc/code/construct/index.html"><span><span>Source       code documentation</span></span></a></li>
<li class="MsoNormal"><a href="http://constructscs.com/features"><span><span>Current features</span></span></a></li>
<li class="MsoNormal"><a href="http://constructscs.com/demos"><span><span>Online demos</span></span></a></li>
<li class="MsoNormal"><a href="http://constructscs.com/documentation/instructions"><span><span>Tools       instructions manuals</span></span></a></li>
<li class="MsoNormal"><a href="http://wiki.openstructs.org/wiki/Blog_Posts"><span><span>Interesting       blog posts</span></span></a></li>
<li class="MsoNormal"><a href="http://constructscs.com/doc/code/construct/index.html"><span><span>Installation       manual (early draft)</span></span></a></li>
</ul>
</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://fgiasson.com/blog/index.php/2009/07/02/release-of-structwsf-construct-and-the-community-web-site/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>structWSF and conStruct websites unveiled</title>
		<link>http://fgiasson.com/blog/index.php/2009/06/16/structwsf-and-construct-websites-unveiled/</link>
		<comments>http://fgiasson.com/blog/index.php/2009/06/16/structwsf-and-construct-websites-unveiled/#comments</comments>
		<pubDate>Tue, 16 Jun 2009 20:30:39 +0000</pubDate>
		<dc:creator>Fred</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Structured Dynamics]]></category>
		<category><![CDATA[conStruct]]></category>
		<category><![CDATA[structWSF]]></category>

		<guid isPermaLink="false">http://fgiasson.com/blog/?p=936</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=structWSF and conStruct websites unveiled&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Semantic Web&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2009-06-16&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2009/06/16/structwsf-and-construct-websites-unveiled/&amp;rft.language=English"></span>
I am proud to announce the release the websites of two of our products to come: structWSF and conStruct. Both products will be available in open source under the Apache 2 license. Mike just unveiled and demoed the two projects in his talk at SemTech 2009.
As we describe them on Structured Dynamics&#8216; website:
structWSF
structWSF  is [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=structWSF and conStruct websites unveiled&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Semantic Web&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2009-06-16&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2009/06/16/structwsf-and-construct-websites-unveiled/&amp;rft.language=English"></span>
<p>I am proud to announce the release of the websites for two of our upcoming products: <a href="http://openstructs.org">structWSF</a> and <a href="http://constructscs.com">conStruct</a>. Both products will be available in open source under the Apache 2 license. <a href="http://mkbergman.com">Mike</a> just unveiled and demoed the two projects in <a href="http://www.semantic-conference.com/session/1806/">his talk at SemTech 2009</a>.</p>
<p>As we describe them on <a href="http://structureddynamics.com/">Structured Dynamics</a>&#8217; website:</p>
<h2>structWSF</h2>
<p><a href="http://fgiasson.com/blog/wp-content/uploads/2009/06/triple_120.png"><img class="alignleft size-full wp-image-941" title="triple_120" src="http://fgiasson.com/blog/wp-content/uploads/2009/06/triple_120.png" alt="" width="120" height="120" /></a><a href="http://openstructs.org">structWSF </a> is a platform-independent Web services framework for accessing and exposing structured  RDF data. Its central organizing perspective is that of the dataset. These datasets contain instance records, with the structural relationships amongst the data and their attributes and concepts defined via ontologies (schema with accompanying vocabularies).</p>
<p>The structWSF middleware framework is generally RESTful in design and is based on HTTP and Web protocols and open standards. The initial structWSF framework comes packaged with a baseline set of about a dozen Web services in CRUD, browse, search and export and import.</p>
<p>All Web services are exposed via APIs and SPARQL endpoints. Each request to an individual Web service returns an HTTP status and optionally a document of resultsets. Each results document can be serialized in many ways, and may be expressed as either RDF or pure XML.</p>
<p>In its initial release, structWSF has direct interfaces to the <a href="http://virtuoso.openlinksw.com/wiki/main/Main/">Virtuoso</a> RDF triple store (via ODBC, and later HTTP) and the <a href="http://lucene.apache.org/solr/">Solr</a> faceted, full-text search engine (via HTTP). However, structWSF has been designed to be fully platform-independent. Support for additional datastores and engines is planned. The design also allows other specialized systems to be included, such as analysis or advanced inference engines.</p>
<p>The framework is open source (Apache 2 license) and designed for extensibility. structWSF and its extensions and enhancements are distributed and documented on the OpenStructs Web site.</p>
<h2><a href="http://fgiasson.com/blog/wp-content/uploads/2009/06/construct_logo_120.png"><img class="alignleft size-full wp-image-942" title="construct_logo_120" src="http://fgiasson.com/blog/wp-content/uploads/2009/06/construct_logo_120.png" alt="" width="120" height="120" /></a>conStruct</h2>
<p><a href="http://constructscs.com">conStruct SCS</a> is a structured content system that extends the basic <a href="http://drupal.org/">Drupal</a> content management framework. conStruct  enables structured data and its controlling vocabularies (ontologies) to drive applications and user interfaces.</p>
<p>Users and groups can flexibly access and manage any or all datasets exposed by the system depending on roles and permissions. Report and presentation templates are easily defined, styled or modified based on the underlying datasets and structure. Collaboration networks can readily be established across multiple installations and non-Drupal endpoints. Powerful linked data integration can be included to embrace data anywhere on the Web.</p>
<p>Depending on roles and permissions, a given user may or may not see specific datasets or tools within the Drupal interface. Search and browse results are similarly sequestered depending on access rights.</p>
<p>conStruct provides Drupal-level CRUD (create &#8211; read &#8211; update &#8211; delete), data display templating, faceted browsing, full-text search, and import and export over structured data stores based on RDF. It also provides a system for adding further tools and extensions for this structured data. conStruct SCS is built on the platform-independent structWSF Web services framework.</p>
<p>Like Drupal and structWSF, conStruct is free and open source (GPL license). Versions of conStruct SCS adapted to other content management systems (CMS) are planned.</p>
<h2>Next</h2>
<p>The alpha version of the code with all the proper documentation will be released later this summer. Everybody will be able to contribute to the project by enhancing/developing the core code or by extending it with new modules and web services.  Stay tuned!</p>
]]></content:encoded>
			<wfw:commentRss>http://fgiasson.com/blog/index.php/2009/06/16/structwsf-and-construct-websites-unveiled/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>RDF Aggregates and Full Text Search on Steroids with Solr</title>
		<link>http://fgiasson.com/blog/index.php/2009/04/29/rdf-aggregates-and-full-text-search-on-steroids-with-solr/</link>
		<comments>http://fgiasson.com/blog/index.php/2009/04/29/rdf-aggregates-and-full-text-search-on-steroids-with-solr/#comments</comments>
		<pubDate>Wed, 29 Apr 2009 20:46:07 +0000</pubDate>
		<dc:creator>Fred</dc:creator>
				<category><![CDATA[Semantic Web]]></category>
		<category><![CDATA[Structured Dynamics]]></category>
		<category><![CDATA[conStruct]]></category>
		<category><![CDATA[structWSF]]></category>

		<guid isPermaLink="false">http://fgiasson.com/blog/?p=923</guid>
		<description><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=RDF Aggregates and Full Text Search on Steroids with Solr&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Semantic Web&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2009-04-29&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2009/04/29/rdf-aggregates-and-full-text-search-on-steroids-with-solr/&amp;rft.language=English"></span>
Preamble
As I explained in my latest blog post, I am now starting to talk about a couple of things I have been working on in the last few months that will lead to a release, by Structured Dynamics, in the coming months. This blog post is the first step into that path. Enjoy!
Introduction
I have been [...]]]></description>
			<content:encoded><![CDATA[	
	<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.title=RDF Aggregates and Full Text Search on Steroids with Solr&amp;rft.aulast=Giasson&amp;rft.aufirst=Frédérick&amp;rft.subject=Semantic Web&amp;rft.subject=Structured Dynamics&amp;rft.subject=conStruct&amp;rft.subject=structWSF&amp;rft.source=Frederick Giasson&#8217;s Weblog&amp;rft.date=2009-04-29&amp;rft.type=blogPost&amp;rft.format=text&amp;rft.identifier=http://fgiasson.com/blog/index.php/2009/04/29/rdf-aggregates-and-full-text-search-on-steroids-with-solr/&amp;rft.language=English"></span>
<h3><strong>Preamble</strong></h3>
<p>As I explained in my latest blog post, I am now starting to talk about a couple of things I have been working on in the last few months that will lead to a release, by <a href="http://structureddynamics.com">Structured Dynamics</a>, in the coming months. This blog post is the first step on that path. Enjoy!</p>
<h3><strong>Introduction</strong></h3>
<p>I have been working with RDF, SPARQL and triple stores for years now. I have created many prototypes and online services using these technologies. Being able to describe everything with RDF, and to index everything in a triple store that you can easily query the way you want using SPARQL, is priceless. Using RDF saves development and maintenance costs because of the flexibility of the store (triple store), the query language (SPARQL), and the associated schemas (ontologies).</p>
<p>However, even if this set of technologies can do everything, quickly and efficiently, it is not necessarily optimal for all tasks you have to do. As we will see in this blog post, we use RDF for describing, integrating and managing any kind of data (structured or unstructured) that exists out there. RDF + Ontologies are what we use as the canonical expression of any kind of data. It is the triple store that we use to aggregate, index and manage that data, from one or multiple data sources. It is the same triple store that we use to feed any other system that can be used in our architecture. The triple store is the data orchestrator in any such architecture.</p>
<p>In this blog post I will show you how this orchestrator can be used to create <a href="http://lucene.apache.org/solr/">Solr</a> indexes that are used in the architecture to perform three functions that Solr has been built to perform optimally: full-text search, aggregates and filtering. So, while a triple store can perform these functions, it is not optimal for what we have to do.</p>
<h3><strong>Overview</strong></h3>
<p>The idea is to use the RDF data model and a triple store to populate the Solr schema index. We leverage the powerful and flexible data representation framework (RDF), in conjunction with the piece of software that lets you do whatever you want with that data (Virtuoso), to feed a carefully tailored Solr schema index to optimally perform three things: full-text search, aggregates and filtering. Also, we want to leverage the ontologies used to describe this data to be able to infer things vis-&#224;-vis these indexed resources in Solr. This leverage enables us to use inference on full-text search, aggregates and filtering, in Solr! This is quite important since you will be able to perform full-text searches, filtered by types that are inferred!</p>
<p>Some people will tell me that they can do this with a traditional relational database management system, and yes, they can. However, the combination of RDF, SPARQL and a triple store is powerful enough to integrate any kind of data, from any data source, and flexible enough to save precious development and maintenance resources, and therefore money.</p>
<h3>Solr</h3>
<p>What we want to do is to create some kind of &#8220;RDF&#8221; Solr index. We want to be able to perform full-text searches on RDF literals; we want to be able to aggregate RDF resources by the properties that describe them, and their types; and finally we want to be able to do all the searches, aggregation and filtering using inference.</p>
<p>So the first step is to create the proper Solr schema that will let you do all these wonderful things.</p>
<p><a href="http://code.google.com/p/structwsf/source/browse/trunk/framework/solr_schema.xml">The current Solr index schema can be downloaded here.</a> <em>(View source if simply clicking with your browser.)</em></p>
<p>Now, let&#8217;s discuss this schema.</p>
<h3>Solr Index Schema</h3>
<p>A Solr schema is composed of basically two things: fields and type of fields. For this schema, we only need two types of fields: string and text. If you want more information about these two types, I would refer you to the <a href="http://lucene.apache.org/solr/">Solr documentation</a> for a complete explanation of how they work. For now, just consider them as strings and texts.</p>
<p>What interests us is the list of defined fields of this schema (again, see <a href="http://code.google.com/p/structwsf/source/browse/trunk/framework/solr_schema.xml">download</a>):</p>
<ul type="disc">
<li><em>uri</em> [1] &#8211; Unique resource identifier of the record</li>
<li><em>type</em> [1-N] &#8211; Type of the record</li>
<li><em>inferred_type</em> [0-N] &#8211; Inferred type of the record</li>
<li><em>property</em> [0-N] &#8211; Property identifier used to describe the resource and that has a literal as object</li>
<li><em>text</em> [0-N] (same number as <em>property</em>) &#8211; Text of the literal of the property</li>
<li><em>object_property</em> [0-N] &#8211; Property identifier used to describe the resource where the object is a reference to another resource and that this other resource can be described by a literal</li>
<li><em>object_label</em> [0-N] (same number as <em>object_property</em>) &#8211; Text used to refer to the resource referenced by the <em>object_property</em></li>
</ul>
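<p>To make the field list above more concrete, here is a minimal Python sketch of how the triples describing one resource could be mapped onto the <em>uri</em>, <em>type</em>, <em>property</em> and <em>text</em> fields. The <code>Triple</code> tuple, the <code>build_solr_doc</code> function and the example.org URIs are hypothetical names chosen for illustration; this is not the structWSF code.</p>

```python
from collections import namedtuple

# Hypothetical triple representation: object_is_literal tells literals from URIs.
Triple = namedtuple("Triple", "subject predicate obj object_is_literal")

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def build_solr_doc(uri, triples):
    """Map the triples describing one resource onto the schema fields."""
    doc = {"uri": uri, "type": [], "property": [], "text": []}
    for t in triples:
        if t.subject != uri:
            continue
        if t.predicate == RDF_TYPE:
            doc["type"].append(t.obj)
        elif t.object_is_literal:
            # property and text are parallel lists: the n-th text value
            # is the literal of the n-th property.
            doc["property"].append(t.predicate)
            doc["text"].append(t.obj)
    return doc


# Made-up example data.
triples = [
    Triple("http://example.org/article-1", RDF_TYPE,
           "http://example.org/Article", False),
    Triple("http://example.org/article-1",
           "http://purl.org/dc/elements/1.1/title", "RDF and Solr", True),
]
doc = build_solr_doc("http://example.org/article-1", triples)
```

<p>Keeping <em>property</em> and <em>text</em> as parallel multi-valued fields is what lets a result page say in which property a keyword was found.</p>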
<h3>Full Text Search</h3>
<p>An RDF document is a set of multiple triples describing one or multiple resources. Saying that you are doing full-text searches on RDF documents is certainly not the same thing as saying that you are doing full-text searches on traditional text documents. When you describe a resource, you rarely have more than a couple of strings, with a couple of words each. It is generally the name of the entity, or a label that refers to it. You will have different numbers, and sometimes some description (a short biography, or definition, or summary, as examples). However, unless you index an entire text document, the &#8220;textual abundance&#8221; is quite poor compared to an indexed corpus of documents.</p>
<p>In any case, this doesn&#8217;t mean that there are no advantages in doing full-text searches on RDF documents (so, on RDF resource descriptions). But, if we are going to do so, let&#8217;s do so completely, and in a way that meets users&#8217; expectations for full-text document search.  By applying this mindset, we can apply some cool new tricks!</p>
<p>Intuitively the first implementation of a full-text search index on RDF documents would simply make a key-value pair assignment between a resource URI and its related literals. So, when you perform a full-text search for &#8220;Bob&#8221;, you get a reference on all the resources that have &#8220;Bob&#8221; in one of the literals that describe these resources.</p>
<p>This is good, but this is not enough, because it breaks the most basic behavior that any user of a full-text search engine expects.</p>
<p>Let&#8217;s say that I know the author of many articles is named &#8220;Bob Carron&#8221;. I have no idea what the titles of the articles he wrote are, so I want to search for them. With the system described above, if I do a search for &#8220;Bob Carron&#8221;, I will most likely get back as a result the reference to &#8220;Bob Carron&#8221;, the author. This is good, but this is not enough.</p>
<p>On the results page, I want the list of all articles that Bob wrote! Because of the nature of RDF, I don&#8217;t have this &#8220;full-text&#8221; information of &#8220;Bob&#8221; in the description of the articles he wrote. Most likely, in RDF, Bob will be related to the articles he wrote by reference (object reference with the URIs of these articles), <em>i.e.</em>, &lt;this-article&gt; &lt;author&gt; &lt;bob-uri&gt;. As you can notice, we won&#8217;t get back any articles in the resultset for the full-text query &#8220;Bob Carron&#8221; because this textual information doesn&#8217;t exist in the index at the level of the articles he wrote!</p>
<p>So, what can we do?</p>
<p>A simple trick does the job nicely. When we create the Solr index, we add the textual information of the resources being referenced by the indexed resources. For example, when we create the Solr document that describes one of the articles written by Bob, we add the literal that refers to the resource(s) referenced by this article. In this case, we add the name of the author(s) to the full-text record of that article. So, with this simple enhancement, if we do a search for &#8220;Bob Carron&#8221;, we will now also get the list of all resources that refer to Bob (articles he wrote, other people that know him, etc.).</p>
<p style="text-align: center;"><a href="http://fgiasson.com/blog/wp-content/uploads/2009/04/text69217.png"><img class="size-medium wp-image-932 aligncenter" title="object property" src="http://fgiasson.com/blog/wp-content/uploads/2009/04/text69217-300x129.png" alt="" width="300" height="129" /></a></p>
<p>So, this is the goal of the &#8220;object_property&#8221; and &#8220;object_label&#8221; fields of the Solr index. In the schema above, the &#8220;object_property&#8221; would be &#8220;author&#8221; and the &#8220;object_label&#8221; would be &#8220;Bob Carron&#8221;. This information would belong to the Solr document of the <em>Article 1</em>.</p>
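<p>The trick can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the in-memory <code>LABELS</code> and <code>ARTICLE_LINKS</code> structures, the example.org URIs, and the naive <code>matches</code> function that stands in for a real Solr full-text match.</p>

```python
# Hypothetical in-memory data: URIs, predicate names and labels are made up.
LABELS = {"http://example.org/bob": "Bob Carron"}

ARTICLE_LINKS = [
    # (subject, predicate, object URI)
    ("http://example.org/article-1", "author", "http://example.org/bob"),
]


def add_object_labels(doc, links, labels):
    """Copy the label of every referenced resource into the referring document."""
    doc.setdefault("object_property", [])
    doc.setdefault("object_label", [])
    for subject, predicate, target in links:
        if subject == doc["uri"] and target in labels:
            doc["object_property"].append(predicate)
            doc["object_label"].append(labels[target])
    return doc


def matches(doc, query):
    """Naive stand-in for a full-text match over the text-indexed fields."""
    haystack = doc.get("text", []) + doc.get("object_label", [])
    return any(query.lower() in value.lower() for value in haystack)


article = {"uri": "http://example.org/article-1", "text": ["RDF and Solr"]}
before = matches(article, "Bob Carron")   # the article only knows Bob's URI
add_object_labels(article, ARTICLE_LINKS, LABELS)
after = matches(article, "Bob Carron")    # the author's label is now indexed
```

<p>The cost of this denormalization is that the article's Solr document has to be rebuilt whenever the label of a referenced resource changes.</p>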
<h3>Full Text Search Prototype</h3>
<p>Let&#8217;s take a look at the prototype running system (see screen capture below).</p>
<p style="text-align: center;"><a href="http://fgiasson.com/blog/wp-content/uploads/2009/04/search.gif"><img class="size-medium wp-image-933" title="search" src="http://fgiasson.com/blog/wp-content/uploads/2009/04/search-300x210.gif" alt="" width="300" height="210" /></a></p>
<p>The dataset loaded in this prototype is <a href="http://www.mkbergman.com/?page_id=325">Mike&#8217;s Sweet Tools</a>. As you notice in the prototype screen, many things can be done with the simple Solr schema we published above. Let&#8217;s start with a search for the word &#8220;test&#8221;. First, we are getting a resultset of 17 things that have the &#8220;test&#8221; word in any of their text-indexed fields.</p>
<p>What is interesting about that list is the additional information we now have for each of these results, which comes from the RDF description of these things and from the ontologies that have been used to describe them.</p>
<p>For example, if we take a look at Result #4, we see that the word &#8220;test&#8221; has been found in the <strong><em>description</em></strong> of the <strong><em>Ontology project </em></strong>for the &#8220;TONES  Ontology Repository&#8221; record. Isn&#8217;t that precision far more useful than saying: the word &#8220;test&#8221; has been found in &#8220;this webpage&#8221;? I&#8217;ll let you think about it.</p>
<p>Also, if we take a look at Result #1, we know that the word &#8220;test&#8221; has been found in the <strong><em>homepage</em></strong> of the <strong><em>Data Converter Project</em></strong> for the &#8220;Talis Semantic Converter&#8221; record.</p>
<p>Additionally, by leveraging this Solr index, we can do efficient aggregates on the types of the things returned in the resultset for further filtering. So, in the section &#8220;Filter by kinds&#8221; we know what kinds of things are returned for the query &#8220;test&#8221; against this dataset.</p>
<p>Finally, we can use the drop-down box at the right to do a new search (see screenshot), based on the specific kind of things indexed in the system. For example, I might want to run a new search, only for &#8220;Data specification projects&#8221;, with the keyword &#8220;rdf&#8221;. I already know from the user interface that there are 59 such projects.</p>
<p>All this information comes from the Solr index at query time, and basically for free by virtue of how we set up the system. Everything is dynamically aggregated and displayed to the user.</p>
<p>However, there are a few things that you won&#8217;t notice here that are used:  1) SPARQL queries to the triple store to get some more information to display on that page; 2) the use of inference (more about it below), and; 3) the leveraging of the ontologies descriptions.</p>
<p>In any case, on one of SD&#8217;s test datasets of about 3 million resources, such a page is generated within a few hundred milliseconds: resultset, aggregates, inference and description of the things displayed on that page. These results were obtained on a small Amazon EC2 server instance that costs 10 cents per hour. How&#8217;s that for performance?!</p>
<h3>Aggregates and Filtering on Properties and Types</h3>
<p>But we don&#8217;t want to merely do full-text search on RDF data. We also want to do aggregates (how many records have this type, or this property, etc.) and filtering, at query time, in a couple of milliseconds. We already had a look at these two functions in the context of a full-text search. Now let&#8217;s see them in action in a prototype dataset browsing tool that uses the same Sweet Tools dataset.</p>
<p>In a few milliseconds, we get the list of the different kinds of things that are indexed in a given dataset. We know what the types are, and what the count is for each of these types. So, the ontologies drive the taxonomic display of the list of things indexed in the dataset, and Solr drives the aggregation counts for each of these types of things.</p>
<p>Additionally, the ontologies and the <a href="http://virtuoso.openlinksw.com/wiki/main/Main/">Virtuoso</a> inference rules engine are used to make the count, by inference. If we take the example of the type &#8220;RDF project&#8221;, we know there are 49 such projects. However, not all these projects are explicitly typed with the &#8220;RDF project&#8221; type. In fact, 7 of these &#8220;RDF project&#8221; records are typed &#8220;RDF editor project&#8221; and 6 are typed &#8220;RDF generator project&#8221;.</p>
<p>This is where inference can play an important role: an article is a document. If I browse documents, I want to include articles as well. This &#8220;broad context retrieval&#8221; is driven by the description of the ontologies, and by inference; this is the same thing for these projects; and this is the same thing for everything else that is stored as structured RDF and characterized by an ontology.</p>
<p style="text-align: center;"><a href="http://fgiasson.com/blog/wp-content/uploads/2009/04/browse_tree.gif"><img class="size-medium wp-image-934" title="browse_tree" src="http://fgiasson.com/blog/wp-content/uploads/2009/04/browse_tree-131x300.gif" alt="" width="131" height="300" /></a></p>
<p>The screenshot above shows how these inferences and their nestings could present themselves in a user interface.</p>
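<p>As a rough illustration of how such inferred counts can be computed, the following Python sketch walks a sub-class hierarchy to derive the values of the <em>inferred_type</em> field and then counts records per type. The <code>SUBCLASS_OF</code> mapping and the three sample records are made up; in the system described in this post, the inference itself is delegated to Virtuoso.</p>

```python
from collections import Counter

# Hypothetical class hierarchy: child -> direct parent classes.
SUBCLASS_OF = {
    "RDF editor project": ["RDF project"],
    "RDF generator project": ["RDF project"],
    "RDF project": ["Project"],
}


def inferred_types(explicit_types, subclass_of):
    """Return every super-class reachable from the explicit types."""
    seen = set()
    stack = list(explicit_types)
    while stack:
        for parent in subclass_of.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen - set(explicit_types)


def count_by_type(records, subclass_of):
    """Count records per type, explicit and inferred types together."""
    counts = Counter()
    for types in records:
        counts.update(set(types) | inferred_types(types, subclass_of))
    return counts


# Each record is represented only by its list of explicit types.
records = [["RDF editor project"], ["RDF generator project"], ["RDF project"]]
counts = count_by_type(records, SUBCLASS_OF)
```

<p>Here the count for &#8220;RDF project&#8221; covers the editor and generator projects even though neither is explicitly typed that way, which is the behavior described above.</p>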
<p>Once the user clicks on one of these types, they start browsing all things of that type. In the next screenshot, Solr is used to add filters based on the attributes used to describe these things.</p>
<p style="text-align: center;"><a href="http://fgiasson.com/blog/wp-content/uploads/2009/04/browse_properties_filter.gif"><img class="size-medium wp-image-935" title="browse_properties_filter" src="http://fgiasson.com/blog/wp-content/uploads/2009/04/browse_properties_filter-300x185.gif" alt="" width="300" height="185" /></a></p>
<p>In some cases, I may want to see all the Projects that have a review. To do so, I would simply add this filter criterion on the browsing page and display the &#8220;Projects&#8221; that have a &#8220;review&#8221;. And thanks to Solr, I already know how many such Projects have reviews before even taking a look at them.</p>
<p>Note, then, on this screenshot that the filters and counts come from Solr.  The list of the actual items returned in the resultset comes from a SPARQL query, and the name of the types and properties (and their descriptions) come from the description of the ontologies used.</p>
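<p>A request of this kind could look like the following Python sketch, which only builds the query URL. The <code>q</code>, <code>fq</code>, <code>rows</code>, <code>facet</code>, <code>facet.field</code> and <code>facet.mincount</code> parameters are standard Solr request parameters; the base URL, the function name and the example.org URIs are assumptions for illustration.</p>

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_browse_query(type_uri, required_property=None,
                       base_url="http://localhost:8983/solr/select"):
    """Build a Solr request that returns only filter counts for one type."""
    params = [
        ("q", "*:*"),
        # Match the type whether it is explicit or inferred.
        ("fq", 'type:"%s" OR inferred_type:"%s"' % (type_uri, type_uri)),
        # The items themselves come from SPARQL, so ask Solr for counts only.
        ("rows", "0"),
        ("facet", "true"),
        ("facet.field", "property"),
        ("facet.mincount", "1"),
    ]
    if required_property is not None:
        # Keep only the records described with this property.
        params.append(("fq", 'property:"%s"' % required_property))
    return base_url + "?" + urlencode(params)


url = build_browse_query("http://example.org/Project",
                         required_property="http://example.org/review")
parsed = parse_qs(urlsplit(url).query)
```

<p>Each <code>fq</code> narrows the set of records, and the facet counts on <em>property</em> are computed on what remains, which is how the interface can show the counts before the user applies a filter.</p>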
<p>This is what all this stuff is about: creating a symbiotic environment where all these wonderful systems live together to do the effective management of the structured data.</p>
<h3>Populating the Solr Index</h3>
<p>Now that we know how to use Solr to perform full-text searches, and the aggregating and filtering of structured data, one question still remains: how do we populate this index? As stated above, the goal is to manage all the structured data of the system using a triple store and ontologies, and then to use this triple store to populate the Solr index.</p>
<p>Structured Dynamics uses the Virtuoso Open Source as the triple store to populate this index for multiple reasons. One of the main ones is for its performance and its capability to do efficient basic inference. The goal is to send the proper SPARQL queries to get the structured data that we will index in the Solr schema index that we talked about above. Once this is done, all the things that I talked about in this blog post become possible, and efficient.</p>
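<p>The exact queries are not given in this post, so the following Python sketch is only a guess at what such a query could look like: it fetches every property and value of one resource, plus the label of any referenced resource. The use of <code>rdfs:label</code> as the label property, the graph URI and the function names are all assumptions.</p>

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def check_uri(uri):
    """Reject values that could break out of the <...> delimiters."""
    if any(ch in uri for ch in '<>"{}|^`\\') or any(ch.isspace() for ch in uri):
        raise ValueError("not a safe URI: %r" % uri)
    return uri


def resource_query(resource_uri, graph_uri):
    """Build a SPARQL query for one resource and the labels it references."""
    return (
        "SELECT ?p ?o ?label\n"
        "FROM <%s>\n"
        "WHERE {\n"
        "  <%s> ?p ?o .\n"
        "  OPTIONAL { ?o <%s> ?label . }\n"
        "}" % (check_uri(graph_uri), check_uri(resource_uri), RDFS_LABEL)
    )


# Made-up resource and graph URIs.
query = resource_query("http://example.org/article-1",
                       "http://example.org/dataset/")
```

<p>Rows with a bound <code>?label</code> would feed the <em>object_property</em> and <em>object_label</em> fields, and rows with a literal <code>?o</code> would feed <em>property</em> and <em>text</em>.</p>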
<h3>Syncing the Index</h3>
<p>However, in such a setup, we have to keep one thing in mind: each time the triple store is updated (a resource is created, deleted or updated), we have to sync the Solr index according to these modifications.</p>
<p>What we have to do is detect any change in the triple store and reflect it in the Solr index. To do so, we re-create the entire Solr document (the resource that changed in the triple store) using the &lt;add /&gt; operation.</p>
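<p>For reference, here is a small Python sketch that serializes one document into a Solr XML &lt;add /&gt; message, with one <code>field</code> element per value. The function name and the sample document are hypothetical, and the sketch assumes that <em>uri</em> is declared as the unique key of the schema so that re-adding a document replaces the previous version.</p>

```python
import xml.etree.ElementTree as ET


def to_add_message(doc):
    """Serialize one document as a Solr XML add message."""
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        # Multi-valued fields are emitted as repeated field elements.
        values = value if isinstance(value, list) else [value]
        for v in values:
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = v
    return ET.tostring(add, encoding="unicode")


# Made-up document, rebuilt in full from the triple store after a change.
message = to_add_message({
    "uri": "http://example.org/article-1",
    "type": ["http://example.org/Article"],
    "object_property": ["author"],
    "object_label": ["Bob Carron"],
})
```

<p>The resulting message would then be posted to the Solr update handler and followed by a commit.</p>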
<p>This design raises an issue with using Solr: we cannot simply modify one field of a record. We have to re-index the entire description of the document even if we want to modify a single field. This is a limitation of Solr that is currently <a href="http://issues.apache.org/jira/browse/SOLR-139">addressed in this new feature proposal</a>, but it is not yet ready for prime time.</p>
<p>Another thing to consider here is to properly sync the Solr index with any ontology changes (at the level of the class description) if you are using the inference feature. For example, assume you have an ontology that says that class A is a sub-class-of class B. Then, assume the ontology is refined to say that class A is now a sub-class-of class C, which itself is a sub-class-of class B. To keep the Solr index synced with the triple store, you will have to perform all modifications that affect all the records of these types. This means that the synchronization doesn&#8217;t only occur at the level of the description of a record; but also at the level of the changes in the ontologies used to describe those records.</p>
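<p>The class A, B and C example above can be worked through with a short Python sketch that compares the super-classes of each class before and after the ontology change. The dictionaries and function names are hypothetical; the point is only to show which classes end up with a different set of inferred types, and therefore which records need to be re-indexed.</p>

```python
def ancestors(cls, subclass_of):
    """Return every super-class reachable from cls."""
    seen, stack = set(), [cls]
    while stack:
        for parent in subclass_of.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def classes_to_reindex(old, new):
    """Classes whose set of super-classes differs between two hierarchies."""
    classes = set(old) | set(new)
    return {c for c in classes if ancestors(c, old) != ancestors(c, new)}


# Before: A is a sub-class of B.
old = {"A": ["B"]}
# After: A is a sub-class of C, which is itself a sub-class of B.
new = {"A": ["C"], "C": ["B"]}

changed = classes_to_reindex(old, new)
```

<p>In this example, every record typed A gains C as an inferred type and every record typed C gains B, so the records of both classes have to be sent to Solr again.</p>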
<h3>Conclusion</h3>
<p>One of the main things to keep in mind here is that now, when we develop Web applications, we are not necessarily talking about a single software application, but about a group of software applications that compose an architecture to deliver one or more services. At the center of any such architecture is <em>Data</em>.</p>
<p>Describing, managing, leveraging and publishing this data is at the center of any Web service. It is why it is so important to have the right flexible data model (RDF), with the right flexible query language (SPARQL), and the right data management system (triple store) in place. From there, you can use the right tools to make it available on the Web to your users.</p>
<p>The right data management system is what should be used to feed any other specific systems that compose the architecture of a Web service. This is what we demonstrated with Solr; but it is certainly not limited to it.</p>
]]></content:encoded>
			<wfw:commentRss>http://fgiasson.com/blog/index.php/2009/04/29/rdf-aggregates-and-full-text-search-on-steroids-with-solr/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
	</channel>
</rss>
