Soon I’ll add a SPARQL endpoint to the Ping the Semantic Web service.
What does it mean?
It means that anybody will be able to send SPARQL queries (SPARQL looks like the SQL query language but is used to query RDF graphs) to retrieve information from the RDF documents known by the web service. As soon as someone pings pingthesemanticweb.com with an RDF document’s URL, other people will be able to search it using the SPARQL endpoint.
How will it work?
Users will have access to a web interface where they will be able to write and send SPARQL queries to the triple store (this is the name given to the type of database system that stores RDF graphs).
For example, they will be able to send queries like:
PREFIX sioc: <http://rdfs.org/sioc/ns#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?s
WHERE { ?s rdf:type sioc:Post }
That query will return all the resources (things) in the triple store that have been described (typed) as a sioc:Post (a blog post, a forum post, etc.).
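To make the matching concrete, here is a minimal Python sketch of what such a query does, assuming a toy in-memory store. It says nothing about how the real triple store works internally: the `example.org` URIs and the `subjects_of_type` helper are invented for illustration.

```python
# Toy in-memory "triple store": each triple is (subject, predicate, object).
# Every example.org URI below is made up for illustration.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SIOC_POST = "http://rdfs.org/sioc/ns#Post"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"

triples = {
    ("http://example.org/blog/post-1", RDF_TYPE, SIOC_POST),
    ("http://example.org/forum/thread-7", RDF_TYPE, SIOC_POST),
    ("http://example.org/people/peter", RDF_TYPE, FOAF_PERSON),
}


def subjects_of_type(store, rdf_class):
    """Return every subject s for which (s, rdf:type, rdf_class) is in the store."""
    return sorted(s for (s, p, o) in store if p == RDF_TYPE and o == rdf_class)


# Equivalent in spirit to asking for every ?s typed as sioc:Post.
posts = subjects_of_type(triples, SIOC_POST)
print(posts)
```

Only the two resources typed as sioc:Post come back; the person is ignored because its type does not match the pattern.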
How to visualize the triple store?
Creating this SPARQL endpoint will be fairly easy. The overall architecture will remain the same, but we will add one new server: a SPARQL endpoint that gives access to an RDF triple store.
Here is one way to picture how the triple store works:
If we take a look at the diagrams, each RDF document is a graph in itself. An RDF graph is composed of relations between resources, each written as <subject, verb, object>. For example, a relation could be <peter, hair-color, brown>: the resource “Peter” has the property “hair-color” with the value “brown”, so Peter’s hair color is brown.
With the triple store, we have the possibility to merge two RDF graphs together. That way, we create a sort of meta-graph with all the relations between one graph and the other.
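As a rough sketch of this merge, two graphs can be modeled in Python as sets of triples and combined with a set union. This is a simplification: real RDF merging also has to deal with details such as blank nodes. Apart from the Peter example, all names here (`mary`, `knows`, `wrote`, `post-1`, and the `objects` helper) are hypothetical.

```python
# Two small RDF graphs modeled as sets of (subject, verb, object) triples.
graph_a = {
    ("peter", "hair-color", "brown"),
    ("peter", "knows", "mary"),
}
graph_b = {
    ("mary", "wrote", "post-1"),
    ("peter", "knows", "mary"),  # same statement as in graph_a
}

# Merging is a set union: identical triples collapse into one.
merged = graph_a | graph_b


def objects(store, subject, verb):
    """Return every object o for which (subject, verb, o) is in the store."""
    return sorted(o for (s, v, o) in store if s == subject and v == verb)


# Follow a path that only exists in the merged graph:
# whom does peter know, and what did those people write?
written_by_friends = sorted(
    post
    for friend in objects(merged, "peter", "knows")
    for post in objects(merged, friend, "wrote")
)
print(len(merged), written_by_friends)
```

Neither graph alone can answer the last question. The answer appears only once the shared resource `mary` links the two graphs together.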
This is where things get interesting.
Ping the Semantic Web’s graph will be created by merging the graphs of all the RDF documents it knows (via pinging).
That way, users will be able to search this sort of meta-graph of relationships between resources by querying it with SPARQL.
One could almost call it the semantic web in a nutshell.
Virtuoso to create the RDF triple store
I’ll use a database management system called Virtuoso to create this RDF triple store.
A first prototype version
Consider the first version of the triple store as a prototype. The RDF triple store feature of Virtuoso is relatively new. It is still in development, and some things have yet to be created (to enhance the functionality) or upgraded. It works well for a few hundred million triples (relations), but once we reach a billion triples, some queries may become unworkable. At that point, I may have to restrict the kinds of queries users can send to ensure that the system keeps working at its full potential.
In any case, the triple store and the SPARQL endpoint will “live” on another server, so the performance of the current pinging system will not be affected by the performance of the endpoint. They are two totally separate entities in our system.
Why a triple store with a SPARQL endpoint?
First of all: for research and education purposes. People will be able to query a system that aggregates RDF documents “from the wild”. Eventually, such an initiative could lead to the development of more interesting technologies (user interfaces, or anything else) that could be used by a broader range of people.
With this system in hand, one could also search the triple store to extract statistics on the RDF documents it knows, for research purposes.
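As one hedged illustration of the kind of statistics meant here, the Python sketch below counts how many resources of each type appear in a handful of triples, and how often each property is used. The sample data and the `doc1#post`-style identifiers are invented. Against the real endpoint, the same numbers would come from SPARQL queries rather than from a local list.

```python
from collections import Counter

RDF_TYPE = "rdf:type"

# Made-up sample of triples, as if harvested from three pinged documents.
triples = [
    ("doc1#post", RDF_TYPE, "sioc:Post"),
    ("doc2#post", RDF_TYPE, "sioc:Post"),
    ("doc3#me", RDF_TYPE, "foaf:Person"),
    ("doc3#me", "foaf:name", "Peter"),
]

# How many resources are declared with each type?
type_counts = Counter(o for (s, p, o) in triples if p == RDF_TYPE)

# How often is each property used overall?
property_counts = Counter(p for (s, p, o) in triples)

print(type_counts.most_common())
print(property_counts.most_common())
```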
It is also a way for OpenLink to debug, upgrade, and enhance Virtuoso, which will ultimately benefit everyone (since an open source version of Virtuoso is available).
Let me know if you have any thoughts about this new development of the Ping the Semantic Web service.