What is a Dynamic Data Web Page? It is a shape-shifting data source: a source of data that changes its shape depending on the request made against it.
Shapes of the data source
The data source shapes the format of its output depending on what you need. If you are a human, you would like something you can read and understand, like an HTML web page. However, if you are a web service, you would probably like to get the data in a shape you can easily process, such as RDF, XML, JSON, etc.
It is as simple as that: a Dynamic Data Web Page is a web page that outputs data in different formats depending on what the requesting user wants.
There are many formats:
- HTML – Human readable
- RDF, XML, JSON – Machine readable
- Others could be easily implemented if needed.
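To make the "shape" idea concrete, here is a minimal sketch (the function and the format table are illustrative, not part of Virtuoso) of how a server might map the HTTP Accept header to one of these output formats:

```python
# A minimal sketch of "shape" selection: map the HTTP Accept header
# to an output format. The format table here is illustrative only.
FORMATS = {
    "text/html": "HTML",
    "application/rdf+xml": "RDF/XML",
    "application/json": "JSON",
    "application/xml": "XML",
}

def pick_shape(accept_header: str, default: str = "HTML") -> str:
    """Return the first format whose MIME type appears in the Accept header."""
    for mime in accept_header.split(","):
        mime = mime.split(";")[0].strip()  # drop quality values like ";q=0.9"
        if mime in FORMATS:
            return FORMATS[mime]
    return default

print(pick_shape("application/rdf+xml"))              # RDF/XML
print(pick_shape("text/html,application/xml;q=0.9"))  # HTML
```

A real server would also honor the q-values to rank the client's preferences; this sketch simply takes the first recognized type.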
A Dynamic Data Web Page consists of a Web Page and Data
A DDWP is two things:
- A Web Page: as we saw above, it is a way to present/publish the data of the source formatted in some way.
- Data: as we will see below, it is the source of the data.
That said, a DDWP is nothing more than a source of data published in various ways.
Dissection of a Dynamic Data Web Page
0. Creation of the data source. The preliminary step for the data source (a triple store) is to continually index RDF data sources. If we are talking about a generic service, it should aggregate RDF data from everywhere: the Web, specialized databases such as MusicBrainz, Wikipedia, etc. If it is a specialized system, such as the products catalogue of a company, it should constantly sync its triple store with the catalogue. This constant operation creates a valuable data source.
1. Creation of a SPARQL query. An end user wants information. This end user can be anything: a person, a developer, a web service, etc. This user will build a SPARQL query that returns results from the data source.
2. Saving the SPARQL query. The SPARQL query will then be saved on the web server of the service.
3. Assigning a URL to the SPARQL query. The web server will assign a URL to that saved SPARQL query. From there, anybody will be able to access the results of the query by dereferencing that URL.
4. Accessing the URL
4.a. Sending the HTTP query. In our example, a web service tries to get the results returned by the SPARQL query from the DDWP. To get them, it will send an HTTP request to the web server for that URL.
4.b. Doing content negotiation with the remote server. However, the web service wants an XML representation of the results, since that is the only format it understands. This request is made via content negotiation with the web server. This is where the shapes of the DDWP matter: depending on what the user wants (content negotiation), the results of the SPARQL query will be formatted in one of the possible shapes and sent back.
5. Generating the DDWP according to the content negotiated. The Dynamic Data Web Page will be generated by the web server depending on the content negotiation the two parties agreed on.
6. Sending results to the web service. Finally the results, formatted to meet the user’s needs, will be returned to the user.
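The client side of steps 4.a to 6 can be sketched as follows; the URL is a placeholder, and `urllib.request` stands in for whatever HTTP client the web service actually uses:

```python
import urllib.request

# Steps 4.a/4.b from the client side: ask the DDWP URL for RDF/XML
# via the Accept header. The URL below is a placeholder, not a real endpoint.
url = ("http://example.org/DAV/home/demo/Public/Queries/DataWeb/"
       "google_base_jobs_dataspace.isparql")
req = urllib.request.Request(url, headers={"Accept": "application/rdf+xml"})

# Calling urllib.request.urlopen(req) would perform steps 5 and 6:
# the server inspects Accept, generates the matching shape, and returns it.
print(req.get_header("Accept"))  # application/rdf+xml
```

The same URL with `Accept: text/html` would yield the human-readable page instead; nothing changes but the header.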
What does this mean?
This means that only the data matters. In fact, the only thing one needs now is to build a good data source (remember, the data source can be anything here, from a search engine database to the products catalogue of a company, or even the personal web page of a 14-year-old geek). Once the data source is well built, the rest follows.
From that data source, everything can be generated for each web page (URL). If the content requested is an HTML page, the data source can generate XML, run it through an XSLT skin template, and send back an HTML page: just like any other web page. However, from the same data source, a semantic web crawler could request the RDF/N3 data for the same URL. The DDWP would then send the RDF/N3 representation of that URL.
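As a toy illustration of the XML-plus-skin idea (Python's standard library has no XSLT processor, so a hand-written transform stands in for the XSLT template, and the element names are made up):

```python
import xml.etree.ElementTree as ET

# Stand-in for the XSLT step: walk SPARQL-result-like XML by hand and
# wrap it in an HTML "skin". Element names here are invented for the demo.
results_xml = """
<results>
  <result><title>Software Engineer</title></result>
  <result><title>Data Analyst</title></result>
</results>
"""

def skin_to_html(xml_text: str) -> str:
    """Apply a trivial HTML skin to the XML result set."""
    root = ET.fromstring(xml_text)
    items = "".join(
        f"<li>{r.findtext('title')}</li>" for r in root.findall("result")
    )
    return f"<html><body><ul>{items}</ul></body></html>"

print(skin_to_html(results_xml))
```

In a real deployment the same XML would instead be fed to an XSLT engine, and the crawler asking for RDF/N3 would bypass this step entirely.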
So from one data source, you can get its data the way you want.
From that point, a URL (or a web page, call it what you want) becomes a presentation web page, a web service, etc. All-in-one!
Everything is simpler with examples, so here we are. The whole concept of the Dynamic Data Web Page is made possible by Virtuoso. All the examples below use this database management system.
Okay, to illustrate the case, we will use this Google Base Jobs page as an example:
The triple store will fetch that Google Base Jobs page, convert it into RDF, and index the triples. This is the data we will try to access.
A user created a SPARQL query that requests all that data. The query looks like:
SELECT ?s ?p ?o
FROM <http://www.google.com/base/feeds/snippets/-/jobs?start-index=30&max-results=30&key=ABQIAAAA7VerLsOcLuBYXR7vZI2NjhTRERdeAiwZ9EeJWta3L_JZVS0bOBRIFbhTrQjhHE52fqjZvfabYYyn6A>
WHERE { ?s ?p ?o . }
The user will save the SPARQL query on the web server in the directory “/DAV/home/demo/Public/Queries/DataWeb/” with the file name “google_base_jobs_dataspace.isparql”
The web service will assign a URL to that file:
Now the user wants to see the results of the query he just built; he can see them simply by putting this URL into his web browser. An HTML web page will then be generated and displayed so that he can easily consult it.
This is a generic HTML page. But what about generating XML instead of HTML and then applying an XSLT skin template to generate the HTML for the user? Yeah, you just got another way to create traditional dynamic web pages.
Step #4.a / #4.b / #5 / #6
Now we want to show what happens when a web service requests results not in HTML but in something else, like RDF/XML.
To show you how it happens, we will use the OAT RDF Browser. This is a web service that gets RDF data from somewhere on the Web and displays it to users via a web interface.
This web service does exactly steps 4, 5 and 6: it sends an HTTP query for a URL, does content negotiation with the remote web server to get RDF data, downloads the RDF data sent by the web server, consumes it, and displays it to the user via the interface.
The result for our example is shown there. As you can see, from the same URL, the DDWP sends RDF/XML data instead of HTML. The web service then consumes it and displays the same information in a different way. How different? Well, click on the Yahoo! Maps tab and you will see: the same information displayed on a map showing where the jobs are in the United States.
The Dynamic Data Web Page is not a theory. It is a reality; it already exists in Virtuoso and can be used by anyone who cares about simplifying the exchange of data between their system and other systems. It is all about Web communication: instead of talking about languages (real world), we are talking about formats (web world).
9 thoughts on “Dynamic Data Web Page”
What’s your reason for inventing this new terminology? I mean the “data shape” notion. How is that different from the “data format” or even “resource content type”?
Do you know about the Accept/Content-Type HTTP header fields? They appear to be designed for exactly the reasons you mention.
P.S. The size of this comment area is proportional to the number of comments you get. Think about it 🙂
There is none: data shape is data format… the only thing is that I remembered how much I like the word “shape-shifter”, don’t ask me why please
And the content negotiation I am talking about is naturally what you are talking about: HTTP content negotiation! I am not re-inventing the wheel here
Nah, the main idea to keep in mind here is that everything starts from a single saved SPARQL query: a SPARQL query gets a URL, and that URL (the location of the saved SPARQL query on the web server) will return the results of the query in many formats (shapes). And the most fantastic part is that this already exists in Virtuoso
Thanks for your comment!
Quite interesting post about the decoupling of “data” and “representation”.
As you seem to know RDF and REST extensively, I have two questions:
Why use two levels of indirection, i.e. SPARQL AND an HTTP query, requiring users to learn a new language (SPARQL) and generating “machine-language-styled” URLs?
Why not encode the query IN THE URL with a simple GET? With this you would be able to aggregate the “query/GET”, the “update/PUT” and the “create/POST” within the same token…
Why not give a URL both to “raw” data, i.e. a URL with no extension, AND to “formatted” data, i.e. a URL appended with a “.” operator?
Thus, http://blabla.com/google can point to the noun “google”, http://…/google.html could point to the html version, and tutti quanti…
This would allow the client to choose the representation with the same explicit HTTP GET mechanism and not with some special, opaque and equivocal “Content Negotiation” mechanism.
[quote post=”773″]Why use two levels of indirection, i.e. SPARQL AND an HTTP query, requiring users to learn a new language (SPARQL) and generating “machine-language-styled” URLs?
Why not encode the query IN THE URL with a simple GET? With this you would be able to aggregate the “query/GET”, the “update/PUT” and the “create/POST” within the same token…[/quote]
Well yes. Take a look at that example. Is it what you are talking about?
One of the problems I can see with queries embedded into URLs is: what do we do if the QUERY changes? The URL will change too, and this is bad in my view. See this method as exactly the same as server-side programming, except that instead of using a programming language such as PHP, ASP, Ruby on Rails, etc., we use a SPARQL query to generate the data structure and then something else to present the data.
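To make the trade-off concrete, here is roughly what the two URL styles look like (the hostnames and paths are hypothetical):

```python
from urllib.parse import quote

# The commenter's suggestion: embed the SPARQL query in the URL itself.
# The endpoint URL is hypothetical.
query = "SELECT ?s ?p ?o WHERE { ?s ?p ?o . }"
embedded = "http://example.org/sparql?query=" + quote(query)

# The DDWP approach: the query lives server-side, so the URL stays
# stable even if the query is later edited.
saved = ("http://example.org/DAV/home/demo/Public/Queries/DataWeb/"
         "google_base_jobs_dataspace.isparql")

print(embedded)
print(saved)
```

The embedded form is self-describing but changes whenever the query changes; the saved form hides the query behind a stable, human-friendly URL.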
Also, embedding queries in URLs is not really what I call: cool URIs or nice URLs 🙂
[quote post=”773″]Why not give a URL both to “raw” data, i.e. a URL with no extension, AND to “formatted” data, i.e. a URL appended with a “.” operator?
Thus, http://blabla.com/google can point to the noun “google”, http://…/google.html could point to the html version, and tutti quanti
This would allow the client to choose the representation with the same explicit HTTP GET mechanism and not with some special, opaque and equivocal “Content Negotiation” mechanism.[/quote]
Well, you are right. In fact, it could be another way to get data. Content negotiation is preferred by the semantic web community, and personally I find it a much better way to get data in different formats. However, I agree that the other approach could be more convenient for some developers, and more intuitive to others. I will send the suggestion to the OpenLink team working on that stuff.
Thanks for this comment; I hope I have answered your question. Don’t hesitate to ask for clarifications or anything else.
A few important points re. Fred’s post:
1. You have one URL to a Web Page that is also a Data Source URI
2. You do not have to learn SPARQL per se. You simply put URLs (usual Web Page Location references) or URIs (Pointers to Web Data Sources; which may include Web Pages) into the RDF Browser
3. Joining data from del.icio.us and flickr and googlebase, and other data sources is trivial
4. Content negotiation is the correct way of getting data from web servers; the cost to the user is zero. The benefits are exponential (for instance, a traditional browser and the associated industry politics no longer determine your ability to improve how you experience the Web).
Thanks for your responses.
Actually, on the first point, I had not been very clear. For me the HTTP GET method IS ALREADY a query. Thus, I don’t understand the need to layer new SQL-styled methods on top of it.
My current lonely work is actually to work out a coherent URL semantics to bypass new method constructions, i.e. how to distinguish, for example, “jobs” the four-letter word, (the things that are) “jobs”, and (the thing that is called) “Jobs”.
Thus, I would note /cool for a thing called “Cool”, /#cool for “A cool”, /#cool/ for all the things that are cool, or /%cool for the word cool.
[quote post=”773″]Actually, on the first point, I had not been very clear. For me the HTTP GET method IS ALREADY a query. Thus, I don’t understand the need to layer new SQL-styled methods on top of it.[/quote]
Hmm, not sure I follow here: the idea of the SPARQL query is to get data from an RDF triple store. It is a way to get results from a source of data. The HTTP GET method only fetches that data from the web server, and the Accept: header handles content negotiation. Am I missing something in your idea?
[quote post=”773″]My current lonely work is actually to work out a coherent URL semantics to bypass new method constructions, i.e. how to distinguish, for example, “jobs” the four-letter word, (the things that are) “jobs”, and (the thing that is called) “Jobs”.
Thus, I would note /cool for a thing called “Cool”, /#cool for “A cool”, /#cool/ for all the things that are cool, or /%cool for the word cool.[/quote]
So if I understand correctly, you are developing a query language from the URI syntax? Then you parse it with your web server to build some server-side query to send to some data store?
There’s something like a conspiracy at hand! After lunch I was reading through Dojo’s “Summer of Code / Data Projects” and came across “Write a dojo.data datastore implementation that uses MySQL as a server-side database. Implement the datastore so that it can deal with loosely-typed, multi-valued, semi-structured, JSON-ish, web 2.0 data, and can store that data in a rigid ugly old relational database”.
(Less substantial, but spookier: Dudu mentioned your name this morning, and just now, reading Leo Sauermann’s blog as I wrote a reply to him, I found a link here; wheels within wheels!)