Talk Digger and Ping the Semantic Web became ZitGist

I said that I would write about what has been happening with Talk Digger and Ping the Semantic Web over the last month, so I am now taking some time to start telling the story.

In September Kingsley Idehen, CEO of OpenLink Software Inc., contacted me to talk about my projects, the database management system developed by OpenLink (called Virtuoso) and the Semantic Web.

Our talks led us in a direction I had not anticipated: we started to talk about creating a company that would develop both the Talk Digger and Ping the Semantic Web projects. Until then, these two projects were prototypes I was developing to test my ideas, to help the adoption of the semantic web, and to learn.

Creating a company in partnership with OpenLink would give me the resources to develop these two projects professionally: the time, the computer infrastructure, and the human resources to develop, extend, refine, and enhance these two services. All that for the benefit of my users: to enhance their experience with the systems.

After one month of discussion we created a company called ZitGist (pronounced: Zeitgist) that would own and develop both Talk Digger and Ping the Semantic Web. The legal entity is now created, but much work has to be done in the coming weeks (releasing the website and logo, publishing the official press release, etc.).

Both OpenLink Software Inc. and I are members of ZitGist. However, I didn’t close the deal only to have financial resources to develop my projects. In fact, a big part of OpenLink’s investment in the project is its Virtuoso DBMS. This database system will replace the one currently used in both projects (MySQL) and will increase their capabilities in many ways. I will write about the integration of Virtuoso in both systems later, but I can guarantee you that the decision fits the mission I set for myself more than a month ago:

This vision is driven by a personal goal: making the semantic web a reality. This is ambitious and probably arrogant, I know. “Who dares wins,” as the SAS motto says. That is what I will do: dare.

Do I have a chance of reaching my goal? I hope so, but I have no idea. The only thing I know is that it will become a reality only if everybody does a little something in that direction; here are the little things I will try to do:

  • Make Talk Digger results computer processable
  • Develop semantic web applications that will interact with the Talk Digger system
  • Write about the subject in such a way that any Internet user can understand
  • Educate people about this future reality through writing and oral presentations

This vision has driven my last year, and here is where I am. The implementation of Virtuoso in both Talk Digger and Ping the Semantic Web, the creation of ZitGist, and my partnership with OpenLink were all undertaken in accordance with that vision.

In the coming days I’ll write more about ZitGist, the new vision for Talk Digger and Ping the Semantic Web, the deployment of Virtuoso, and the new possibilities (features) it will enable.


Talk Digger and Internet Explorer 7.0

Yesterday I installed the new version of Internet Explorer (7.0) and started to test it. I found some user-interface bugs in Talk Digger, probably caused by all the changes Microsoft made in the way the browser handles DOM documents, JavaScript, etc.

It is always the same thing when a new version of a browser is released: you have to check that everything still works fine on your Web site; if it doesn’t, you have to fix it, and in the worst case write custom code to handle the special cases of that specific web browser.

If you find any other problems, errors or glitches, please send a bug report via the Bug Report page.


Talk Digger now supports characters from any language


Recently, Talk Digger was featured by a big Japanese blog named 100shiki and by Internet.com Japan. This brought a lot of new users into the system. However, most of these users were not creating profiles or searching with English words; most of them were interacting with the system in Japanese.

 

Talk Digger’s demography

In fact, many non-Western users are on the system (around 50%): Japanese, Chinese, Middle Eastern, Taiwanese, Russian, etc.

I had to make sure that they could interact with Talk Digger in their own language without being frustrated by bugs related to their language’s characters.


Handling the UTF-8 charset properly

I have done what I should have done before: made sure that all the characters manipulated by the system are encoded in UTF-8. What does that mean? That all the underlying systems use the UTF-8 charset by default, that all the functions I developed manipulate UTF-8, that all the URLs I play with (Ajax) are also encoded in UTF-8, that all the data in the database is in UTF-8, and so on. I would say that 90% of the system was right, but the remaining 10% was frustrating the user experience.

So I took the bull by the horns and I fixed everything.
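As a sketch of what “everything in UTF-8” means at the code level (in Python rather than Talk Digger’s actual code, with illustrative helper names): text stays Unicode internally, and the UTF-8 boundary is made explicit wherever data crosses an edge such as an Ajax URL or the database.

```python
# Illustrative sketch, not Talk Digger's real code: make the UTF-8
# boundaries explicit at the edges (URLs, storage).
from urllib.parse import quote, unquote

def encode_query_term(term: str) -> str:
    """Percent-encode a search term as UTF-8 for use in an Ajax URL."""
    return quote(term, safe="")

def decode_query_term(raw: str) -> str:
    """Decode a percent-encoded UTF-8 query term back to text."""
    return unquote(raw)

# A Japanese search term survives the round trip intact:
term = "会話"  # "conversation"
assert decode_query_term(encode_query_term(term)) == term

# Stored bytes are declared as UTF-8 too, so they decode back losslessly:
stored = term.encode("utf-8")
assert stored.decode("utf-8") == term
```

The point of the sketch is that every hop declares its encoding; the bugs come from any single hop that silently assumes a different charset.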

Now you should be able to write anything, anywhere, in any language, and the system should support it without any problems.

You should be able to use a non-alphanumeric username, write your password in Chinese, search conversations, comments, etc. in Japanese or Russian, and so on.

 

The next step

Seeing that about 50% of Talk Digger users are non-Western told me that I had to do something about it. That was the first step; the next step is to create a multi-language version of the service. That way, even a user who doesn’t speak English will be able to interact with the interface in his own language.

 

Bugs

Please report any bugs you encounter related to this issue as soon as possible. Everything should work just fine, but you never know.


Implementing the SIOC v1.08 ontology into Talk Digger

 

Many months ago I chose to export Talk Digger’s entire dataset (and the relations within that data) using RDF. At that time I had to choose the ontologies that would best make the relationships (semantics) between Talk Digger content explicit. This is why I chose the FOAF and SIOC ontologies: I needed to make explicit the relationships between Talk Digger users (FOAF), the relationships between conversations (SIOC), and finally the relationships between users and conversations (SIOC and FOAF).

This document is about the implementation of version 1.08 of the SIOC ontology, about its relation to FOAF documents, and about the semantic web as a whole.

 

New version of the SIOC ontology: v1.08

To create a good ontology you have to proceed by iteration: refining the ontology through testing and peer review.

Implementing an existing ontology in a system (such as Talk Digger) follows the same process: generating an RDF file according to the ontology and then trying to figure out how to link everything together (defining URI classes, defining resources, linking to resources, etc.) to optimize the graph — that is, optimizing the relations between the resources so they carry as much meaning, and are as easy to query, as possible.

This is what has happened since the last time I implemented the SIOC ontology into Talk Digger (about 4 months ago). Many changes have been made to the ontology since then, which is why I am re-implementing it into the system and refreshing this documentation.

 

Mapping SIOC classes and properties to Talk Digger functionalities

The first step is to map the SIOC ontology terms to the entities of the Talk Digger web site (functionalities, concepts, etc.). The schemas below make explicit the mapping I have done between the ontology and Talk Digger. On the left you have the SIOC ontology classes and properties (I only put in the properties that create relations between classes; properties like sioc:topic, sioc:description, etc. are left off the schemas to make them clearer). On the right you have the Talk Digger system. In the middle you have the relations between the SIOC ontology classes and the Talk Digger entities.


[Click on the schemas for the Full View]

 

Description of the schemas

  • The sioc:Site instance is Talk Digger’s web site (talkdigger.com)
  • A sioc:Forum is a Talk Digger conversation page; I consider a conversation page to be a forum. Each time a new URL is tracked by Talk Digger, a new “forum” is created. Forums are interlinked: if URLs A and B are tracked by the system and the web document at URL B links to URL A, we will have: [sioc:Forum(A) — sioc:parent_of –> sioc:Forum(B)] AND [sioc:Forum(B) — sioc:has_parent –> sioc:Forum(A)]
  • A sioc:Post is a comment written by a Talk Digger user on a conversation page. So each time a user writes a comment, a sioc:Post is created in the sioc:Forum.
  • A sioc:Topic is a tag used to track a conversation. Each time a user starts tracking a conversation on Talk Digger, he has the possibility to tag it with some keywords. So each time a tag is used to describe a conversation, a sioc:Topic is created to describe the sioc:Forum and sioc:Post topics.
  • A sioc:User is a Talk Digger user, defined by his unique username. The personal description of the sioc:User is related (via the rdfs:seeAlso property) to its FOAF profile (archived in the Talk Digger system).
  • Each time a conversation page is created in the system, a related sioc:Usergroup is also created. Each time a user starts tracking a conversation using Talk Digger, he also subscribes to the related sioc:Usergroup. So: [sioc:User(A) — sioc:member_of –> sioc:Usergroup(conversation)]
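The mapping above can be sketched as plain (subject, property, object) triples. This is an illustration, not Talk Digger’s actual export code: the conversation URLs a.com and b.com are made up, while the SIOC namespace and property names are the real ones.

```python
# Illustrative triples for the mapping described above.
SIOC = "http://rdfs.org/sioc/ns#"
TD = "http://www.talkdigger.com/conversations/"

forum_a = TD + "a.com"
forum_b = TD + "b.com"
user = "http://www.talkdigger.com/users/fgiasson"

graph = {
    # b.com links to a.com, so forum A is the parent of forum B:
    (forum_a, SIOC + "parent_of", forum_b),
    (forum_b, SIOC + "has_parent", forum_a),
    # the user tracks conversation A, so he joins its usergroup:
    (user, SIOC + "member_of", forum_a + "#usergroup"),
    # a comment posted on A becomes a sioc:Post contained in that forum:
    (forum_a + "#comment-1", SIOC + "has_container", forum_a),
}

assert (forum_b, SIOC + "has_parent", forum_a) in graph
```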

 

Relations between conversations

Two sioc:Forum instances can be linked together if URLs A and B are tracked by Talk Digger and the web document at URL B links to URL A.

But what happens if URL A links to URL B too?


There is a circular loop in the model: both sioc:Forum instances are children and parents of each other.

In the context of Talk Digger, this tells us that A is part of the conversation started by B and that B is also part of the conversation started by A. We could probably infer that A and B belong to a set, and that this set is the conversation.
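One way to make that inference concrete — a sketch of the idea, not part of SIOC itself — is to collapse forums that are linked in both directions into a single conversation set, with hypothetical forum names standing in for real URLs:

```python
# (parent, child) pairs; "A", "B", "C" are hypothetical forums.
parent_of = {("A", "B"), ("B", "A"), ("B", "C")}

def conversation_sets(parent_of):
    """Group forums that are mutually parent and child of each other."""
    mutual = {(a, b) for (a, b) in parent_of if (b, a) in parent_of}
    groups = {}
    for a, b in mutual:
        merged = groups.get(a, {a}) | groups.get(b, {b})
        for node in merged:
            groups[node] = merged
    return {frozenset(g) for g in groups.values()}

# A and B link to each other, so they form one conversation set;
# the one-way link B -> C does not pull C in.
assert conversation_sets(parent_of) == {frozenset({"A", "B"})}
```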

 

sioc:reply_of and sioc:has_reply to recreate the course of events

The sioc:reply_of and sioc:has_reply properties of the sioc:Post class are really great in the context of Talk Digger (and blog comments) because systems will be able to re-create the course of events, without needing dates, just by following the graph created by these relations.
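A small sketch of that idea, with hypothetical comment IDs and no timestamps at all: the posting order falls out of the reply graph alone.

```python
# child -> parent: each comment points at the post it is a reply of,
# mirroring sioc:reply_of (illustrative data).
reply_of = {
    "comment-3": "comment-2",
    "comment-2": "comment-1",
}

def thread_order(post, reply_of):
    """Walk the reply_of links back to the root post, oldest first."""
    chain = [post]
    while chain[-1] in reply_of:
        chain.append(reply_of[chain[-1]])
    return list(reversed(chain))

# The course of events is recovered without any dates:
assert thread_order("comment-3", reply_of) == ["comment-1", "comment-2", "comment-3"]
```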


Implementation using RDF/XML

Now that the mapping between the system (Talk Digger) and the ontology (SIOC) is done, what we have to do is implement the ontology using RDF serialized in XML. What does that mean? It means that Talk Digger will export its dataset in RDF/XML according to the SIOC (and FOAF) ontologies.

 

Implementation procedure using IsaViz

The tool I used to implement the SIOC and FOAF ontologies in Talk Digger is an RDF editor/visualization tool called IsaViz.

The procedure was simple:

  1. Generating the RDF/XML files (according to SIOC and FOAF) from Talk Digger’s content database.
  2. Importing the RDF/XML file in IsaViz.
  3. Visualizing and analyzing the resulting graphs.
  4. Checking all the relations between the resources and trying to figure out if it was possible to cut/add some of them to simplify/optimize the resulting graph.
  5. Checking all the anonymous nodes (bNodes) of the graph and checking if it was possible to relate them to an existing resource.
  6. Repeating these five steps until I was satisfied with the resulting graph.

 

Playing with URIs and xml:base

What is great is that I can distribute Talk Digger’s content from anywhere on the Web (at different URLs), and a crawler can download all these snippets of content (FOAF profiles, conversation content and relationships, etc.), aggregate them, and merge them into a unique RDF graph. That way it gets its hands on all the relations that exist in Talk Digger and can then query the RDF graph in useful and meaningful ways.

All that magic is made possible by the fact that we can define a different URI for a given RDF/XML document using the xml:base attribute. That way I can:

  • Host an RDF/XML document at the URL http://talkdigger.com/a
  • Define its xml:base with the URI “http://talkdigger.com/db/”
  • Host an RDF/XML document at the URL http://talkdigger.com/b
  • Also define its xml:base with the URI “http://talkdigger.com/db/”

Then if a crawler downloads both RDF documents “a” and “b”, it can merge them to recreate the single RDF document defined at “http://talkdigger.com/db/”. For example, this merged RDF document could be the graph of all relations defined in Talk Digger.
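A sketch of the merge logic with toy data: because both documents declare the same xml:base, their relative URIs resolve to the same absolute URIs, and the crawler can simply take the union of the resolved triples. The triples below are made up for illustration.

```python
from urllib.parse import urljoin

BASE = "http://talkdigger.com/db/"  # the shared xml:base

# Two documents hosted at different URLs, each using relative URIs:
doc_a = [("users/fgiasson", "sioc:member_of", "conversations/x#usergroup")]
doc_b = [("conversations/x#usergroup", "sioc:usergroup_of",
          "conversations/x#conversation")]

def resolve(doc, base):
    """Resolve relative subject/object URIs against the xml:base."""
    return {(urljoin(base, s), p, urljoin(base, o)) for (s, p, o) in doc}

# Merging is just a set union once everything is resolved to BASE:
merged = resolve(doc_a, BASE) | resolve(doc_b, BASE)

assert ("http://talkdigger.com/db/users/fgiasson",
        "sioc:member_of",
        "http://talkdigger.com/db/conversations/x#usergroup") in merged
```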

 

Talk Digger’s URI classes

I use the term “URI class” for the part of a URI that is common to many URI “instances”, and “URI instance” for a URI that refers to a particular resource.

For example, the “URI class” of Talk Digger users is:

http://www.talkdigger.com/users/

And an “instance” of that “URI class” is a URI that describes a particular Talk Digger user:

http://www.talkdigger.com/users/fgiasson

In this example, the “instance” refers to a resource that is the subscribed Talk Digger user called “fgiasson”.

Here is the list of “URI classes” defined in Talk Digger:

  • URI class referring to a conversation container (acts as a container for the components of a conversation)

http://www.talkdigger.com/conversations/[url]

  • URI class referring to a conversation

http://www.talkdigger.com/conversations/[url]#conversation

  • URI class referring to a comment in a conversation

http://www.talkdigger.com/conversations/[url]#comment-x

  • URI class referring to a usergroup (a group of users tracking that conversation)

http://www.talkdigger.com/conversations/[url]#usergroup

  • URI class referring to a subscribed user

http://www.talkdigger.com/users/[username]
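The URI classes above can be sketched as simple templates from which instances are minted. The helper below is hypothetical — an illustration of the class/instance distinction, not Talk Digger’s actual code:

```python
# Hypothetical helper: URI classes as templates, instances minted by
# filling in the variable part.
URI_CLASSES = {
    "user":         "http://www.talkdigger.com/users/{username}",
    "container":    "http://www.talkdigger.com/conversations/{url}",
    "conversation": "http://www.talkdigger.com/conversations/{url}#conversation",
    "comment":      "http://www.talkdigger.com/conversations/{url}#comment-{n}",
    "usergroup":    "http://www.talkdigger.com/conversations/{url}#usergroup",
}

def uri_instance(uri_class, **parts):
    """Mint a URI instance from its URI class."""
    return URI_CLASSES[uri_class].format(**parts)

assert uri_instance("user", username="fgiasson") == \
       "http://www.talkdigger.com/users/fgiasson"
assert uri_instance("comment", url="grazr.com", n=2) == \
       "http://www.talkdigger.com/conversations/grazr.com#comment-2"
```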

 

Visualizing relationship between Talk Digger users and conversations

Now it is time to visualize what is going on. We will import and merge some SIOC and FOAF documents into IsaViz, directly from talkdigger.com.

The example will be performed using two SIOC documents and one FOAF document.

 

Step 1

The first step is to take a conversation tracked by Talk Digger and visualize it in IsaViz.

  1. Open IsaViz
  2. Select the “IsaViz RDF Editor”, then click on the menu [File -> Import -> Replace -> RDF/XML from URL…]
  3. Paste this URL into the box that appears: http://www.talkdigger.com/sioc/grazr.com
  4. Press Enter

Now you can visualize the relationships of the conversation about Grazr.

Pay special attention to the following resources:

  • http://www.talkdigger.com/conversations/grazr.com#conversation
  • http://www.talkdigger.com/conversations/grazr.com#usergroup
  • http://www.talkdigger.com/users/fgiasson
  • http://www.talkdigger.com/foaf/fgiasson

Check how these resources are related to other resources (which properties describe them).
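Outside a visual tool, “checking how a resource is related” amounts to looking up every triple with that resource as subject. A sketch with toy triples standing in for the real merged export (the property names are abbreviated with prefixes):

```python
# Toy triples standing in for the merged Talk Digger export:
triples = {
    ("http://www.talkdigger.com/users/fgiasson", "sioc:member_of",
     "http://www.talkdigger.com/conversations/grazr.com#usergroup"),
    ("http://www.talkdigger.com/users/fgiasson", "rdfs:seeAlso",
     "http://www.talkdigger.com/foaf/fgiasson"),
}

def describe(resource, graph):
    """All (property, value) pairs where `resource` is the subject."""
    return sorted((p, o) for (s, p, o) in graph if s == resource)

props = describe("http://www.talkdigger.com/users/fgiasson", triples)
assert [p for p, _ in props] == ["rdfs:seeAlso", "sioc:member_of"]
```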

 

Step 2

Now it is time to add some things to that graph. We will merge in the SIOC document of another conversation that is “talking about” this one.

  1. Select the “IsaViz RDF Editor”, then click on the menu [File -> Import -> Merge -> RDF/XML from URL…]
  2. Paste this URL into the box that appears: http://www.talkdigger.com/sioc/blog.grazr.com
  3. Press Enter

Now you can visualize the relationships between two conversations: Grazr and Grazr’s blog.

Pay special attention to the following resources:

  • http://www.talkdigger.com/conversations/grazr.com#conversation
  • http://www.talkdigger.com/conversations/blog.grazr.com#conversation

Check how these two resources are related (the “blog.grazr.com” conversation is talking about the “grazr.com” conversation, so “blog.grazr.com” has a “parent” relation with “grazr.com”).

 

Step 3

Now it is time to merge a FOAF document into that graph. That way, we will have more information about the user (fgiasson) who is interacting in these conversations.

  1. Select the “IsaViz RDF Editor”, then click on the menu [File -> Import -> Merge -> RDF/XML from URL…]
  2. Paste this URL into the box that appears: http://www.talkdigger.com/foaf/fgiasson
  3. Press Enter

Pay special attention to the following resources:

  • http://www.talkdigger.com/users/fgiasson
  • http://www.talkdigger.com/foaf/fgiasson

Check how a User (a person defined by his FOAF profile) is related to his User Account (a user account on Talk Digger defined by a SIOC document).

 

Extending this method to any Talk Digger conversation

Above I explained how to visualize two conversations and a user profile using IsaViz. This method can be used to visualize any conversation known by Talk Digger.

You only have to follow the same steps as described above with other documents. At the bottom of any Talk Digger web page, you will see a “Semantic Web Ready” logo. To the right of this logo are icons linking to the RDF documents available from that page. Just click on them, copy the URL of the document, and import it into IsaViz.

 

The big picture

All this belongs to a bigger picture. A couple of years ago, the Semantic Web looked good on paper; now it is starting to look good on the Web.

As you can see in the schema below, RDF documents and the SIOC and FOAF ontologies are just some of the stones belonging to the Semantic Web. The schema below is not the Semantic Web; it is a small portion of it, an example of how it all works together: a sort of Semantic Web mashup.


As described in one of my recent blog posts, Semantic Radar for FireFox and the Semantic Web Services environment, an infrastructure supporting the ideas behind the Semantic Web is starting to emerge.

The implementation of the SIOC ontology in Talk Digger is only a small step. Another small step is the development of the Ping the Semantic Web web service, which aggregates and exports lists of RDF documents to other web services and software agents. Other steps are the development of RDF data exporters such as SIOC plug-ins for blogging systems, browser plug-ins like the Semantic Radar, etc.

 

The final word

In a recent discussion I had with Daniel Lemire, he wrote:

“Here is where we disagree:

“Everything is changing, and everything should explode… soon!”

I honestly do not see the Semantic Web being about to take off.”

Then I answered:

“So, will the semantic web explode in the next 5 years? My intuition tells me yes. Do I have a 6th sense like mothers? No. So, what the future reserve us? I hope it will be the semantic web (beware, I never said that we will be able to infer trusts, deploy search system with the power of Google, resolve the problem of the evolution of ontologies (versioning, etc), etc, etc, etc.) But I think that in 5 years from now, we will have enough data, knowledge, and services (that use that data) to say that we can do something useful (saving time, etc) so that we will be able to say: the semantic web is now living.”

I hope that I will be right and that Daniel will be wrong. I have the intuition that Daniel hopes the same thing.


New Talk Digger features: follow the activity of your social network

 

I just released two new features that strengthen the connection between a Talk Digger user and his friends. Users can now easily follow which conversations their friends are tracking and what they have to say about them.


If you would like to test these new features, just follow these five steps:

  1. Create a user account (if you haven’t already).
  2. Search for friends on Talk Digger. Good ways to find friends are to check who is tracking the conversations that interest you; to search for users who share your interests; and to check the friends of your friends by clicking on the “Friends” tab of a friend’s user page.
  3. Add them to your friends list.
  4. Check your profile page.
  5. Click on “Friends’ Conversations” or “Friends’ Comments” to see which conversations are tracked by your friends and what your friends have to say.

 

Below is a screenshot of the conversations tracked by my friends:


And below is a screenshot of the comments written by my friends on Talk Digger:


The theory behind these two new features

The theory is simple: you will probably be interested in the conversations your friends track, because you probably share the same interests in life.

 

Following conversations of interest in a social group

Now people can follow what a group of people are talking about: the conversations that interest them.

Personally, I am interested in the Internet, the semantic web, books, and writing. So I’ll search for people with the same interests, add them to my friends list, and then start to check which conversations these people are tracking and what they have to say about them.

That way, I have created a sort of “virtual” interest group, and I can check what this group is interested in.

 

The next step: creating an infrastructure to handle these groups of interest

The next step will probably be to create features to manage these interest groups. Talk Digger is not just about tracking conversations; it is also about linking people together, and an infrastructure to manage interest groups would be a natural add-on.
