The current problem of the Web
The current problem of the Web is that most (virtually all) of the documents it holds are formatted for humans. For example, HTML is a markup language used to present information to people and to make documents easy for them to understand.
Why do I call this the current problem of the Web? Because these human-oriented documents are not easily processable by computers. The information is not formatted for them, so they cannot easily work out what a document is about: its subject, its meaning, its semantics.
A possible solution to that problem
One way to try to solve this problem is to annotate these human-oriented documents with computer-processable metadata. This is exactly the purpose of newer formats like RDF and OWL: to make any digital resource (a file, a photo, a video, anything digital) computer-processable.
Such a metadata document would describe the meaning of a digital document in a form that computers can easily understand. That way, software agents could read these descriptions, understand them, and even infer new facts and knowledge from them. This is the idea behind the Semantic Web.
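To make the idea of "inferring new facts" a little more concrete, here is a minimal sketch in Python. It is not RDF or OWL themselves, only the subject/predicate/object shape they are built on, and every name in it (the URLs, the "depicts" and "locatedIn" predicates, the function name) is a hypothetical example of mine.

```python
# A toy set of metadata statements in subject / predicate / object form.
# All identifiers below are made up for illustration.
Triple = tuple[str, str, str]

facts: set[Triple] = {
    ("http://example.org/photo.jpg", "depicts", "ex:EiffelTower"),
    ("ex:EiffelTower", "locatedIn", "ex:Paris"),
    ("ex:Paris", "locatedIn", "ex:France"),
}


def infer_located_in(triples: set[Triple]) -> set[Triple]:
    """Apply one rule until nothing new appears:
    (a locatedIn b) and (b locatedIn c)  =>  (a locatedIn c).
    """
    known = set(triples)
    while True:
        new = {
            (a, "locatedIn", c)
            for (a, p1, b) in known if p1 == "locatedIn"
            for (b2, p2, c) in known if p2 == "locatedIn" and b2 == b
        } - known
        if not new:
            return known
        known |= new


inferred = infer_located_in(facts)
# The agent now also "knows" a statement nobody wrote down explicitly.
print(("ex:EiffelTower", "locatedIn", "ex:France") in inferred)
```

A real software agent would rely on an RDF library and an OWL reasoner rather than a hand-written rule, but the principle is the same: explicit statements plus rules yield new statements.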
The possible problems with such annotated metadata
Remember the early days of the Web, when people put metadata in the headers of their HTML files? Remember when search bots used this information to return relevant results to users? Remember when search bots stopped using it, because people were only using it to trick the bots into sending visitors to their pages, even when the search queries had nothing to do with the content of those pages? That is exactly why people lost faith in metadata. It is also exactly why I have doubts about social tagging (but that is another story).
The problem with the early approach to annotating documents with metadata is that people could annotate their web documents with any metadata at all, whether or not it was related to the content. In the end, web publishers were not annotating their documents with information relevant to their content, but only with information that would bring traffic to their web sites.
You are probably thinking something like this: “Fred, you said that the Semantic Web formats (RDF, OWL, or any other) are simply a sort of metadata file that can be attached to current web documents to describe them, their meaning and their semantics. So don’t you think the result would be the same as with the HTML header metadata? That people would try to trick the Semantic Web search engines, crawlers and software agents?”
The solution: Semantic-Web-Of-Trust
Below is a short description of the Web of Trust as envisioned by Tim Berners-Lee, the father of the Web and the Semantic Web, written in 1997.
“In cases in which a high level of trust is needed for metadata, digitally signed metadata will allow the Web to include a “Web of trust”. The Web of trust will be a set of documents on the Web that are digitally signed with certain keys and contain statements about those keys and about other documents. Like the Web itself, the Web of trust will not need to have a specific structure, such as a tree or a matrix. Statements of trust can be added in such a way as to reflect actual trust exactly. People learn to trust through experience and through recommendation. We change our minds about who we trust and for what purposes. The Web of trust must allow us to express this.”
At that time, Mr. Berners-Lee saw digital signatures as a way to establish who the author of a metadata annotation is, and so to add trust in that metadata. Some people might also think of PGP’s [PKI] Web of trust system.
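As a rough illustration of what "signed metadata" buys us, here is a small Python sketch. Be warned that it cheats: the Python standard library has no public-key signatures, so I use a keyed hash (HMAC) as a stand-in. With HMAC the signer and the verifier share the same secret key, which is not how PGP-style signatures work. The key, the metadata fields and the function names are all hypothetical.

```python
import hashlib
import hmac
import json


def sign_metadata(metadata: dict, key: bytes) -> str:
    """Return a hex tag over a canonical JSON form of the metadata.

    Stand-in only: HMAC uses a shared secret, whereas a real digital
    signature would use the author's private key.
    """
    payload = json.dumps(metadata, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_metadata(metadata: dict, signature: str, key: bytes) -> bool:
    """Check that the metadata still matches the tag produced by `key`."""
    expected = sign_metadata(metadata, key)
    return hmac.compare_digest(expected, signature)


# Hypothetical author key and annotation, for illustration only.
author_key = b"hypothetical-demo-key"
metadata = {"url": "http://example.org/store", "subject": "hardware store"}
signature = sign_metadata(metadata, author_key)

# Someone rewrites the annotation to attract unrelated traffic.
tampered = dict(metadata, subject="cheap flights")

print(verify_metadata(metadata, signature, author_key))
print(verify_metadata(tampered, signature, author_key))
```

The point is only that a tag tied to a key lets an agent detect that an annotation was altered or was not produced by the key holder. It says nothing yet about whether the key holder deserves to be trusted, which is the harder half of the problem.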
Other people, like Shelley Powers, thought about attaching RDF content to links (for example, descriptive information about a link to a local hardware store), and using reification principles to infer trust in the relation: I trust him, you trust me, so you trust him.
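The "I trust him, you trust me, so you trust him" chain can be sketched as a walk over a graph of trust statements. This is my own simplification in Python, not Shelley Powers’ approach and not RDF reification. The names in the graph and the max_hops limit are assumptions I added for the example.

```python
from collections import deque

# Who directly trusts whom. Hypothetical data for illustration.
trusts = {
    "you": {"me"},
    "me": {"him"},
    "him": set(),
}


def trusts_transitively(graph, source, target, max_hops=2):
    """Breadth-first search: does a chain of direct trust statements,
    at most `max_hops` long, lead from `source` to `target`?
    """
    frontier = deque([(source, 0)])
    seen = {source}
    while frontier:
        person, hops = frontier.popleft()
        if hops == max_hops:
            continue  # do not extend chains beyond the limit
        for friend in graph.get(person, ()):
            if friend == target:
                return True
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, hops + 1))
    return False


print(trusts_transitively(trusts, "you", "him"))              # two hops: you -> me -> him
print(trusts_transitively(trusts, "you", "him", max_hops=1))  # direct trust only
```

The hop limit is there because trust probably should not travel indefinitely far: a friend of a friend of a friend of a friend is close to a stranger. Where to draw that line is exactly the kind of open question this post is about.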
Many studies are under way to find the best way to add trust to the Web and, in the near future, to the Semantic Web. Some techniques, like PGP’s, are tested and effective. However, could they be applied to the Semantic Web? What is the best system we can use for the Semantic Web? Has that system already been created? Or is it yet to be created?
One thing is sure: such a system will have to be present in the Semantic Web if we want it to succeed.