Archive for March, 2006

TalkDigger … towards community interaction

Talk Digger is quietly evolving toward something new. I have been thinking for some time now about how Talk Digger might evolve and how the “conversations evolving on the Internet” paradigm could be extended. I have had many ideas in the past months, some better and bigger than others.

Then I told myself: Fred, you have to start somewhere.

This is obvious; but there’s a significant process behind that obvious affirmation.

Here’s what I have done since I came back from India: I started somewhere. I started to develop a new feature (in fact, a whole new infrastructure) on top of which I will be able to build other things. The process is quite simple: think small first, and extend features and functionality later.

Easy to say, and it’s what I’m doing … but the result is not that simple.

So now, which direction is Talk Digger taking?

The vision is always the same: helping people find, follow and enter discussions that are evolving on the Internet. With the current capabilities of Talk Digger, a user can find discussions fairly easily. The results are sometimes good, and at other times not so good. The presentation is conventional and fairly intuitive, as it mimics the user interface of traditional search engines.

Talk Digger also generates RSS feeds of search results. This feature is really interesting because you receive new results (new discussion items) directly in your web feed reader. However, the interaction between web feed readers, Talk Digger and the search engines is not that good: for example, some results are marked as new day after day. It is a good start, but it is not enough for me, and certainly not for Talk Digger users.

As you know, Talk Digger is not that bad, but there’s definitely room (and the need) to upgrade it. One of the core problems is that all the requests are processed in real time. It is good for me, because I can easily host Talk Digger on low cost web servers. However, this advantage is not that good for the users. It was a first step; now I have to go forward with the next steps.

So, what are those next steps? Archiving data; with that data I will be able to develop ideas that are impossible without an archive.

This is the first next step: developing a Talk Digger Crawler, gathering data, archiving and updating it.
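To illustrate why an archive matters, here is a minimal sketch, not the actual Talk Digger crawler, of how stored results could be used to report only items that are genuinely new, instead of flagging the same results as new day after day. The names `result_key` and `ResultArchive` are hypothetical, the archive lives in memory only, and the URL normalization (trimming whitespace and a trailing slash) is a deliberately crude assumption.

```python
import hashlib


def result_key(url):
    """Stable identifier for a result: hash of the URL after light cleanup."""
    normalized = url.strip().rstrip("/")
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


class ResultArchive:
    """In-memory archive of search results, keyed by a stable identifier."""

    def __init__(self):
        self._seen = {}  # key -> first stored copy of the result

    def update(self, results):
        """Store the given results and return only those never seen before."""
        new_items = []
        for result in results:
            key = result_key(result["url"])
            if key not in self._seen:
                self._seen[key] = result
                new_items.append(result)
        return new_items


archive = ResultArchive()
day_one = archive.update([
    {"url": "http://example.org/a"},
    {"url": "http://example.org/b/"},
])
day_two = archive.update([
    {"url": "http://example.org/b"},  # same page as yesterday, without the slash
    {"url": "http://example.org/c"},
])
print([item["url"] for item in day_two])
```

Only the results returned by `update` would go into the RSS feed as new items; everything else is already in the archive.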

After that I intend to create a community infrastructure, in order to:

  • Help people to define themselves by their works, their interests, and their relations with other people.
  • Help people to find someone that they could connect with.
  • Help people to get connected and communicate with other community members.

I also plan to create an upgraded infrastructure to track and define conversations, so as to:

  • Help people to easily track conversations.
  • Help people to find interesting conversations.

Finally, I plan to create an infrastructure that lets people easily enter into a discussion they have found using Talk Digger.

That is it.

With these new features, Talk Digger will go a step further in the direction of its driving vision: helping people find, follow and enter conversations that are evolving on the Internet.

Some people may think that I am trying to shovel clouds, but I am not. The development of this “new system” is on the way. In about 2 months I should start to distribute private access to some people to test the new system. That way in about 3 months or so, if everything goes fine, I should publicly release this new version.

Naturally I will keep you posted on the developments! If anything crosses your mind while reading this post, please leave a comment below… who knows, it might well be helpful to me!


Semantic MediaWiki

I just discovered this Semantic initiative by Markus Krötzsch while reading Danny Ayers.

Wow! What a great initiative – trying to develop and test some Semantic Web ideas and technologies in the mainstream.

What is this Semantic MediaWiki all about?

The WikiProject “Semantic MediaWiki” provides a common platform for discussing extensions of the MediaWiki software that allow for simple, machine-based processing of Wiki content. This usually requires some form of “semantic annotation,” but the special Wiki environment and the multitude of envisaged applications impose a number of additional requirements.

The overall objective of the project is to develop a single solution for semantic annotation that fits the needs of most Wikimedia projects and still meets the Wiki-specific requirements of usability and performance. It is understood that ad hoc implementations (i.e. “hacks”) may sometimes solve single problems, but agreeing on common editing syntax, underlying technology, exchange formats, etc. bears huge advantages for all participants.

You can read the rest to find out what Markus and his team want to do (and not do) with the Semantic MediaWiki initiative.

I didn’t have much time to check its under-the-hood construction because I am overloaded with the development of Talk Digger and another contract I have accepted. However I did want to take the time to write a bit about it and the cool things I found.

The project is really great: they are trying to build a simple, easy-to-use semantic system whose users would benefit from its power without caring about all the technical stuff. They have well-defined goals to guide their vision of this MediaWiki add-on. This Semantic MediaWiki is intended for Mr. and Mrs. Anybody, not only academics.

How will semantic web capabilities be received by Internet users? How will they use the capabilities? Will they like the new possibilities it gives them?

These are all open questions that such an initiative will help to answer.

Give it a try

Basically it defines relations between the objects of Wiki articles. These objects can be a word, a group of words (referred to as a value in their documentation) or another article. Read the document to learn what the possible relations are, how they work, and how they are defined.

Check out what a Wiki article about San Diego looks like in a Semantic Web environment. The semantic meanings of the main terms are defined directly in the article. These definitions create an RDFS graph that describes the semantic meaning of the article.
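As a rough illustration of what such a graph contains, here is a minimal sketch that stores a few relations as subject/predicate/object triples and follows them. The relation names ("is a", "located in") are assumptions made up for this example, not the project's actual vocabulary, and a real RDFS graph would use URIs rather than plain strings.

```python
from collections import namedtuple

# One statement of the graph: subject --predicate--> obj.
Triple = namedtuple("Triple", ["subject", "predicate", "obj"])

# Hypothetical annotations, for illustration only.
GRAPH = {
    Triple("San Diego", "is a", "City"),
    Triple("San Diego", "located in", "California"),
    Triple("California", "located in", "United States"),
}


def objects(graph, subject, predicate):
    """Return, sorted, every object linked to `subject` by `predicate`."""
    return sorted(
        t.obj for t in graph if t.subject == subject and t.predicate == predicate
    )


def located_in_chain(graph, place):
    """Follow "located in" links step by step, as a reasoner might infer them."""
    chain = []
    seen = {place}
    current = place
    while True:
        parents = objects(graph, current, "located in")
        if not parents or parents[0] in seen:
            return chain
        current = parents[0]
        seen.add(current)
        chain.append(current)


print(objects(GRAPH, "San Diego", "is a"))
print(located_in_chain(GRAPH, "San Diego"))
```

The second call is the interesting one: nobody wrote that San Diego is in the United States, yet a program can derive it from the two relations that were written.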

Yeah well, what does this change in my life?

There are two possible answers: (1) absolutely nothing, or (2) absolutely everything. Think about it – think about Wikipedia using that add-on in their system, such that people start to define the semantic meanings of its articles.

People would have access to the same content; however now that content would be accessible in a new way: computers would have access to the RDFS graph created with the relations defined by Wikipedia users. So a third party crawler could crawl Wikipedia, then check for this tag in each article:

<link rel="alternate" type="application/rdf+xml" title="…" href="…" />

Then the crawler would download and archive the RDFS/SMW document attached to each article. Eventually an RDFS/(SMW?) reasoner like KAON could do marvels with all that meaningful data!
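The discovery step of such a crawler can be sketched with nothing more than the Python standard library. This is only an assumption of how a third-party crawler might look for the tag, not code from the Semantic MediaWiki project; the class name `RdfLinkFinder`, the function `find_rdf_links`, the sample page and the example.org URLs are all made up for the illustration, and the actual downloading and archiving is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RdfLinkFinder(HTMLParser):
    """Collect the href of every <link rel="alternate" type="application/rdf+xml">."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.rdf_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        is_rdf = attr.get("type", "").lower() == "application/rdf+xml"
        if "alternate" in rels and is_rdf and attr.get("href"):
            # Resolve relative links against the address of the article itself.
            self.rdf_urls.append(urljoin(self.base_url, attr["href"]))


def find_rdf_links(html, base_url):
    """Return the absolute URLs of the RDF documents advertised by a page."""
    finder = RdfLinkFinder(base_url)
    finder.feed(html)
    return finder.rdf_urls


# Hypothetical article page, for illustration only.
SAMPLE_PAGE = """<html><head>
<link rel="stylesheet" type="text/css" href="/style.css" />
<link rel="alternate" type="application/rdf+xml" title="San Diego" href="/rdf/San_Diego" />
</head><body>Article text</body></html>"""

print(find_rdf_links(SAMPLE_PAGE, "http://wiki.example.org/wiki/San_Diego"))
```

A crawler would then fetch each returned URL and store the document next to the article it describes.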

Finally I would suggest trying the Simple Semantic Search.

I will try to find more time later to check out this Semantic Wiki more in depth and to talk a little bit more about it.


Larry Ellison

 
“Don’t assume they are right just because they are in authority or because they are experts. Think things out for yourself. Come to your own judgement.”

– Larry Ellison

 

Mesh – Canada’s Web 2.0 conference

Yup, we finally got one! Thanks to Mark Evans, Mathew Ingram, Mike McDerment, Rob Hyndman and Stuart MacDonald for this initiative.

I will do my best to be there; I should register in a couple of days.

Mesh will take place in Toronto at the MaRS Collaboration Centre on May 15 and 16, 2006.

Many people will be there, including Steve Rubel, Tara Hunt, Om Malik, Jason Fried, Stowe Boyd, Amber MacArthur, and many others.

The agenda is not fixed at the moment, but Stuart will put it on their blog as soon as it is nailed down.

Attendees should use the tag Mesh06 when blogging about the event.

Who is currently talking about the event? Check out the Technorati Tags; however, at the moment you can find more results using Talk Digger.

I just found out via the Remarkk blog that there is a BarCamp in Toronto on May 13 and 14, 2006, just before the conference. It could be a good warm-up, and a good place to present a new major feature of Talk Digger (in fact it is far more ambitious than the meta-search engines I created with Talk Digger, and in my humble opinion it has far more potential to connect people and help them find, follow and enter discussions evolving on the Web). My only hope is to be able to finish a demo by May 12; otherwise I could simply talk about Web discussions.


Answering to: The One Crucial Idea of Web 2.0

With this blog post I’m answering Joshua Porter regarding one of his most recent blog posts. To fully appreciate or understand my response, you should read his before continuing to read this post.

What is Web 2.0? Personally I do not agree that Web 2.0 is defined, as is widely accepted, by the new social Web services trend that relies on a community to define and “dig” the Web. I would call that Web the “Web 1.5”: a crucial new step toward something much bigger, the Semantic Web, which is my view of the Web 2.0.

In reality, the new social Web services like Digg, Flickr, Del.icio.us, etc. are not new technologies. These services use old, well-understood methods and technologies. I think that the crucial factor that makes them spread like voracious mushrooms is the drastic decline in the price of their supporting infrastructure: cheap broadband, good open source (and free) development technologies like MySQL and PHP (no licensing costs), and gigabytes of hard drive space for pennies. This is a form of convergence, not a new Web.

Mr. Porter wrote:

If there is one idea that encapsulates what Web 2.0 is about, one idea that wasn’t a factor before but is a factor now, it’s the idea of leveraging the network to uncover the Wisdom of Crowds. Forget Ajax, APIs, and other technologies for a second. The big challenge is aggregating whatever tidbits of digitally-recorded behavior we can find, making some sense of it algorithmically, and then uncovering the wisdom of crowds through a clear and easy interface to it.

It is all about popularity; it is all about Google PageRank. But it is one tool amongst many others.

Google offers good services. Google changed the landscape of the search industry. The problem is that I can still spend an hour searching for something on the Web, and yet what I find is often basically unacceptable.

To upgrade the Web, we should see a breakthrough that drastically upgrades its efficiency. Unfortunately Digg or Technorati have never helped me to decrease search time. They are cool services, but they don’t answer that particular need.

To put the tag 2.0 on the Web, we should see such a breakthrough. It is why I would call the emerging social trend 1.5: a good step forward, but not enough to change the first number of the version.

I have some questions for people who think that the current emerging “Web 2.0” is a major breakthrough for the Web:

1- What happens if the “crowd” does not find the golden piece of information I am searching for because it is buried too deeply in the Web and nobody noticed it before?

2- Has anyone seen an article about the Canadian government, one that offers tips for completing your income tax form, pop up on Digg?

The problem I see with this method is that something has to be flagged by many, many people to pop up to the surface – *something* has to be useful to many people who will dig it, link to it, etc. And personally, I find useful information all day long, but I don’t or won’t link to it.

I do not want references to resources that meet the needs of *everybody on the Web*; I want references to resources that fill MY needs.

The only time that such methods are really useful is when my needs meet those of the majority. That is often the case when we talk about general information. However it just doesn’t work when I start to search for up-to-date and specific information about an obscure subject, a subject that few people care about, or even more important, a subject about which information has to be inferred in order to be discovered!

What is happening with these new services like Digg, Flickr or Del.icio.us, which started with Google’s PageRank idea, is good and really cool, but I hope this is not an end point for the next 10 years – otherwise we will miss focusing on something much more useful and important.

And the evidence is mounting. Today, Richard MacManus writes of the new features on Rojo, and in explaining what they are Chris Alden tells Richard that they’re emulating Pagerank:

“How do we do it? (determine relevance) Generally, just like Google used link metadata to determine relevance of search results, there is a fair amount of metadata we can use to infer relevance, including how many people are reading, tagging, and voting for a story, how popular the feed is – both to you personally, to your contacts, and to all readers, as well as things like link data and content analysis.”

When I read this, I think about the Semantic Web: a way to create metadata on resources not to infer relevance, but to infer Knowledge. Relevance is good, at least in some scenarios, but Knowledge is better because it is good in all scenarios. Remember: Knowledge is power.

The problem is that people think about inferring relevance in terms of popularity (people linking to and talking about something), and not in terms of Knowledge.

I sincerely hope that people will start to talk about the Web 2.0 as a Web of Knowledge, a Web of *Semantic usage*. As I said, I would refer to the social Web as the Web 1.5: a first step, a first non-academic and widespread experience on the way toward the Web 2.0: the Web of Knowledge, the Web where you do not lose an hour of your precious time searching for something trivial but unfortunately not popular.


Vast: a model for the Semantic Web

In the past I talked a lot about the future of the web, the Semantic Web, and what developers and businesses have to do to make it happen.

Many people are currently blogging about a new search engine called Vast. I took a quick look at it and found an impressive service. At the moment, it has no semantic capabilities. It’s a normal search engine that crawls the web for specific data: cars, jobs, and profiles. It publishes this information through a REST interface, and no semantic relations between the results are available.

So, why do I say that it is a model for the Semantic Web if it has nothing to do with semantics? Because it has a crucial characteristic that most semantic web services will need in order to make the idea of the Semantic Web work.

I have already talked about it in this blog post: sharing its content and computations freely, to anyone who needs it, without any restrictions.

This is exactly what Vast is doing:

“Use the Vast Dataset to Build and Augment Your Own Services – it’s free, open, and available for commercial and non-commercial uses!”

“Vast’s entire dataset is available for you to add to your site, blog, or service. You can rebuild all of Vast.com, if you’d like, offer targeted classified search results to your users, build visualizations or mappings, or process the data to find interesting correlations.”

So, you have an idea, and you need their data to develop it – what will you do? Take the data from their web servers, without any restriction.

The only question you need to ask yourself is: do I trust their reliability to develop my project using their free service? The answer is up to you, and it’s the sort of question developers must repeatedly ask themselves – if a new environment consisting of such web services is emerging (and I think it is).

I also asked myself what the business model of such a Web service could be. They have part of the answer:

“At some point, we will accept payment for advertising embedded in our feeds. At that time, the advertising revenues will of course be shared with developers and partners using our feeds.”

“When Vast decides to embed sponsored links or advertisements in the dataset, you must display these links or ads with prominence alongside or as part of the data. However, it is our plan that you will receive a share of the revenues that you help generate for doing so.”

As simple as crying rabbit. They embed some sort of sponsored results in their result listings, and their terms of service require you to display them.

However, who cares? I mean, I have no problem with ads as long as they are relevant to what I am searching for. Whether their service suggests a car found somewhere on the Web or a car result sponsored by someone else, as long as I get a car that fills my needs I do not care where it came from, and that is exactly what Vast is doing.

I can only say one thing at this point: congratulations guys for making all your data freely available to anybody, and for having built a viable business model (in my humble opinion) over it.


Memetracking and Web Feed Reading

I read this post a couple of days ago while I was trying to catch up with everything that had happened in the blogosphere while I was traveling. It is a really well written and insightful post by Robert Scoble about memetrackers vs. web feed readers.

[...]

I miss my RSS reading. Reading RSS makes me smarter, not snarkier. Why? Cause I choose who I’m going to read. Pick smart people to read and you’ll get smarter.

Hint, the smartest people in my RSS are usually the least snarky. Why? Cause they could give a f**k about all the traffic.

[...]

I totally agree with Robert on this one, and it is probably why I do not give much importance to memetrackers and why I only subscribe to their RSS feeds: I give them the same importance as any other blogger.

However, memetrackers and blog search engines have the same problem: when you try to discover new blogs and new articles that may be of interest to you, you always get the same people and the same blog posts.

Unpopular bloggers often have really good ideas. However, nobody finds them because they are not popular, and they are not popular because they don’t care at all about being popular.

The problem is that all these services generally use some sort of ranking system, the type of system popularized by Google. However, ranking systems are not built to show you the best results; they show you the most popular results on the premise that they are the best, but they rarely are. So, now, how can I find these bright people? How can I read their awesome ideas?

That’s what I want: I want something that helps me manage the information in such a way that it will aggregate information that may be of interest or use to me, and not necessarily the information that for whatever reason is of interest to the rest of the planet.

Yeah right … I am dreaming in technicolour … and I know that many people have been working on that problem for ages; however, I’m impatient and I can’t wait to see a real breakthrough as it unfolds in front of the general public.

In the meantime, I want to connect with and talk to people who have the same or similar interests as I do, rather than spending hours every week trying to find these people using the current services available on the Web.





This blog is a regularly updated collection of my thoughts, tips, tricks and ideas about my semantic Web researches and related software development.

