Archive for March, 2006

TalkDigger … towards community interaction

Talk Digger is quietly evolving toward something new. I have been thinking for some time now about how Talk Digger might evolve and how the “conversations evolving on the Internet” paradigm could be extended. I have had many ideas in the past months, some better and bigger than others.

Then I told myself: Fred, you have to start somewhere.

This is obvious, but there is a significant process behind that obvious statement.

Here’s what I have done since I came back from India: I started somewhere. I started to develop a new feature (in fact it is a whole new infrastructure) on top of which I will be able to build other things. The process is quite simple: think small first, and extend features and functionality later.

Easy to say, and it’s what I’m doing … but the result is not that simple.

So now, which direction is Talk Digger taking?

The vision is always the same: helping people find, follow and enter discussions which are evolving on the Internet. With the current capabilities of Talk Digger, a user can find discussions somewhat easily. The results are sometimes good, and at other times not so good. The presentation is normal and somewhat intuitive, as it mimics the user interface of traditional search engines.

Talk Digger also generates RSS feeds of search results. This feature is really interesting because you receive new results [new discussion items] directly in your web feed reader. However, the interaction between web feed readers, Talk Digger and search engines is not that good: for example, some results are marked as new day after day. It is a good start but it is not enough for me, and certainly not for Talk Digger users.

As you know, Talk Digger is not that bad, but there’s definitely room (and the need) to upgrade it. One of the core problems is that all the requests are processed in real time. It is good for me, because I can easily host Talk Digger on low cost web servers. However, what is an advantage for me is not one for the users. It was a first step; now I have to go forward with the next steps.

So, what are those next steps? Archiving data; with that data I will be able to develop ideas that are impossible without an archive.

This is the first next step: developing a Talk Digger Crawler, gathering data, archiving and updating it.

After that I intend to create a community infrastructure, in order to:

  • Help people to define themselves by their works, their interests, and their relations with other people.
  • Help people to find someone that they could connect with.
  • Help people to get connected and communicate with other community members.

I also plan to create an upgraded infrastructure to track and define conversations, so as to:

  • Help people to easily track conversations.
  • Help people to find interesting conversations.

Finally, I plan to create an infrastructure that lets people easily enter into a discussion they have found using Talk Digger.

That is it.

With these new features, Talk Digger will go a step further in the direction of its driving vision: helping people to find, follow and enter into conversations which are evolving on the Internet.

Some people may think that I am trying to shovel clouds, but I am not. The development of this “new system” is under way. In about 2 months I should start giving private access to some people to test the new system. That way, in about 3 months or so, if everything goes fine, I should publicly release this new version.

Naturally I will keep you posted on the developments! If anything crosses your mind while reading this post, please leave a comment below… who knows, it might well be helpful to me!

Semantic MediaWiki

I just discovered this Semantic initiative by Markus Krötzsch while reading Danny Ayers.

Wow! What a great initiative – trying to develop and test some Semantic Web ideas and technologies in the mainstream.

What is this Semantic MediaWiki all about?

The WikiProject “Semantic MediaWiki” provides a common platform for discussing extensions of the MediaWiki software that allow for simple, machine-based processing of Wiki content. This usually requires some form of “semantic annotation,” but the special Wiki environment and the multitude of envisaged applications impose a number of additional requirements.

The overall objective of the project is to develop a single solution for semantic annotation that fits the needs of most Wikimedia projects and still meets the Wiki-specific requirements of usability and performance. It is understood that ad hoc implementations (i.e. “hacks”) may sometimes solve single problems, but agreeing on common editing syntax, underlying technology, exchange formats, etc. bears huge advantages for all participants.

You can read the rest to find out what Markus and his team want to do (and not do) with the Semantic MediaWiki initiative.

I didn’t have much time to check its under-the-hood construction because I am overloaded with the development of Talk Digger and another contract I have accepted. However I did want to take the time to write a bit about it and the cool things I found.

The project is really great: they are trying to build a simple, easy-to-use semantic system, where users would benefit from its power without caring about all the technical stuff. They have well-defined goals to guide their vision of this MediaWiki add-on. This Semantic MediaWiki is intended for Mr. and Mrs. Anybody, not only academics.

How will semantic web capabilities be received by Internet users? How will they use the capabilities? Will they like the new possibilities it gives them?

These are all open questions that such an initiative will help to answer.

Give it a try

Basically it defines relations between the objects of Wiki articles. These objects could be a word, a group of words (referred to as a value in their documentation) or another article. Read the documentation to learn what the possible relations are, how they work, and how they are defined.

Check out what a Wiki article about San Diego looks like in a Semantic Web environment. The semantic meanings of the main terms are defined directly in the article. These definitions create an RDFS graph that describes the semantic meanings of the article.

Yeah well, what does this change in my life?

There are two possible answers: (1) absolutely nothing, or (2) absolutely everything. Think about it – think about Wikipedia using that add-on in their system, so that people start to define the semantic meanings of its articles.

People would have access to the same content; however now that content would be accessible in a new way: computers would have access to the RDFS graph created with the relations defined by Wikipedia users. So a third party crawler could crawl Wikipedia, then check for this tag in each article:

<link rel="alternate" type="application/rdf+xml" title="…" href="…" />

Then the crawler would download and archive the RDFS/SMW document attached to each article. Eventually an RDFS/(SMW?) reasoner like KAON could do marvels with all that meaningful data!
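To make the crawler step a little more concrete, here is a minimal sketch in Python of how such a crawler might spot that tag in a page it has already downloaded. It only uses the standard library; the names `RdfLinkFinder` and `find_rdf_links`, as well as the sample page and its URL, are made up for illustration and have nothing to do with the real Semantic MediaWiki code.

```python
from html.parser import HTMLParser


class RdfLinkFinder(HTMLParser):
    """Collects the href of every <link rel="alternate" type="application/rdf+xml"> tag."""

    def __init__(self):
        super().__init__()
        self.rdf_links = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<link ... />) also end up here by default.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        mime = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel and mime == "application/rdf+xml" and href:
            self.rdf_links.append(href)


def find_rdf_links(html_text):
    """Return the RDF document locations advertised by one HTML page."""
    finder = RdfLinkFinder()
    finder.feed(html_text)
    return finder.rdf_links


# Hypothetical page: the href below is invented for this example.
sample_page = """
<html><head>
<link rel="stylesheet" type="text/css" href="/style.css" />
<link rel="alternate" type="application/rdf+xml" title="San Diego" href="/rdf/San_Diego" />
</head><body>...</body></html>
"""

print(find_rdf_links(sample_page))
```

A real crawler would then fetch each returned address and archive the RDF document next to the article; that part is left out here because it depends on how the wiki exposes its export.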

Finally I would suggest trying the Simple Semantic Search.

I will try to find more time later to look at this Semantic Wiki in more depth and to talk a little bit more about it.

Larry Ellison

“Don’t assume they are right just because they are in authority or because they are experts. Think things out for yourself. Come to your own judgement.”

– Larry Ellison

 

Mesh – Canada’s Web 2.0 conference

Yup, we finally got one! Thanks to Mark Evans, Mathew Ingram, Mike McDerment, Rob Hyndman and Stuart MacDonald for this initiative.

I will do my best to be there; I should register in a couple of days.

Mesh will take place in Toronto at the MaRS Collaboration Centre on May 15 and 16, 2006.

Many people will be there, such as Steve Rubel, Tara Hunt, Om Malik, Jason Fried, Stowe Boyd, Amber MacArthur, and many others.

The agenda is not fixed at the moment, but Stuart will put it on their blog as soon as it is nailed down.

Attendees should use the tag Mesh06 when blogging about the event.

Who is currently talking about the event? Check out the Technorati Tags; however, at the moment you can find more results using Talk Digger.

I just found via the Remarkk blog that there is a BarCamp in Toronto on May 13 and 14, 2006, just before the conference. It could be a good warm-up, and it could be a good place to present a new major feature of Talk Digger (in fact it is far more ambitious than the meta-search engines I created with Talk Digger, and in my humble opinion, it has far more potential to connect people and to find, follow and enter into discussions evolving on the Web). The only thing I hope is to be able to finish a demo by May 12; otherwise I could simply talk about Web discussions.

Answering to: The One Crucial Idea of Web 2.0

With this blog post I’m answering Joshua Porter regarding one of his most recent blog posts. To fully appreciate or understand my response, you should read his before continuing to read this post.

What is Web 2.0? Personally I do not agree that Web 2.0 is defined, as widely accepted, by the new social Web services trend that relies on a community to define and “dig” the Web. I would call that Web the “Web 1.5”: a crucial step toward something much bigger, the Semantic Web, which is my view of Web 2.0.

In reality, the new social Web services like Digg, Flickr, Del.icio.us, etc. are not new technologies. These services use old, well-understood methods and technologies. I think that the crucial factor that makes them spread like voracious mushrooms is the drastic decline in the price of their supporting infrastructure: cheap broadband, good open source (and free) development technologies like MySQL and PHP (no licensing costs), and gigabytes of hard drive space for pennies. This is a form of convergence, not a new Web.

Mr. Porter wrote:

If there is one idea that encapsulates what Web 2.0 is about, one idea that wasn’t a factor before but is a factor now, it’s the idea of leveraging the network to uncover the Wisdom of Crowds. Forget Ajax, APIs, and other technologies for a second. The big challenge is aggregating whatever tidbits of digitally-recorded behavior we can find, making some sense of it algorithmically, and then uncovering the wisdom of crowds through a clear and easy interface to it.

It is all about popularity; it is all about Google Pagerank. But it is one tool amongst many others.

Google offers good services. Google changed the landscape in the search industry. The problem is that I can still spend an hour searching for something on the Web, and yet what I find is often basically unacceptable.

To upgrade the Web, we should see a breakthrough that drastically upgrades its efficiency. Unfortunately, neither Digg nor Technorati has ever helped me decrease my search time. They are cool services, but they don’t answer that particular need.

To put the tag 2.0 on the Web, we should see such a breakthrough. That is why I would call the emerging social trend 1.5: a good step forward, but not enough to change the first number of the version.

I have some questions for people who think that the current emerging “Web 2.0” is a major breakthrough for the Web:

1- What happens if the “crowd” does not find the golden piece of information I am searching for because it is buried too deeply in the Web and nobody noticed it before?

2- Did anyone ever see an article about the Canadian government that offers tricks to complete your income tax form popping up on Digg?

The problem I see with this method is that something has to be flagged by many, many people to pop up to the surface – *something* has to be useful to many people who will dig it, link to it, etc. And personally I find useful information all day long, but I don’t or won’t link to that useful information.

I do not want to have the references to resources that meet the needs of *everybody on the Web*; I want to have the references to resources that fill MY needs.

The only time that such methods are really useful is when my needs meet those of the majority. That is often the case when we talk about general information. However it just doesn’t work when I start to search for up-to-date and specific information about an obscure subject, a subject that few people care about, or even more important, a subject about which information has to be inferred in order to be discovered!

What is happening with these new services like Digg, Flickr or Del.icio.us, which started with Google’s Pagerank idea, is good and really cool, but I hope this is not an end point for the next 10 years – otherwise we will miss focusing on something much more useful and important.

And the evidence is mounting. Today, Richard MacManus writes of the new features on Rojo, and in explaining what they are Chris Alden tells Richard that they’re emulating Pagerank:

“How do we do it? (determine relevance) Generally, just like Google used link metadata to determine relevance of search results, there is a fair amount of metadata we can use to infer relevance, including how many people are reading, tagging, and voting for a story, how popular the feed is – both to you personally, to your contacts, and to all readers, as well as things like link data and content analysis.”

When I read this, I think about the Semantic Web: a way to create metadata on resources not to infer relevance, but to infer Knowledge. Relevance is good, at least in some scenarios, but Knowledge is better because it is good in all scenarios. Remember: Knowledge is power.

The problem is that people think about inferring relevance in terms of popularity, of people linking to and talking about something, and not in terms of Knowledge.

I sincerely hope that people will start to talk about the Web 2.0 as a web of Knowledge, a Web of *Semantic usage*. As I said, I would refer to the social Web as the Web 1.5: a first step, a first non-academic and widespread experience toward the Web 2.0: the Web of Knowledge, the Web where you do not lose an hour of your precious time searching for something trivial but unfortunately not popular.

Vast: a model for the Semantic Web

In the past I talked a lot about the future of the web, the Semantic Web, and what developers and businesses have to do to make it happen.

Many people are currently blogging about a new search engine called Vast. I took a quick look at it and found an impressive service. At the moment, it has no semantic capabilities. It’s a normal search engine that crawls the web to search for specific data: cars, jobs, and profiles. It broadcasts this information using a REST interface, and no semantic relations between the results are available.

So, why do I say that it is a model for the Semantic Web if it has nothing to do with semantics? Because it has a crucial characteristic that most Semantic Web services will need in order to make the idea of the Semantic Web work.

I have already talked about it in this blog post: sharing its content and computations freely, to anyone who needs it, without any restrictions.

This is exactly what Vast is doing:

“Use the Vast Dataset to Build and Augment Your Own Services – it’s free, open, and available for commercial and non-commercial uses!”

“Vast’s entire dataset is available for you to add to your site, blog, or service. You can rebuild all of Vast.com, if you’d like, offer targeted classified search results to your users, build visualizations or mappings, or process the data to find interesting correlations.”

So, you have an idea, and you need their data to develop it – what will you do? Take the data from their web servers, without any restriction.

The only question you need to ask yourself is: do I trust their reliability to develop my project using their free service? The answer is up to you, and it’s the sort of question developers must repeatedly ask themselves – if a new environment consisting of such web services is emerging (and I think it is).

I also asked myself what the business model of such a Web service could be. They have a part of the answer:

“At some point, we will accept payment for advertising embedded in our feeds. At that time, the advertising revenues will of course be shared with developers and partners using our feeds.”

“When Vast decides to embed sponsored links or advertisements in the dataset, you must display these links or ads with prominence alongside or as part of the data. However, it is our plan that you will receive a share of the revenues that you help generate for doing so.”

As simple as crying rabbit. They will embed some sort of sponsored results in their results listing, and their terms of service will require you to display them.

However, who cares? I mean, I have no problem with ads as long as they are relevant to what I am searching for. Whether their service suggests a car found somewhere on the Web or a car result sponsored by someone else, as long as I get a car that fills my needs I do not care where it came from, and that is exactly what Vast is doing.

I can only say one thing at this point: congratulations guys for making all your data freely available to anybody, and for having built a viable business model (in my humble opinion) over it.

Memetracking and Web Feed Reading

I read this post a couple of days ago when I was trying to catch up with all the things that happened in the blogosphere while I was traveling. It is a really well written and insightful post by Robert Scoble about memetrackers vs. web feed readers.

[...]

I miss my RSS reading. Reading RSS makes me smarter, not snarkier. Why? Cause I choose who I’m going to read. Pick smart people to read and you’ll get smarter.

Hint, the smartest people in my RSS are usually the least snarky. Why? Cause they could give a f**k about all the traffic.

[...]

I totally agree with Robert on this one, and it is probably the reason why I do not give much importance to memetrackers and only subscribe to their RSS feeds: I give them the same importance as any other blogger.

However, memetrackers and blog search engines have the same problem: when you try to discover new blogs and new articles that may be of interest to you, you always get the same people and the same blog posts.

Unpopular bloggers have really good ideas. However, nobody finds them because they are not popular, and they are not popular because they don’t care at all about being popular.

The problem is that all these services generally use some sort of ranking system, the type of system popularized by Google. However, ranking systems are not built to show you the best results; they show you the most popular results on the premise that they are the best, but they rarely are. So, now – how can I find these bright people? How can I read their awesome ideas?

That’s what I want: I want something that helps me manage the information in such a way that it will aggregate information that may be of interest or use to me, and not necessarily the information that for whatever reason is of interest to the rest of the planet.

Yeah right … I am dreaming in technicolour … and I know that many people have been working on that problem for ages; however, I’m impatient and I can’t wait to see a real breakthrough as it unfolds in front of the general public.

In the meantime, I want to connect with and talk to people who have the same or similar interests as I do, rather than spending hours every week trying to find these people using the current services available on the Web.

Pinging people through linking. Have you been pinged?

Many people that I talked about in my blog posts, and to whom I linked, have left comments on these blog posts. Now, what I would like to check is to what degree people use services like Talk Digger to find out who’s talking about them and what they’re saying, and whether it helps them make contact to talk more and/or start a new relationship.

So, what I am going to do is write a list of the names of people I read who are not subscribed to my web feed (or who I think are not). The intention of this blog post is to ping specific people by using links on my blog.

So, if you are one of these people … it would be nice if you left a short comment on this blog post telling me that you found it via the link I’ve made to your blog, and letting me know what you used to find it (it’s not obligatory that this be Talk Digger ;) )

So here is the list:

Danah Boyd, Robert Scoble, Darren Rowse, Steve Rubel, Andy Wibbels, Toby, Scott Ginsberg, Paul Graham, Seth Godin, Jeff Cornwall, David H. Beisel, Amber Mac, Anil Dash, Daniel Lemire, Jack Vinson, Lilia Efimova, David Sifry, Matthew Hurst, John Battelle, Kevin Burton, John Tropea, David Weinberger, Michael Arrington

I have no idea what the results of this experiment will be, but I think that they could be interesting and maybe even surprising.

Better English, better blog posts

You will probably notice that the English in my blog posts will improve considerably from now on. This is not magic, and no, I didn’t implant an English language micro-chip in my brain. It is all thanks to the grammar correction work of Jon Husband, a good friend of mine. He told me: “Fred, if you want me to continue reading your blog, I will have to correct your posts; otherwise I stop, I can’t continue anymore!”

Okay, it is not exactly what he said, but I would have understood! Nah, Jon kindly told me that he would be willing to correct my blog posts before I publish them, so that I can learn which English errors I habitually make, and thus begin to accelerate the improvement of my English skills. Naturally, I said yes to his proposition!

That said, it’s a win-win game: I will continue to upgrade my English skills (and there is a lot of room for that) and you will begin to read better-written English blog posts.

Thanks Jon.

Is there a place for a Meta-Memetracker and what would be its utility?

I came across a seed idea on the FeedBlog, written by Kevin Burton, yesterday (using Talk Digger of course; do you see the link to it in the blog post? That is why linking is so important). He pointed out an idea that Dave Winer gave away for free 3 days ago on his blog.

The idea?

“Implement a search engine that accumulates all the stories pointed to by the top meme-engines over time. That way if I think of something I saw on Tailrank or Memeorandum a year ago, I just go to the universal meme search engine, type in the phrase, and get back the hits.”
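As a rough sketch of what Dave describes, here is a tiny in-memory archive in Python that records every story seen on any memetracker and lets you search it by phrase later. Everything in it is an assumption made for illustration: the class name `MemeArchive`, its methods, and the sample titles and example.com URLs are invented, and a real service would need a crawler and a proper search index behind it.

```python
from datetime import date


class MemeArchive:
    """Keeps every story seen on any memetracker so it can be searched later."""

    def __init__(self):
        self.stories = {}  # url -> (title, first_seen, source)

    def record(self, url, title, source, seen_on):
        # Keep only the first sighting so repeated crawls do not create duplicates.
        self.stories.setdefault(url, (title, seen_on, source))

    def search(self, phrase):
        """Return (first_seen, url, title) for every title containing the phrase."""
        needle = phrase.lower()
        hits = [
            (first_seen, url, title)
            for url, (title, first_seen, source) in self.stories.items()
            if needle in title.lower()
        ]
        return sorted(hits)  # oldest sighting first


# Invented sample data: the tracker names come from the post, the stories do not.
archive = MemeArchive()
archive.record("http://example.com/a", "Semantic Web for everyone", "Tailrank", date(2005, 3, 1))
archive.record("http://example.com/b", "Why feeds matter", "Memeorandum", date(2005, 4, 2))
archive.record("http://example.com/a", "Semantic Web for everyone", "Memeorandum", date(2005, 3, 5))

for first_seen, url, title in archive.search("semantic web"):
    print(first_seen, url, title)
```

The point of the sketch is only the accumulation over time: once a story has been recorded it stays searchable, even long after it has dropped off the front page of the memetracker that listed it.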

Kevin was thinking about something a little bit different: a meta-memetracker that would look like Talk Digger.

I think that there is a place (at least an emerging one) for such a service, considering the growing number of memetrackers out there: TailRank, Memeorandum, Findatory, Megite, and probably others that I do not know of (yesterday I found a sort of memetracker on Rojo’s main page that is really cool).

What would be the added value to users? The first thing is that you would have only one place to visit to get the top stories (obvious behavior for a meta-memetracker, no?).

However, I think that a more interesting phenomenon would happen too. None of these memetrackers use the same methods or algorithms to decide what is a good story. Some seem to work with links and a predefined list of good information sources selected by humans, others probably use some sort of advanced natural language processing algorithms, others a mix of these two methods, and others probably use methods that I can’t think of.

All the memetrackers have one thing in common: they aggregate stories that they think are good (are they performing user profiling? It could be a next step to increase the effectiveness of these services, Dave).

That said, some stories appear on all memetrackers and others on only one of them. So if one algorithm doesn’t score a specific story well, it is not really a problem, because the strength of the meta-memetracker is that it would prioritize the results found in the intersection of the result sets returned by each memetracker. The meta-memetracker would therefore return the best of the best stories, because the errors of any single memetracker would be filtered out by intersecting the result sets.
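The intersection idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each memetracker is reduced to a plain list of story identifiers, the function name `rank_by_agreement` is made up, and the sample stories are invented. Instead of a strict intersection it counts how many memetrackers list each story, so that stories found by every tracker come first and stories found by only one come last.

```python
from collections import Counter


def rank_by_agreement(results_by_tracker):
    """Order stories by how many memetrackers list them (most agreement first)."""
    counts = Counter()
    for stories in results_by_tracker.values():
        counts.update(set(stories))  # a tracker counts at most once per story
    # Sort by descending count, then alphabetically to keep the order stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Invented story identifiers; the tracker names are the ones mentioned in the post.
results = {
    "TailRank": ["story-a", "story-b", "story-c"],
    "Memeorandum": ["story-a", "story-c", "story-d"],
    "Megite": ["story-a", "story-e"],
}

for story, votes in rank_by_agreement(results):
    print(story, votes)
```

Counting agreement rather than keeping only the strict intersection is a design choice: with many trackers a strict intersection could easily be empty, while a count still pushes the stories that most algorithms agree on to the top.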

That was my two cents.

(If you would like to read more about the socio-philosophical background of popularity, read the blog post written by Joshua Porter a couple of days ago.)
