Jakob Nielsen wrote a great article about the relationship between search engines and web sites:

“I worry that search engines are sucking out too much of the Web’s value, acting as leeches on companies that create the very source materials the search engines index.”

Here is the problem:

“The traditional analysis has been that search engines amply return the favor by directing traffic to these sites. While there’s still some truth to that, the scenario is changing.”

Jakob builds his argument on many facts. I would like to talk about one specific feature of search engines: the cache. I think Google was the first search engine to cache and serve the content of the web pages it crawled (am I right? Either way, it does not change the story). This is a great feature for users: even if a web page goes down, its content stays accessible forever (or at least as long as Google, MSN, Yahoo, etc. live).

The question is: is it legal, or at least moral? The thing is that a Google, MSN or Yahoo user can browse the web without ever leaving the search engine. I will spend my own time creating content; a company will spend money creating other content; and these search engines will shamelessly take that content, index it, and serve it without necessarily sending users to the original websites (most of the time they will, at least I hope).

There are two views of the situation: (1) everything on the Internet should be free (my guess is that 40% or 50% of the Web’s content is created by hobbyists who do not earn a cent from it); (2) content creators always have rights, even on the Web.

Which view do I hold? Probably one somewhere between the two; something based on common sense. However, a problem I have noted is that companies like Google tell you what you can and cannot do with their “content” (it is their computation, but not their content, sorry) through highly restrictive terms of service. When I read such texts, one thought comes to mind: do what I say, not what I do.

As I already noted in a previous post, John Heilemann wrote in the New York Metro:

“Alan Murray wrote a column in the Wall Street Journal that called Google’s business model a new kind of feudalism: The peasants produce the content; Google makes the profits.”

My current work on Talk Digger and my recent reading in the blogosphere make me wonder how search engines will evolve, how people will react to this new situation, and whether a new type of [search] service will emerge from this environment.

Finally, it seems that Google’s crawlers are also giving Daniel Lemire some headaches:

“Some of you who tried to access my web site in recent days have noticed that it was getting increasingly sluggish. In an earlier post, I reported that Google accounted for 25% of my page hits, sometimes much more. As it turns out, these two issues are related. Google was eating all my bandwidth.”


One thought on “Search Engines are vampires that suck blood out of web pages”

  1. Interesting topic. I like the ability to view a cached page when the original document is no longer available (or temporarily unavailable). This feature has saved my life a couple of times (ok, maybe it just saved 5 minutes of my life…). As a Google user, I’m all for it. As a blogger, though, I’d like something like “You can only access the cached page if the original document is no longer available.” Can this be implemented? I think it would be more difficult than one would think.
