{"id":553,"date":"2006-01-21T16:01:41","date_gmt":"2006-01-21T20:01:41","guid":{"rendered":""},"modified":"2006-05-22T10:56:51","modified_gmt":"2006-05-22T14:56:51","slug":"search_engines_are_vampires_that_suck_bl","status":"publish","type":"post","link":"https:\/\/fgiasson.com\/blog\/index.php\/2006\/01\/21\/search_engines_are_vampires_that_suck_bl\/","title":{"rendered":"Search Engines are vampires that suck blood out of web pages"},"content":{"rendered":"<p><a href=\"http:\/\/www.useit.com\/alertbox\/search_engines.html\">Jakob Nielsen wrote a great article about the relationship between search engines and web sites<\/a>:<\/p>\n<blockquote><p><em>&#8220;I worry that search engines are sucking out too much of the Web&#8217;s value, acting as leeches on companies that create the very source materials the search engines index.&#8221;<\/em><\/p>\n<\/blockquote>\n<p>Here is the problem:<\/p>\n<blockquote><p><em>&#8220;The traditional analysis has been that search engines amply return the favor by directing traffic to these sites. While there&#8217;s still some truth to that, the scenario is changing.&#8221;<\/em>\n<\/p><\/blockquote>\n<p>Jakob builds his argument on many facts. I would like to talk about one specific feature of search engines: <strong>the cache<\/strong>. I think that the first search engine that cached and broadcast the content of its crawled web pages was Google (am I right? Either way, it does not change the story). This is a great feature for users: even if a web page goes down, its content remains accessible forever (as long as Google, MSN, Yahoo, etc. live).<\/p>\n<p>The question is: is it legal, or at least moral? The thing is that <a href=\"http:\/\/google.com\">Google<\/a>, <a href=\"http:\/\/search.msn.com\">MSN<\/a> or <a href=\"http:\/\/yahoo.com\">Yahoo<\/a> users can browse the web without ever leaving these search engines. 
I will spend my own time creating content; a company will spend money to create other content; and these search engines will shamelessly take that content, index it, and broadcast it without redirecting users to these websites (most of the time they will, at least I hope).<\/p>\n<p>There are two views of the situation: (1) everything that is on the Internet should be free (I think that 40% or 50% of the Web&#8217;s content is created by hobbyists who never earn a cent from it); (2) content creators always have rights, even on the Web.<\/p>\n<p>Which view do I hold? Probably one somewhere between the two; something grounded in common sense. However, a problem I noted is that companies like Google will tell you what you can and cannot do with their &#8220;content&#8221; (it is their computation, but not their content, sorry) with <a href=\"http:\/\/www.google.com\/terms_of_service.html\">highly restrictive terms of service<\/a>. When reading such texts, something like this comes to mind: <em>do what I say, not what I do<\/em>.<\/p>\n<p>As I already noted in a previous post, <a href=\"http:\/\/newyorkmetro.com\/nymetro\/news\/columns\/powergrid\/15202\/\">John Heilemann wrote in the New York Metro<\/a>:<\/p>\n<blockquote><p><em>&#8220;Alan Murray wrote a column in the Wall Street Journal that called Google&#8217;s business model a new kind of feudalism: The peasants produce the content; Google makes the profits.&#8221;<\/em><\/p><\/blockquote>\n<p>My current work with <a href=\"http:\/\/www.talkdigger.com\">Talk Digger<\/a> and my recent reading in the blogosphere make me wonder: <em>how will search engines evolve; how will people react to this new situation; and will a new type of [search] service emerge from this environment?<\/em><\/p>\n<p>Finally, <a href=\"http:\/\/www.daniel-lemire.com\/blog\/archives\/2006\/01\/18\/google-was-eating-all-my-bandwidth\/\">it seems that Google&#8217;s crawlers also give some headaches to Daniel 
Lemire<\/a>:<\/p>\n<blockquote><p><em>&#8220;Some of you who tried to access my web site in recent days have noticed that it was getting increasingly sluggish. In an earlier post, I reported that Google accounted for 25% of my page hits, sometimes much more. As it turns out, these two issues are related. Google was eating all my bandwidth.&#8221;<\/em>\n<\/p><\/blockquote>\n<p><font face=\"Arial, Helvetica, sans-serif\" size=\"-2\">Technorati:   <a href=\"http:\/\/technorati.com\/tag\/Search\" rel=\"tag\" target=\"_blank\">Search<\/a> | <a href=\"http:\/\/technorati.com\/tag\/engine\" rel=\"tag\" target=\"_blank\">engine<\/a> | <a href=\"http:\/\/technorati.com\/tag\/google\" rel=\"tag\" target=\"_blank\">google<\/a> | <a href=\"http:\/\/technorati.com\/tag\/yahoo\" rel=\"tag\" target=\"_blank\">yahoo<\/a> | <a href=\"http:\/\/technorati.com\/tag\/msn\" rel=\"tag\" target=\"_blank\">msn<\/a> | <a href=\"http:\/\/technorati.com\/tag\/soe\" rel=\"tag\" target=\"_blank\">soe<\/a> | <a href=\"http:\/\/technorati.com\/tag\/se\" rel=\"tag\" target=\"_blank\">se<\/a> | <a href=\"http:\/\/technorati.com\/tag\/copyright\" rel=\"tag\" target=\"_blank\">copyright<\/a> | <a href=\"http:\/\/technorati.com\/tag\/content\" rel=\"tag\" target=\"_blank\">content<\/a> | <a href=\"http:\/\/technorati.com\/tag\/webpage\" rel=\"tag\" target=\"_blank\">webpage<\/a> | <\/font><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Jakob Nielsen wrote a great article about the relationship between search engines and web sites: &#8220;I worry that search engines are sucking out too much of the Web&#8217;s value, acting as leeches on companies that create the very source materials the search engines index.&#8221; There is the problem: &#8220;The traditional analysis has been that search 
[&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":""},"categories":[64],"tags":[],"class_list":["post-553","post","type-post","status-publish","format-standard","hentry","category-web"],"jetpack_featured_media_url":"","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/posts\/553","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/comments?post=553"}],"version-history":[{"count":0,"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/posts\/553\/revisions"}],"wp:attachment":[{"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/media?parent=553"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/categories?post=553"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/fgiasson.com\/blog\/index.php\/wp-json\/wp\/v2\/tags?post=553"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}