Hari's Corner

Humour, comics, tech, law, software, reviews, essays, articles and HOWTOs intermingled with random philosophy now and then

Is the internet really polluted by blogs?

Filed under: Internet and Blogging by Hari
Posted on Mon, Nov 21, 2005 at 10:56 IST (last updated: Wed, Jul 16, 2008 @ 20:24 IST)

Mega Man X recently brought up a very valid question about blogs in the LinuxQuestions.org forums. Here's what he has to say on the subject:
But what is annoying me the most today, are indeed the blogs. I was just searching the web for some Ubuntu backports for Breezy and I hit a blog with a guy talking about Chuck Norris. The whole thing is a mess (not that the Internet has ever been organized either)

I think somehow soon we will need an advanced search option on google to bypass useless blogs, if that's not possible already. Does blogs annoy anyone else? Or even worse... do you have a blog? If so, why?. Is it stuff worth reading or is it just like a diary? Because I believe peoples browsing the web could care less that your dog or cat is sick, what you've ate at school and what you did last summer...

Source: LinuxQuestions.org forums
I must say that the concerns raised are quite valid, although I beg to differ on the severity of the problem. The vast network of interlinked blogs has certainly gained a degree of visibility on many search engines. And I have to accept that a large number of people are increasingly annoyed by the growing number of online diaries and journals that are sprouting up like wild mushrooms by the day. The sheer number of blogs renders most search engines helpless in filtering them out completely. It's not merely a question of blocking all blogspot.com addresses, for instance, because the problem goes way beyond that.

As end users of the vast ocean called the internet, we cannot change anything - that's for certain. Moaning and groaning about it will not help. Rather, we should adapt to this situation and explore techniques to separate the wheat from the chaff. And believe me, while useless personal blogs have multiplied, the number of quality websites providing a rich wealth of information and education has grown as well. Ultimately it's a question of perception and how well one is tuned to take the good and filter out the rest. Crap is crap, whether you find it on a blog or on a corporate or business website; whether you find it in a personal diary or in a regular, mainstream newspaper.

Let's admit it. Searching the web is inherently limited because while we can input keywords to search for occurrences of words, we cannot input ideas to search for relevant content. Let me take an example: today I want to read an essay which talks about the issue of "quality over quantity." It is extremely hard to find a generic one on this particular subject simply by entering quality over quantity in Google, because my search has more to do with the idea than with the actual keywords. Google obviously doesn't recognize that fact and hence provides less than satisfactory results. It throws up topical pages on other issues which happen to contain the words "quality over quantity" and not an essay dedicated to the topic as such.

Another factor is that search engines don't necessarily index every single website out there and that search engine (SE) ratings can sometimes be seriously flawed. Search engines can only look at quantitative factors: the number of links pointing to a site and the number of occurrences of keywords, but not necessarily how relevant those occurrences are to the search at hand. In other words, the search engine cannot judge the quality of the sites it ranks highly. We try to cut down this discrepancy by refining searches, but ultimately if a site is not indexed by Google, that site will not appear in Google results, no matter how hard we try. Many times, I've given up on searching because the quality of the results has simply not justified the time spent on it. Do a broad search and you're swamped with irrelevant results. Do a more refined one and you get only two hits, both of which have almost nothing to do with what you wanted to find in the first place. This has been my experience more often than not.
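
To make the point about purely quantitative ranking concrete, here is a toy sketch in Python. It is not how Google or any real search engine works; the scoring rule (phrase occurrences plus inbound links), the function names and the two sample pages are all made up for illustration. It only shows how a page that repeats a phrase and has many links can outrank an essay that is actually about the idea.

```python
# Toy illustration only: real search engines are far more sophisticated.
# The page names, texts and link counts below are invented for this sketch.

def keyword_score(text: str, query: str) -> int:
    """Count how many times the exact query phrase occurs in the text."""
    return text.lower().count(query.lower())


def rank(pages: dict, query: str) -> list:
    """Rank page names by (phrase occurrences + inbound links), highest first."""
    scored = [
        (keyword_score(page["text"], query) + page["inbound_links"], name)
        for name, page in pages.items()
    ]
    return [name for _score, name in sorted(scored, reverse=True)]


pages = {
    # Mentions the phrase three times but is not about the idea at all.
    "camera-shop-ad": {
        "text": "Quality over quantity! We believe in quality over quantity. "
                "Buy now: quality over quantity lenses.",
        "inbound_links": 40,
    },
    # Is about the idea, but never uses the exact phrase.
    "dedicated-essay": {
        "text": "An essay on why doing fewer things well beats doing many "
                "things badly.",
        "inbound_links": 3,
    },
}

print(rank(pages, "quality over quantity"))
```

Under this made-up scoring rule the advertisement comes first and the essay last, simply because nothing in the score measures what the page is about.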

I am certainly no expert on search engine technology, but I believe the problem has more to do with skewed methodology than with the content. I also think it's a by-product of search engines not being able to keep up with the current growth of the world wide web. Blogs are but drops in the ocean. I don't think they are so important that they get higher weightage in SE ratings just because they are linked to a dozen similar blogs. On the contrary, my observation is that blogs certainly do not "dominate" search results, although they might admittedly have more visibility in searches these days. And you certainly get irrelevant results from other websites as much as you do from personal blogs.

I think singling out blogs is unfair. There are certainly useless blogs out there, but there are worse kinds of nonsense on the internet, and in much higher volume than the inane personal ramblings or diary entries of a bored person. Generalized observations such as "blogs are the crap of the internet" miss this perspective. While I admit that search engine results need to keep improving over time and blogs probably have to be filtered out where irrelevant, it's certainly not such an important issue when we think about the other kinds of trash littered all over cyberspace. If the world wide web is polluted, blogs certainly are nothing more than minute specks of dust in a room full of rotting, stinking garbage.

13 comment(s)

  1. I too have never come across an "intelligent" search engine in my life (and my Internet life is 11 years old now). Context-sensitive help is available for stand-alone applications but by the very nature of the web, context-sensitive search can never happen unless AI technology improves by a factor of 10 from here. It is the same old problem as the speech recognition software - "night rates" vs. "nitrates" :-)

    Comment by thennavan (visitor) on Mon, Nov 21, 2005 @ 11:43 IST #
  2. thennavan, your comparison between current SE technology and speech recognition is quite apt. Search engine technology probably lies in the fuzzy area between the quantifiable elements and the qualitative elements. Right now it scores on the quantitative side but fails miserably on the qualitative side.

    Comment by hari (blog owner) on Mon, Nov 21, 2005 @ 11:54 IST #
  3. I tried a modified version of your search in Google ("quality over quantity idea"), and the first result was something like an essay......in a blog! http://organizedlife.blogspot.com/2005/07/choosing-quality-over-quantity.html I think Megamanx is just scapegoating blogs. There's junk everywhere, and searching for stuff is always hard. Junk can be blogs. Junk can be other pages, too.

    Comment by aysiu (visitor) on Mon, Nov 21, 2005 @ 12:14 IST #
  4. Who's to say that the information contained in a "real" website is any better than a guy's blog? The problem with people like that is that they think that just because it is maybe written a little more professionally, and gets edited by someone who gets paid to edit, somehow the information is more accurate. How many times in recent years have the media gotten a story wrong? Is the weatherman always right? If I want a movie review, I would much rather take the word of a blogger than that of Hollywood jokers selling their show on national TV, who are obviously out of touch and don't know what a good movie is.

    Comment by Brad (visitor) on Mon, Nov 21, 2005 @ 15:22 IST #
  5. aysiu, yes, I guess you're right about that. It's interesting, because a blog might actually be more relevant than an actual website. (BTW, your comment was auto-moderated because it contained a link. It is a spam-protection move. Sorry for the delay in allowing it to appear here.) Brad, yes - good points. BTW, what I find more annoying is the fact that many of the "real" websites are so out-dated that much of the information they contain is several years old. On the other hand, blogs with frequently updated content sometimes give us better results than old regular websites.

    Comment by hari (blog owner) on Mon, Nov 21, 2005 @ 18:39 IST #
  6. Websites (traditional) *oh boy! The Internet is growing up! We have traditional and nouveau mediums!* Sorry. Websites don't always have to be made by mega corps with paid staff, but the individual isn't going to be able to pull the hard yards of keeping the website updated. A blog, on the other hand, with the blogging software pulling the hard yards for you, is much easier to keep up to date.

    Comment by titanium_geek (visitor) on Mon, Nov 21, 2005 @ 19:06 IST #
  7. Titanium_geek, dynamic content management systems have changed the way the www works. Blogs are just one part of them. :)

    Comment by hari (blog owner) on Mon, Nov 21, 2005 @ 20:08 IST #
  8. The search "quality over quantity" is a great example. Without giving extra words to put it into context, the results will be mostly chaff. However, I believe that Google (no links - I read it "somewhere" on the 'net) is aiming to come up with a search which takes context into account. So if you were looking at an electrical store site and searched for the word "television", you would have a list of results returned that focused on tech and price comparisons and excluded "the history of television" and anything similar. It will be slow; the problem has only really come up as the size of the internet has grown to its present (and future) size.

    Comment by Ray (visitor) on Mon, Nov 21, 2005 @ 23:57 IST #
  9. Interesting, Ray. I wonder how Google will address privacy concerns, though, given that they would have to monitor search history.

    Comment by hari (blog owner) on Tue, Nov 22, 2005 @ 06:50 IST #
  10. Oh come on - anybody who finds blogs to be a serious search problem really isn't trying: Add "-blog" to your Google search and try again. . .

    Comment by Dominic (visitor) on Tue, Nov 22, 2005 @ 18:18 IST #
  11. Dominic, many people do get frustrated by their own limitations in searching as well as the limitations of the current search engine technology. Trying to find a cause is the most natural thing and people tend to point fingers at convenient targets. Blogs just happen to be convenient targets. While not blaming the people who get frustrated by blogs, I do think they over-generalize and exaggerate the issue and miss the perspective.

    Comment by hari (blog owner) on Tue, Nov 22, 2005 @ 19:38 IST #
  12. Hey, Hari! Nicely said. Maybe some developer at google should read this :)

    Comment by Sudipta Chatterjee (visitor) on Fri, Nov 25, 2005 @ 12:58 IST #
  13. Thanks, Sudipta! :)

    Comment by hari (blog owner) on Fri, Nov 25, 2005 @ 14:06 IST #
