I'm being a grumpy bear about the WebmasterWorld anti-robots/anti-search situation.
WebmasterWorld is a large and popular forum where webmasters discuss site issues; its SEO and search engine threads are among its most important sections.
The forum has recently banned search engines from indexing it and, as a result, it has been dumped from Google and the others.
Why would a forum which needs its web traffic do this? We're told that they had to. I don't like that one bit - it sends an anti-search and anti-SEO message.
WebmasterWorld is highly respected. I would say many web-savvy clients check out WebmasterWorld (and, of course, many small SEO firms camp out there). I personally believe the quality of the forum has nosedived; it's now full of newbies who sound like a broken record, asking the same questions again and again, and peppered with veterans who no longer care to respond to the broken record.
That isn't the reason for the search engine ban, though. We're told that WebmasterWorld is plagued by "bad bots": automated user agents (like search engine spiders) which crawl the site and take its content. There is so much of this activity that the forum's web servers struggle to keep up.
WebmasterWorld's response to these bad bots is to block all bots via robots.txt, the Robots Exclusion Protocol.
This is foolish. Bad bots do not obey this protocol, so the change will only affect good bots.
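For anyone unfamiliar with the protocol, a blanket block of the kind WebmasterWorld has applied looks something like this (a generic example, not necessarily their exact file):

```
User-agent: *
Disallow: /
```

The catch is that robots.txt is purely advisory: well-behaved crawlers like Googlebot request the file and honour it, while a scraper simply never reads it - which is exactly why this move hurts only the good bots.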
At WebmasterWorld they've made other attempts to block the bad bots - they've changed the site so that you must now log in before you can view any real content. That is a draconian tactic to take if everything else fails. The robots.txt block is just stupid.
Normally, I couldn't care less. WebmasterWorld can stab itself in the foot if it wishes... except the site has an SEO profile that puts it in the limelight somewhat. I believe the site's actions send out the wrong message.
It is possible to deal with bad bots without blocking your site from search engines. It is. Furthermore, blocking your site from search engines does not stop bad bots (though it does make the site harder to find!)
We're told that WebmasterWorld has tried a whole host of complicated and thorough defenses and that they all failed. I don't get it. If the forum had those resources to hand, then it certainly has the resources to redesign the site to be less of a bad-bot target. For example, at times of heavy load the site could ask for user input: a CAPTCHA or a question. I came up with that crazy theory in about 2 seconds of thought.
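To show the idea is not outlandish, here is a minimal sketch of that "challenge under load" approach. Everything in it is invented for illustration - the function name, the thresholds, the in-memory counter - it's not WebmasterWorld's code, just the 2-second theory made concrete:

```python
import time
from collections import defaultdict

# Assumed numbers for illustration only.
REQUESTS_PER_MINUTE = 60   # hypothetical per-IP request budget
LOAD_THRESHOLD = 0.8       # hypothetical server load ratio above which we challenge

_hits = defaultdict(list)  # ip -> timestamps of recent requests

def should_challenge(ip, server_load, now=None):
    """Return True when this request should get a CAPTCHA instead of content."""
    now = now if now is not None else time.time()
    # Keep only requests from the last 60 seconds, then record this one.
    window = [t for t in _hits[ip] if now - t < 60]
    window.append(now)
    _hits[ip] = window
    # Challenge only when the server is actually struggling AND this client
    # is hammering it - ordinary readers sail through untouched.
    return server_load > LOAD_THRESHOLD and len(window) > REQUESTS_PER_MINUTE
```

The point of gating on both conditions is that search engine spiders crawl politely and rarely trip a rate limit, while a scraper pulling thousands of pages during peak load gets nothing but challenge pages - no robots.txt ban required.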
Newspapers are scraped by bad bots on a huge scale, and they deal with the very issues WebmasterWorld has caved in on.
WebmasterWorld has cited impressive figures to support its scraping problem - but I've never found a single scraped entry from WebmasterWorld. Do thousands of people really download the site for their own personal desktop edition? An edition which would require a desktop search (rather than Google) to search, and which would need to be kept up to date? Meh.
We're told this is a test. I predict we'll see that WebmasterWorld no longer has the pull to attract the SEO experts (especially after this), so I don't think it'll become an exclusive oasis. I suspect we'll see it back in the search engines once it retracts the robots.txt change. I imagine there will be complaints that the robots.txt rules should be changed (all a red herring, since bad bots ignore it).