Friday, December 28, 2007

Google blocking Yahoo Pipes - again

Did you know that Google has a bot which ignores robots.txt and does so defiantly? It's true. Google's RSS grabber, Feedfetcher, ignores robots.txt because, Google reasons, a human decided to publish the feed and a human decided to request it. It's all explained over at the webmaster help center. I actually think Google has made the right call here, although it means you can't slam the brakes on an RSS feed by slapping up a robots.txt block. I'm beginning with this just to set the precedent.
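To make the robots.txt point concrete, here is a minimal Python sketch of the difference between a crawler that consults robots.txt and a fetcher acting on a person's request. It is an illustration only, not Google's code: the robots.txt contents, the example.com URLs, the "ExampleCrawler" name and the may_fetch helper are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that tries to keep every bot out of /feed/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /feed/
"""


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots.txt permits this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def may_fetch(robots_txt: str, user_agent: str, url: str,
              user_requested: bool) -> bool:
    """Hypothetical policy: a fetch made on a person's request skips the
    robots.txt check; an ordinary crawl honours it."""
    if user_requested:
        return True
    return allowed_by_robots(robots_txt, user_agent, url)


if __name__ == "__main__":
    feed_url = "http://example.com/feed/rss.xml"
    # An ordinary crawler is turned away by the Disallow rule...
    print(may_fetch(ROBOTS_TXT, "ExampleCrawler", feed_url, user_requested=False))
    # ...but a subscriber-driven fetch never looks at the rule.
    print(may_fetch(ROBOTS_TXT, "ExampleCrawler", feed_url, user_requested=True))
```

The point of the sketch is that robots.txt is advisory: it only stops clients that choose to check it, which is why it can't be relied on to switch a feed off.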

I like to think one of my real scoops this year was when I noticed that Google seemed to be blocking Yahoo Pipes. Only yesterday I noted that I was disappointed that Sphinn didn't like the story but pleased that Wired writer Betsy Schiffman had.

I do believe that this blockage was temporary and accidental. Google have said complimentary things about Yahoo Pipes before and you can use Yahoo Pipes to take data from Google Base. In fact, Yahoo Pipes and Google Base have been a featured project on Google Code.

In a quirk of timing, bigmouthmedia colleague and Wonga World blogger Chris Cathcart pointed out that Google's Feedburner is also blocking Yahoo Pipes.


This time the blockage is certainly not an accident but a human-controlled decision. Why would Feedburner users want to keep their RSS out of Yahoo Pipes? One possible answer is that although the publisher is happy to distribute content (or teasers) in a feed, they don't want that content to be sliced, diced and mixed up with other content. One of the ways I use Yahoo Pipes is to monitor dozens of feeds but only alert me when a story is gaining a critical mass. This means I don't need to manually review all those feeds, nor even look at any adverts inside them.
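For anyone curious what that sort of "critical mass" filter looks like outside Pipes, here is a rough Python sketch. It is not my actual pipe: the feed names, the headlines, the stopword list and the three-feed threshold are all assumptions made for illustration, and matching on shared headline words is just one simple way to guess that several feeds are covering the same story.

```python
import re
from collections import defaultdict

# Small, hypothetical stopword list; a real filter would need a longer one.
STOPWORDS = {"the", "and", "for", "why", "with", "from", "again"}


def trending_terms(items, min_feeds=3):
    """Return headline words that appear in at least `min_feeds` different feeds.

    `items` is an iterable of (feed_name, headline) pairs.
    """
    feeds_by_term = defaultdict(set)
    for feed_name, headline in items:
        # Count each word once per headline, ignoring case and punctuation.
        for term in set(re.findall(r"[a-z0-9]+", headline.lower())):
            if len(term) > 2 and term not in STOPWORDS:
                feeds_by_term[term].add(feed_name)
    return sorted(
        term for term, feeds in feeds_by_term.items() if len(feeds) >= min_feeds
    )


if __name__ == "__main__":
    # Made-up sample data standing in for dozens of monitored feeds.
    sample = [
        ("Feed A", "Google blocks Yahoo Pipes"),
        ("Feed B", "Yahoo Pipes blocked by Feedburner"),
        ("Feed C", "Why Pipes matters"),
        ("Feed A", "Pipes again"),
    ]
    print(trending_terms(sample, min_feeds=3))
```

Counting distinct feeds rather than raw mentions stops one chatty feed from triggering an alert by itself. It also shows why a publisher might be wary: the reader only ever sees the distilled alert, never the feed or its adverts.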

Here's the plug for Wonga World! Chris is our Senior Strategist in the Finance vertical. He has years of experience working in banks and in digital marketing. In fact, he spoke at SMX London this year. Wonga World is written with that savvy financial sector bias, which is why he gave me this 'search only' lead. What a nice man.

5 comments:

Matt Cutts said...

Just stumbled across this report. If I understand it correctly, it looks like Feedburner added an option to block Yahoo Pipes. But it doesn't sound like Google is blocking Yahoo Pipes from (say) blog or news search results, right?

Andrew Girdwood said...

Oh hello, fancy seeing you here. :)

You're exactly right. It's a different sort of block, which is why I posted about Feedfetcher as an example of a human decision being able to override standard crawling assumptions.

The first 'block' was, I'm sure, an accident.

This 'block' is a human opt-in. I imagine it's been applied to Yahoo Pipes over and above other aggregators because with Yahoo Pipes you can slice'n'dice the data/feed content far more thoroughly.

Christopher Kata said...

I still don't see the need to block Yahoo Pipes! I've been using it much as you have described, and my usage of someone's feed should be reflected in their stats accordingly!

Matt Ellsworth said...

Thanks for noting this - I use yahoo pipes all the time and I have even had to move a pipe into feedburner to get it to work correctly.

So I'm kind of surprised that this one is there. I'll be reading more on this later.

Michael Blowers said...

I know I am a bit late on this one but it seems that the Google Docs Spreadsheet RSS importfeed function won't take a Yahoo Pipes feed either...