Ah-ha! We noticed something was up. Google News was acting odd - it was adding loosely related stories to breaking news clusters. Normally Google is very good at keeping news clusters tightly associated.

Today we were lucky enough to publish details of a Google survey and include comments from Bill Slawski. It's a look at data privacy with the search engines.

We can see in the grab above how the bigmouthmedia press release gets lumped in with a whole bunch of others. The other 295 or so in that cluster are about Google working with US States and their databases. Google's normally better than that.

Then we see non-bigmouthmedia stories ranking for the search, as shown below.

What happened next? Google added a new search filter to Google News. Previously you could sort results by date or by relevance - now you can also sort by date with duplicates included.

Normally you get heavy duplication in Google News when an agency like AFP or Reuters publishes a story which local newspapers across the USA pick up and republish word for word. This filter option seems like a way to cope with that.


louisgray said…
Andrew, nice job. I posted on this topic around 7 Pacific Time, so it looks like you got me by three hours! Google Blog Search still shows you and I are the only ones who caught it so far. :-)
Google News Search Removes Duplicates

TechMeme doesn't have it, nor do the rest of the big guys. I bet they wake up to it shortly...

