I wrote an article on Microsoft’s IE7 earlier today but it didn’t show up in a couple of places I’d have expected to have seen it.
Curious, I did a bit of looking around. I searched several search engines and was surprised to find that in Google’s BlogSearch, not alone does my article not appear, but a copy of my article appears linked to Irishblogs.ie.
Irishblogs.ie is an Irish blog aggregator. Why is their copy of my article appearing in the Google search results (search position 8 in the results above) while my site is not found at all?
Should I stop irishblogs.ie from taking my content?
At first I thought it might be the same problem Dermod was having with IrishBlogs.ie, but it’s not. If you notice, the results are returned in reverse chronological order, so it makes sense that the IrishBlogs.ie listing would be before yours. If anything, the fact that both your post and the IrishBlogs.ie post appear in there gives you the boost.
Hey Tom
As you are the SEO head then I have to assume that portion of your question (Why is their copy of my article..) is rhetorical!
Otherwise it is a little strange I guess - but I assume you can pull from irishblogs.ie if you wish? And I also assume it is not either/or - you can have both the aggregator and the source listed at the same time? So many questions, so late in the evening!
Finally I guess I would be more concerned if I were you about my content being used for profit which is not currently the case with irishblogs.
keith
Also, did you try sorting by relevance rather than by date? It’s not obvious from the screenshot as you clipped the right hand side.
I hate to be pissy but what’s with the whole Keith vibe? Can you delete the other one Tom - just to save me getting confused when I see comments from Keith coming in that I know I did not write!
keith (I am the One)
Keith(s),
the major issue I have is that my site was not listed anywhere in the Google search results. Only the Irishblogs copy of my post with a link back to Irishblogs. No link to my site.
Which leads to the question Tom - is it either/or?
Does its inclusion reduce the chances of a listing for the original post?
Or even if the Irishblogs.ie listing was not there would your original not be there either?
Cause without that kind of info I find it difficult to judge the context of your hurt
keith
How do you think I feel, Keith? I think we should settle this Highlander style.
Nowhere? That’s definitely odd. Have you checked with other articles you’ve written recently where there wasn’t such a big buzz?
The problem could also be down to FeedBurner. When it munges your feed, it doesn’t use the original URLs or a permanent redirect. Instead it uses a temporary redirect. This could be affecting your listing, but you’ll have to talk to somebody who works at Google on blog search about that.
One other thing: Regardless of this issue, I think it would be worth it for Roger to have the IrishBlogs.ie, &c. feeds changed so that they use the URLs of the original posts rather than pointing them to their placeholder.
Hi Tom,
Google’s indexing bots traverse sites more or less frequently depending on the frequency of content updates. Hence fast moving sites like Irishblogs.ie get scanned for new content more frequently. I’d wager that your own site will be listed within a few days.
KeithB - the inclusion shouldn’t reduce the chances of my post appearing if Irishblogs were properly configured.
If my post isn’t included but my content is and it is linked to Irishblogs, then who is benefiting from my writing? Certainly not me. Irishblogs are getting content, PR and links from my work.
KeithG - there wasn’t a big buzz around this article. I just happened to check. I think irishblogs is destroying my Google ranking because I’m seen to be writing work which is also on a site of a much higher Google PR, i.e. algorithmically, Google may think I’m copying irishblogs.
I considered that possibility myself before I mentioned that. What I did was take the URL of Richard Waghorne’s screed on how useless he considers Irish and check for links back to it. Many of the hits I got for my (lengthy) rebuttal came from IrishBlogs.ie initially and later from other sources–my blog software is set up to automatically use my blog pinger to ping it and others. Sure enough, it comes up second from the bottom.
Now, the one big differentiator between your blog and mine is–besides the fact I use software I wrote myself and you use WordPress–that you use FeedBurner, hence my suspicion that the problem lies with how Google’s blog search and FeedBurner interact.
As I mentioned, FeedBurner uses a temporary redirect rather than a permanent one. That’s like saying “this URL’s the canonical one, but go here for now”. But there’s no content on that page.
Now, it could be argued that Google should follow the redirect anyway and use that, but that could break genuinely correct uses of permanent and temporary redirects.
The IrishBlogs.ie feeds have some genuine problems (Roger should get his programmers to read what Sam Ruby’s written about his work on Planet Intertwingly), but my suspicion is that it’s not the true source of your woes.
But to be certain, you could ask Roger to stop fetching your feed for now and see how things go.
And how could IrishBlogs.ie be destroying your page rank? The content on IrishBlogs.ie is quite transient. If it held onto it like it did before Dermod got them to stop, I could understand, but that’s not happening. He also, as I said, needs to get it using the original post URLs rather than IrishBlogs.ie’s internal ones. That’d make it more aggregatorish and less weblogish.
There’s also the issue of HTTP headers and feed metadata. In both cases, your entry would appear older than the IrishBlogs one, and it’s likely that Google would use this heuristic to judge which one to use.
One other thing: Donncha Ó Caoimh’s post on the same subject points to IrishBlogs.ie, but seeing as he recently moved to a new WordPress site and he’s still redirecting entries over from the old site, this is to be expected. That entry was first written on the old site.
Thanks for the clarification Tom. And to the (other) Keith for his follow ups as well. This is more complex (for me anyway) than I first thought. I have always been happy to be aggregated - however I can see how it might not suit you Tom.
keith
BadMan - my site used to be listed in Google’s BlogSearch within minutes of my posting. I think you may be confusing Google’s search with Google’s BlogSearch. They are quite different.
Keith - I’m not sure that FeedBurner is an issue because my use of FeedBurner is transparent. My feed is at tomrafteryit.net/feed not at feedburner.com/whatever.
Irishblogs could easily be destroying my PageRank if Google thinks I am plagiarising IrishBlogs content.
You are correct that Roger needs to get it to use the original post urls though.
I’m not sure that FeedBurner is an issue because my use of FeedBurner is transparent. My feed is at tomrafteryit.net/feed not at feedburner.com/whatever.
Oh, it’s not that link I’m talking about (though it’s a good idea to do that if you do ever move away from FeedBurner), it’s the ones FeedBurner generates in your feed for your entries so it can track who’s clicking through and reading them from your feed. For instance, the FeedBurner URL of this post is http://feeds.feedburner.com/~r/tomrafteryit/TiBm/~3/6102708/ and it does a temporary redirect towards this page. It’s not entirely unbelievable that this could be what’s stopping it from appearing in the blog search and why IrishBlogs.ie’s entry is the one coming up.
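To check which kind of redirect a tracking URL like that actually issues, something along these lines might do. It is only a rough sketch: the function names are made up for illustration, and it assumes the server answers a plain HEAD request. Python's `http.client` does not follow redirects by itself, so the status code you get back is the one the first server sent.

```python
import http.client
from urllib.parse import urlsplit

# Status codes that say "the resource has moved for good".
PERMANENT = {301, 308}
# Status codes that say "go over there for now, but keep using this URL".
TEMPORARY = {302, 303, 307}


def classify_redirect(status):
    """Label an HTTP status code as a permanent or temporary redirect."""
    if status in PERMANENT:
        return "permanent"
    if status in TEMPORARY:
        return "temporary"
    return "not a redirect"


def check_redirect(url, timeout=10.0):
    """Send one HEAD request and report (status, kind, Location header).

    Hypothetical helper: the redirect is deliberately not followed.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        response = conn.getresponse()
        return (
            response.status,
            classify_redirect(response.status),
            response.getheader("Location"),
        )
    finally:
        conn.close()
```

Calling `check_redirect()` on one of the per-entry URLs in the feed would show whether it comes back as temporary or permanent, and where it points.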
He also needs to get the text area on the contact form made a lot bigger.
I see what you are saying Keith but I don’t understand why this would suddenly be the case. I have been using FeedBurner for my feeds for ages with no obvious problems and my posts have appeared in Google very quickly until recently.
I wonder what has changed.
You are correct to say though that Irishblogs should “use the URLs of the original posts rather than pointing them to their placeholder”. Until they do that I think I will ask Roger to remove me from their list of feeds.
Finally, although, as you say “the content on IrishBlogs.ie is quite transient” (i.e. not cached) - it is still being cached on corkblogs - this could be causing problems too.
Hmm
I’m quite new to the blogging end of things and the dupe content theory is quite interesting and something I will have to watch from an SEO perspective.
One thing that does come to mind though - recently there have been many anomalous changes in Google’s SERPs. Google seems to be making changes that are affecting the SERPs (perhaps in an effort to clean up their index).
You may have been caught in some overnight changes? I know that many webmasters were reporting problems in and around July 27.
I’m not familiar with how Google BlogSearch works. I presume it is simply a separate index derived from Google’s normal crawl?
Maybe wait a few days to see if it settles down?
I am looking into this.
We have always linked directly from the posts on the front page to the blog in question (apart from a couple of days about a year ago when we were transitioning from an old system to a new one). We used to have a “cached” page but we changed that page to remove the content and removed all links to it (at least that’s what we thought, but we mustn’t have done so if Google is able to find it). We left the title with a link back to the blogger’s site so that people following old links would be able to find the original content, which we believed would be in the best interest of the bloggers.
Found it! It’s the link in our RSS feed that Google is picking up. Will be changed on Monday.
I have been using FeedBurner for my feeds for ages with no obvious problems and my posts have appeared in Google very quickly until recently.
Then it sounds like Google might be tightening up how they deal with temporary redirects. If the problem persists after Roger has it fixed on Monday, I’d complain to FeedBurner and get them to turn their temporary redirects into permanent ones.
One thing I’m a bit puzzled by is how you think this might be affecting your PR. Blog Search is completely separate from Web Search, the former using feeds alone, the latter using Google’s spider. I doubt they interact.
If we remove the link in the feed (which may have been a legacy FeedWordpress issue) there should be no way that it can appear on Google’s Blogsearch as we do not link to that old cache pageholder (which doesn’t display the blog post anyway) from anywhere - at least as far as I’m aware.
The issue seemed to arise with non Feedburner users as well so I’m not sure that this particular problem was caused by them (although at first I did suspect something along the lines of the old 302 redirect issue caused by FeedBurner). They either didn’t get Tom’s feed or ditched it based on a similarity measure. I guess the latter. Sorting out the link in the feed should resolve it.
We have changed the link in the RSS feed directly to the blog. This was indeed legacy code from FeedWordpress (which we have practically totally re-written). So the above problem should not occur again.
Regards,
Roger
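For anyone running an aggregator with the same problem, the change described above (making each feed item link to the original post rather than to a placeholder page) could look roughly like this. It is a sketch only, not the IrishBlogs.ie code: the function name, the lookup table of placeholder-to-original URLs, and the example.org style addresses are all assumptions for illustration, and it handles plain RSS 2.0 `<item>`/`<link>` elements only.

```python
import xml.etree.ElementTree as ET


def point_items_at_source(rss_xml, source_urls):
    """Rewrite each RSS <item><link> to the original post URL.

    rss_xml     -- the aggregator's feed as a string (RSS 2.0 assumed)
    source_urls -- hypothetical mapping of placeholder URL -> original URL
    Items with no known original URL are left untouched.
    """
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        link = item.find("link")
        if link is None or link.text is None:
            continue
        original = source_urls.get(link.text.strip())
        if original:
            link.text = original
    return ET.tostring(root, encoding="unicode")


# Made-up feed with one item pointing at an aggregator placeholder page.
sample_feed = (
    '<rss version="2.0"><channel><title>Aggregator</title>'
    "<item><title>Post</title>"
    "<link>http://aggregator.example/cache/1</link></item>"
    "</channel></rss>"
)
fixed_feed = point_items_at_source(
    sample_feed,
    {"http://aggregator.example/cache/1": "http://blog.example/post"},
)
```

With the link pointing straight at the blog, anything that indexes the aggregator's feed is led to the original post rather than to the aggregator's own copy.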