Magento, Google XML Sitemaps and my Magento Speed Test

My Magento performance testing tool, Magento Speed Test, uses a Google sitemap.xml to determine which URLs should be tested. Having only just released the latest version, I have been keeping an eye on the testing over the last few days and have noticed a few tests that never got results. This sometimes happens because of a mistake by me, but more often it’s because people find creative ways to muck up their sitemap.xml, so the test runs on no URLs and thus produces no results. Here are some tips to check for problems.

After I sighed publicly about the various sitemap issues I was seeing, a few internet friends asked for some more information, so here is a guide to a good sitemap.xml for performance testing Magento. I’ll do my best to keep updating this as I find more issues.

Sitemap issues

Like any diligent programmer, I had a rather simple set of tests for my sitemap checking functionality. Without boring you with the details, it looked a little like this:

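@Test
public void testNoSitemap () {
	try {
		// A domain without a sitemap should be rejected outright
		SiegeUtils.checkSitemap(TEST_DOMAIN_WITH_NO_SITEMAP);
		fail("Expected an IllegalArgumentException for a domain with no sitemap");
	} catch (IllegalArgumentException ex) {
		// Expected: the missing sitemap was detected, so the test passes
	}
}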

You can imagine I had another one or two for a site that did have a sitemap, and I stupidly thought that was the end of it: no more ways it could fall over. There’s either a sitemap or not, right? Wrong, apparently.

So here’s a little checklist that you can run through to see if your sitemap is going to be test friendly:

1) Does it exist at all?

You need to actually generate one. Go to Magento’s admin menu: Catalog -> Google Sitemap. If you do not already see one in the list, click Add Sitemap, call it sitemap.xml and put in / as the path. (If you run a multi-store Magento from sub-directories, this is a little more complex, but I have a hunch you’ll know what I mean.) Once you have saved and generated a sitemap, it should be available at:

http://www.yourdomain.com/sitemap.xml

2) If not, do you handle 404s correctly?

This was a real tricky one. When a sitemap does not exist (see 1 above), you’re supposed to send a 404 response code, right? I think Magento does this by default, so I’m not sure how this person got it wrong, but their /sitemap.xml was just rendering the homepage as though it was the correct page.

My test assumed that if it got a 200 response code, there was a sitemap. I had to change it to check that the actual file contents look like a sitemap, in order to weed out these incorrect response codes.
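To give a rough idea of what a content check like that can look like, here is a minimal sketch. It is not the actual Magento Speed Test code: the class and method names are made up for illustration, and I am assuming that "looks like a sitemap" means the body parses as XML, has a urlset or sitemapindex root element, and lists at least one loc entry.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; not part of Magento Speed Test.
final class SitemapContentCheck {

    /**
     * Returns true only if the response body parses as XML, its root element
     * is urlset or sitemapindex, and it contains at least one loc element.
     * An HTML homepage served with a 200 status fails this check.
     */
    static boolean looksLikeSitemap(String body) {
        if (body == null || body.trim().isEmpty()) {
            return false;
        }
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Reject DOCTYPE declarations: a sitemap does not need one, and
            // this keeps untrusted responses from pulling in external entities.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

            DocumentBuilder builder = factory.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());

            Document doc = builder.parse(new InputSource(new StringReader(body.trim())));
            Element root = doc.getDocumentElement();
            String name = root.getLocalName();

            boolean sitemapRoot = "urlset".equals(name) || "sitemapindex".equals(name);
            boolean hasUrls = root.getElementsByTagNameNS("*", "loc").getLength() > 0;
            return sitemapRoot && hasUrls;
        } catch (Exception ex) {
            // Not well-formed XML (or the parser could not be configured):
            // treat it as "not a sitemap" rather than failing the whole run.
            return false;
        }
    }
}
```

Requiring at least one loc entry also catches the case where a sitemap exists but is empty, which would otherwise lead to a test that runs on no URLs.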

It’s really good practice not to mess with 404 handling in Magento. If a file doesn’t exist, actually send a 404 and render a humorous file not found page. It might go viral.

3) Are you redirecting cleanly?

This one is not so much a sitemap tip as a general tip for the internet. If you got the memo about keeping all your traffic on one domain (either you.com or www.you.com), that’s great. But there are ways to do that nicely, and ways to really not do it nicely. If you have a sitemap at /sitemap.xml but then redirect you.com/sitemap.xml to simply www.you.com (without preserving the path), things will break. Not just sitemaps, but lots of your URLs. So check that you’re keeping paths when you do the redirect.
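If you want to check this yourself, the comparison boils down to something like the sketch below. Again, this is illustrative rather than the tool's real code: the class and method names are invented, and it only compares paths, so it ignores query strings.

```java
import java.net.URI;

// Hypothetical helper for illustration only; not part of Magento Speed Test.
final class RedirectPathCheck {

    /**
     * Returns true if the redirect target keeps the path of the requested URL.
     *
     * @param requestedUrl   the URL that was requested, e.g. http://you.com/sitemap.xml
     * @param locationHeader the value of the Location header in the redirect response
     */
    static boolean preservesPath(String requestedUrl, String locationHeader) {
        URI requested = URI.create(requestedUrl);
        // A Location header may be relative, so resolve it against the request.
        URI target = requested.resolve(locationHeader);
        return normalise(requested.getPath()).equals(normalise(target.getPath()));
    }

    // Treat a missing or empty path as the site root.
    private static String normalise(String path) {
        return (path == null || path.isEmpty()) ? "/" : path;
    }
}
```

To get the Location value in the first place, one option is to request the URL with redirects switched off (for example with HttpURLConnection and setInstanceFollowRedirects(false)) and read the Location response header. A redirect from you.com/sitemap.xml to www.you.com/sitemap.xml passes this check, while a redirect to plain www.you.com does not.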

That’s about it for tips right now – I’ll add more as I come across them.

Note: There is something going wrong with certain sub-directory sitemaps – I shall fix that up in the next few days and report back here when I do (so you should subscribe!).

Submitting changes in your sitemap.xml to Google automatically

Not related to testing as such, but quite by chance I wrote a sitemap submission extension for Magento ages ago, well before thinking up a web-based Magento performance testing tool. It’s relevant, though, because it can be set to automatically submit your sitemap to Google, Bing, and others. It’s a little out of date, so stay tuned for a newer release, but if you’re not on a bleeding-edge Magento version, it should be fine.

I still suggest you submit your sitemap to Webmaster Tools or its equivalent for each search engine, so that you can get some insight and metrics for how your site is looking to the search engines. But for pinging Google when there are new products, it’ll do the trick.

That’s about it for now. My top tips: check your sitemap.xml, submit it, and look for crawl errors in Webmaster Tools. I’ll update this with any new issues I spot over the next few months. Let me know if I missed anything and I’ll add it here.

2 thoughts on “Magento, Google XML Sitemaps and my Magento Speed Test”

  1. Hi Schroder,

    This is an excellent article. I am facing an issue and would like to know if you have any suggestions.

    We are building a multi-store site (3 stores, each with one English store view). Say I have the URL of a product (http://www.abc.com/audi-q5) which belongs to the car-store. It works well if I am already in the car-store. But if I click through to another store, let’s say the toy-store, and then paste the URL http://www.abc.com/audi-q5, I get a ‘page not found’ error. If I switch back to the car-store and paste the URL again, it works fine. Google sitemaps are facing the same issue. For all the products in my default store (toy-store), it is able to find the page, but when it tries to evaluate products in other stores like the car-store, it gets the ‘page not found’ error.

    Can you please advise me how to resolve this issue?

    thanks,
    dustin.

  2. Hi, Dustin – It’s only tenuously related to sitemaps but… oh well. Sounds like a complicated Magento multi-store setup you have there. I’d start by checking the base URL for each store. It sounds like your browser is getting the store cookie set, but the stores all seem to share the same base URL, which means that some of the URLs will not exist in the different stores.
