Friday, November 25, 2005

Custom error pages, 404 header response and Google sitemap XML

A custom 404 page is important. You don't want to lose visitors from your site, and a user-friendly error page, one which quickly re-empowers the user, is key here.

It's also key to ensure that your "page not found" message actually returns the 404 header response. At the end of 2004 in Vegas, Yahoo banged on and on about this in a presentation to SEOs and webmasters.

The 404 response is also key if you wish to make use of the reporting capabilities of Google's sitemap XML project.

The issue is that if you use Apache with an ErrorDocument directive that points at a full URL, you'll wind up with a 302 response. That might actually be correct, as the server is redirecting the user agent to the custom 404 page.

An .htaccess file might look like:
ErrorDocument 404
The killer catch is that, set up this way, neither tweaking Apache nor using PHP will make the request for the non-existent page itself return the 404 header.

The good news is that there is a compromise which Google accepts - and this compromise is good enough to get your sitemap XML verification file accepted.

With PHP you can have your custom error page issue a 404 header before it outputs any HTML. That's fairly easy:
header("HTTP/1.0 404 Page not found");
at the start of the page in a chunk of PHP.

If you examine the headers of the custom error page itself (404e.php in my example), you'll see that it does correctly return the 404 header.
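To make that concrete, here is a rough sketch of what a page like 404e.php could look like. It is not my actual page: the not_found_body() helper, the wording of the message and the "/" home link are all made up for illustration. The one part that matters is that the header() call runs before any HTML goes out.

```php
<?php
// Hypothetical 404e.php: a sketch only, not the real page from this post.

// Build the HTML body. $homeUrl is an assumed link back into the site.
function not_found_body($homeUrl)
{
    // Escape the URL so it is safe inside the href attribute.
    $safeUrl = htmlspecialchars($homeUrl, ENT_QUOTES);
    return "<html><head><title>Page not found</title></head><body>"
         . "<h1>Sorry, that page could not be found</h1>"
         . "<p><a href=\"" . $safeUrl . "\">Return to the home page</a></p>"
         . "</body></html>";
}

// The status line has to be sent before any output, so do it first.
if (!headers_sent()) {
    header("HTTP/1.0 404 Not Found", true, 404);
}

echo not_found_body("/");
```

Keeping the markup in a small function is just a convenience here; a plain block of HTML after the PHP chunk would do the same job.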

Google is following the 302 to the custom error page and then checking the header of that error page. This is really what you'd expect Google to do, as it's the only way to deal with two or more redirects in a chain.
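You can picture that "follow the chain, then look at the last response" idea with a small PHP sketch. This is only my guess at the general approach, not anything Google has published. The status_chain() function is a made-up helper, and the sample headers are canned, with www.example.com standing in as a placeholder host.

```php
<?php
// Pull the status code of every hop out of a flat list of header lines.
// status_chain() is a hypothetical helper, not part of PHP.
function status_chain(array $headerLines)
{
    $codes = array();
    foreach ($headerLines as $line) {
        // Status lines look like "HTTP/1.1 302 Found".
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $codes[] = (int) $m[1];
        }
    }
    return $codes;
}

// Canned headers (assumed for illustration): the 302 to the error page,
// followed by the error page's own 404.
$sample = array(
    "HTTP/1.1 302 Found",
    "Location: http://www.example.com/404e.php", // placeholder host
    "HTTP/1.0 404 Not Found",
    "Content-Type: text/html",
);

$chain = status_chain($sample);
echo implode(" -> ", $chain), "\n"; // prints "302 -> 404"
```

The last entry in the chain is the one that counts. Against a live site the header lines could come from PHP's get_headers(), which follows redirects and returns the headers of each response in order; it returns false on failure, so check for that before passing the result in.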

The good news is that wikis and sites optimised to have SEO friendly URL structures can take part in the rather handy sitemap XML.

Now if Google would just set up the vanity URL for the lazybones among us - I'd be happy.