Search Console parameter settings and their impact on indexation
Crawl optimisation (ensuring search engine spiders crawl all unique content across a domain) is one of the most complex aspects of technical SEO. The following article and research explore how Google Search Console parameter settings (one of several potential methods) are being used by webmasters and SEO experts to guide how bots crawl their domain’s URLs.
There are other methods for controlling crawl budget, including canonicals, XML sitemaps and robots directives, but in this article we are going to focus specifically on understanding how Search Console parameter settings impact search engine crawling.
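To illustrate one of those alternative methods, the sketch below shows how a minimal XML sitemap might be generated so that only the canonical, parameter-free URLs are listed for crawling. This is an illustrative example only; the URLs and function name are hypothetical, and real sitemaps often carry additional fields such as lastmod.

```python
import xml.etree.ElementTree as ET

def build_sitemap(urls):
    """Build a minimal XML sitemap: one <url><loc> entry per page."""
    urlset = ET.Element(
        "urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    )
    for page in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = page
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical example: list only the clean URLs you want crawled,
# leaving parameterised duplicates out of the sitemap entirely.
sitemap = build_sitemap([
    "https://www.example.com/",
    "https://www.example.com/category/shoes",
])
print(sitemap)
```

A sitemap built this way complements, rather than replaces, parameter settings: it tells crawlers which URLs you consider primary, while parameter handling tells Googlebot what to do with the variants it discovers on its own.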
The launch of parameter settings
Google introduced the latest version of parameter settings in Webmaster Tools in mid-2011 (Webmaster Tools was rebranded to Search Console on 20 May 2015). According to Google, parameter handling “helps you control which URLs on your site should be crawled by Googlebot, depending on the parameters that appear in these URLs.”
Kamila Primke, the author of Google’s article explaining the change, listed the following benefits of using parameter handling in Search Console:
It is a simple way to prevent crawling duplicate content
It reduces the domain’s bandwidth usage
It likely allows more unique content from the site to be indexed
It improves crawl coverage of content on site
While Primke’s piece highlighted the most important benefits and features, it did not go into detail about how different variations of parameters can be configured. To clear up some of the confusion, Maile Ohye later recorded a video providing guidance for common cases when configuring URL Parameters.
We have prepared the following visual guide based on Ohye’s video and our own experience.
Essentially, the user needs to decide whether or not the parameter is used for tracking, and then tell Google how the parameter changes the page’s content. Search Console then offers the following options for handling URLs with parameters:
Only crawl representative URLs – the only option available when parameters are used for tracking.
Crawl every URL – this is a common recommendation when parameters are used for pagination, when the new pages are genuinely useful, or when the parameters replace the original content with manually translated text.
Crawl no URLs – this is a common recommendation when parameters create pages with auto-generated translation, sub-par sorted and narrowed lists, or the URLs are simply not intended to be indexed, for instance in the case of parameters triggering pop-up windows.
Let Googlebot decide – this setting is a recommended solution when sorting and narrowing parameters are used inconsistently, for example when sales pages load with low-to-high price sorting by default while premium product categories are set up with high-to-low price order.
Only crawl URLs with value X – this is a recommended solution when sorting parameters are used consistently, such as low-to-high price listing implemented the same way across all category pages.
If the webmaster believes there might be architectural problems on the site, “Let Googlebot decide” is the preferred option.
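The distinction between tracking parameters and content-changing parameters above can be sketched in code. The snippet below shows how a “representative URL” might be derived by stripping tracking parameters, so that URLs serving identical content collapse to one form; the parameter names, domain and function name are illustrative assumptions, not part of any Google tooling.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical tracking parameters: they change the URL, not the content.
TRACKING_PARAMS = {"sessionid", "utm_source", "utm_medium", "utm_campaign"}

def representative_url(url):
    """Strip tracking parameters so duplicate URLs collapse to one form."""
    scheme, netloc, path, query, fragment = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(query) if k not in TRACKING_PARAMS]
    return urlunsplit((scheme, netloc, path, urlencode(kept), fragment))

# Two URLs serving identical content reduce to the same representative URL.
a = representative_url("https://www.example.com/shoes?colour=red&sessionid=123")
b = representative_url("https://www.example.com/shoes?utm_source=mail&colour=red")
print(a)       # https://www.example.com/shoes?colour=red
print(a == b)  # True
```

This is, in effect, what the “Only crawl representative URLs” setting asks Googlebot to do: treat every tracked variant as the same page, while parameters such as `colour` that genuinely change content are preserved.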
Following an entire year of running tests on a large-scale, international ecommerce website, we have made several key observations about parameter settings and their impact on search engine indexation.
To read these findings, please complete the form below.