The Cookiebot scanner crawls your website to identify all cookies and similar tracking technologies. To do so, it follows every link that a user could use to navigate your website. This ensures that all tracking on your website is detected, since not all cookies are set on every page. Furthermore, tracking may occur on parts of your website that you are not aware of. By scanning every page on your website, we ensure that your visitors can give informed consent.
Unexpected scan results
The Cookiebot scanner cannot tell that a URL points to a page it has already scanned if the URL itself is unique. We refer to multiple unique URLs that point to what is essentially the same page as "canonical URLs". This is the case, for example, when information is passed to the page via the URL but has no influence on which cookies are set on the page.
If the number of scanned pages differs vastly from what you expected, and you suspect that the scanner is repeatedly scanning the same page because of canonical URLs, there are two ways to address this:
1. Avoid canonical URLs on your website
You can avoid canonical URLs on your website by avoiding certain add-ons or by choosing alternatives.
For example, some calendar add-ons generate a virtually endless series of URLs through a "next day" link.
WordPress
If your website is built with WordPress, your uploaded assets are a good place to start. Each of these typically gets its own "attachment page" with a unique URL, which is rarely needed.
You may consider using a plugin such as Yoast to change this WordPress behavior.
See more about attachment pages here: How to disable image attachment pages in WordPress
2. Set up a filter for the scanner
Global filters
If you run a webshop, we recommend using standard parameters for filtering, sorting, view, pagination, and the like. Use parameters such as the following:
limit, dir, sort, sort_by, order, order_by, filter, filter_by, mode, page
These standard parameters should be caught by our global filters for most well-known e-commerce platforms, such as Shopify and WooCommerce.
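To get a rough idea of how many of your URLs differ only by these standard parameters, you could normalize a list of URLs yourself. The following is a minimal sketch and not how the Cookiebot scanner works internally: the function names and the example shop URL are assumptions made for illustration, and only the parameter names come from the list above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# The standard parameter names recommended in this article.
STANDARD_PARAMS = {
    "limit", "dir", "sort", "sort_by", "order", "order_by",
    "filter", "filter_by", "mode", "page",
}


def strip_standard_params(url: str) -> str:
    """Return the URL with the standard listing parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in STANDARD_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def count_distinct_pages(urls) -> int:
    """Count the URLs that remain distinct once standard parameters are dropped."""
    return len({strip_standard_params(url) for url in urls})
```

For instance, `strip_standard_params("https://shop.example/shoes?sort=price&page=2&color=red")` keeps only the `color` parameter, because `sort` and `page` are in the standard list.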
Domain(Group) specific filtering
We do not set up filters for the purpose of making a domain eligible for a free subscription.
If your webshop creates dynamic URLs that are not caught by any of our global filters, we can also create a filter specific to your website and the URL patterns it creates. To set up this filter, the duplicate URLs must be distinguishable by a clear pattern.
An example of this would be a website about apples, with a product page where you can sort by different metrics. The page essentially serves the same content (albeit in a different order), but the URLs are unique, making the scanner think they are unique pages.
In this example, a filter is set up to exclude canonical URLs that end with "?sortBy=" followed by "name", "color", or "price". The scanner scans the first URL that matches the filter and ignores any further URLs that match it.
For example:
- ✓ apples.com/product
- ✓ apples.com/product?sortBy=name
- ✗ apples.com/product?sortBy=color
- ✗ apples.com/product?sortBy=price
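The behavior in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the regular expression and the `apply_filter` function are hypothetical, the actual filter syntax used by Cookiebot is not described here, and the sketch assumes that "first match" applies across the whole list of URLs.

```python
import re

# Hypothetical pattern for the apples.com example: URLs ending in
# "?sortBy=" followed by "name", "color", or "price".
SORT_FILTER = re.compile(r"\?sortBy=(name|color|price)$")


def apply_filter(urls, pattern=SORT_FILTER):
    """Split URLs into scanned and ignored lists.

    URLs that do not match the pattern are always scanned. Of the URLs
    that do match, only the first one encountered is scanned.
    """
    scanned, ignored = [], []
    seen_match = False
    for url in urls:
        if pattern.search(url):
            if seen_match:
                ignored.append(url)
                continue
            seen_match = True
        scanned.append(url)
    return scanned, ignored
```

Passing the four URLs above in the order shown puts the first two in `scanned` and the last two in `ignored`.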
If you would like to check for yourself which URLs have been scanned, you can download the URL list via the "Cookies" tab in the Cookiebot Manager.
It is vitally important that you compare the scan reports from before and after a filter was applied.
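One way to make that comparison is to diff the two URL lists. The sketch below assumes you have already loaded the URLs from each scan into Python lists. The format of the downloaded URL list is not described here, so loading is left out, and `compare_scans` is a hypothetical helper name.

```python
def compare_scans(before_urls, after_urls):
    """Compare the URLs scanned before and after a filter was applied."""
    before, after = set(before_urls), set(after_urls)
    return {
        # URLs that were scanned before the filter but no longer are.
        "dropped": sorted(before - after),
        # URLs that only appear in the scan made after the filter.
        "added": sorted(after - before),
        # URLs that were scanned both times.
        "kept": sorted(before & after),
    }
```

Reviewing the "dropped" entries shows whether the filter removed only duplicate URLs, or whether it also excluded pages that you still want scanned.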