The Cookiebot scanner scans all the HTML content that a website user can access. This includes both static and dynamic pages, blog posts, images, embedded videos etc.
In addition, the scanner also scans any content on your website that you yourself link to. For example, if your sitemap links to a 404 (not available) page, then the scanner will scan that page as well. For each scan, a URL list is provided.
How can I see the URL list of the pages the scanner has found?
- If you have received a price quote from Cookiebot: attached to the email is a URL list of up to 5,000 of the subpages found by our scanner.
- If you have received an email saying your account no longer meets the criteria for a free subscription: attached to the email is a URL list of up to 200 of the subpages found by our scanner.
- If you have signed up for a Cookiebot subscription: you can find details and a URL list of up to 10,000 of the subpages that our scanner has found on the ‘Cookies’ page. Log in to your account (Cookiebot CMP manager) and choose the top menu item ‘Cookies’. When you click on the number of pages identified by the scanner, the URL list of subpages will download.
Does Cookiebot include pages (and cookies) behind a login?
Our scanner simulates a number of website users who visit your website and perform all the actions that can be performed (clicking on all links, scrolling, accessing all pages, playing any embedded videos, etc.). The only things the scanner does not do are fill in forms (such as signing up for your newsletter) and pay for the items in a shopping cart (the scanner will fill the shopping cart with items but will not go through with the payment). If an area of your website is behind a login or a cookie wall, we can configure the scanner to scan that area as well. Please send us the credentials needed to set up the scanner (username and password) via the support form.
Why does the Cookiebot scanner include old test pages that are no longer displayed?
The Cookiebot scanner only finds pages that are publicly available (see above) and can be accessed by a website user. You may have test pages and old content that is not actively displayed on your website but has not been unpublished or removed. You may also still be linking to such content.
If you have a Cookiebot account, log in and go to the menu item ‘Cookies’. Click on the hyperlink with the number of subpages on your domain. A URL list will download. The second column of this list, ‘FirstParentURL’, shows where a link to each page was first found.
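To find out which of your pages still link to old content, it can help to group the downloaded URL list by its ‘FirstParentURL’ column. The snippet below is a minimal sketch of that idea, not an official Cookiebot tool. It assumes the list is a comma-separated file with a header row, with the page URL in the first column and ‘FirstParentURL’ in the second; the function name, the ‘URL’ header and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def pages_by_first_parent(csv_text, delimiter=","):
    """Group page URLs by the page where a link to them was first found.

    Assumes (not confirmed by Cookiebot): a header row, the page URL in
    column 1 and 'FirstParentURL' in column 2.
    """
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    next(reader, None)  # skip the assumed header row
    groups = defaultdict(list)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        page_url, first_parent = row[0].strip(), row[1].strip()
        groups[first_parent].append(page_url)
    return dict(groups)


# Hypothetical sample data standing in for a downloaded URL list.
sample = """URL,FirstParentURL
https://example.com/old-test,https://example.com/sitemap.xml
https://example.com/about,https://example.com/
https://example.com/old-test-2,https://example.com/sitemap.xml
"""

for parent, pages in pages_by_first_parent(sample).items():
    print(parent, "->", pages)
```

In this sample, both test pages are traced back to the sitemap, which would be the place to remove the links. If your downloaded file uses a different separator, pass it as the `delimiter` argument.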
Does the Cookiebot scanner scan 404 pages on my website?
The Cookiebot scanner is set up not to include 404 pages in the scan. If you link to those pages, however, the links override this general rule, and the scanner will include those 404 pages in its scan.
The primary reason for this is that some websites contain 404 pages with content where cookies and trackers can also be in use (primarily due to the technical setup not being optimal).
How come a ‘site:’ search in Google shows a different number of subpages?
There are multiple reasons for this. One is that Google also indexes PDFs, Word and Excel documents and other attachments. Cookiebot only scans HTML pages because those are the pages that can set cookies and online trackers.
Google does not scan all pages all the time, so its index is not necessarily up to date, whereas the Cookiebot scan provides a ‘here and now’ picture of your website.
Also, Google may handle the indexing of dynamic pages in a different way.
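If you have a list of URLs from another source and want a rough idea of how many of them are documents rather than HTML pages, you can separate them by file extension. This is only an illustrative sketch: the function name, the extension list and the sample URLs are assumptions, and it does not reproduce how either Google or the Cookiebot scanner classifies content.

```python
import posixpath
from urllib.parse import urlparse

# Assumed set of document types; extend it to match your own site.
DOCUMENT_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".xlsx"}


def split_pages_and_documents(urls):
    """Split URLs into (pages, documents) based on the path's file extension."""
    pages, documents = [], []
    for url in urls:
        # Look only at the path so query strings do not hide the extension.
        extension = posixpath.splitext(urlparse(url).path)[1].lower()
        if extension in DOCUMENT_EXTENSIONS:
            documents.append(url)
        else:
            pages.append(url)
    return pages, documents


# Hypothetical URLs for illustration.
urls = [
    "https://example.com/",
    "https://example.com/pricing",
    "https://example.com/files/brochure.PDF",
    "https://example.com/report.xlsx?download=1",
]
pages, documents = split_pages_and_documents(urls)
print(len(pages), "pages,", len(documents), "documents")
```

Only the URLs in the first group would be comparable to the page count reported by the scanner, and even then the two numbers can differ for the other reasons listed above.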
How do I know how many pages my website has according to Cookiebot - and what the subscription price will be?
If you would like to find out how many pages (URLs) your website has before signing up for a subscription, you can request a free price quote, which will include the number of subpages, a URL list of up to 5,000 of the subpages identified and the price: Quote tool
Where can I find the exact URL list discovered by the scanner?
- Log in to your Cookiebot account
- Navigate to the "Cookies" tab
- Select the correct domain name (if you have multiple domains added in the domain group)
- Click on the link with the number of pages (if this information does not show in your account, try reloading the page).
Please also see:
Why does your scanner say that my website has more than 50 pages?