Shortly after installing Cookiebot on 3 websites, we noticed that the error logs on all 3 sites were rapidly filling up with "invalid session" errors.
After some further investigation, it turns out these were caused by the Cookiebot spider(s) sending a fixed session ID / name, rather than allowing the server to generate a valid one. This in turn caused an error (because the corresponding session file does not exist on the server), which was then logged.
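To illustrate the failure mode, here is a minimal Python sketch of how a server could treat a client-supplied session ID that it never issued: instead of raising (and logging) an error, it quietly issues a fresh ID. This is only a sketch under stated assumptions. The names `resolve_session` and `SESSION_ID_PATTERN`, the in-memory `store`, and the allowed ID format are all hypothetical and are not taken from our sites or from Cookiebot.

```python
import re
import secrets

# Assumption: session IDs are 22-256 characters drawn from letters, digits, "," and "-".
SESSION_ID_PATTERN = re.compile(r"[A-Za-z0-9,-]{22,256}")


def resolve_session(client_id, store):
    """Return (session_id, regenerated) for an incoming request.

    `store` is a hypothetical dict standing in for the server-side session
    storage (for example, session files on disk).
    """
    if (
        client_id is not None
        and SESSION_ID_PATTERN.fullmatch(client_id)
        and client_id in store
    ):
        # The server issued this ID earlier, so keep using it.
        return client_id, False

    # Missing, malformed, or unknown ID: issue a new one instead of failing.
    new_id = secrets.token_hex(32)  # 64 hexadecimal characters
    store[new_id] = {}
    return new_id, True
```

With this approach, a spider that always sends the same made-up ID would simply receive a new session on each request rather than triggering an "invalid session" error.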
Because these 3 websites are quite large (tens of thousands of URLs to scan on each site), tens of thousands of errors flooded the error logs.
As we use CloudFlare for these sites, I was able to ban requests carrying the specified session name from accessing the sites. This has stopped the flood of errors in our logs, but it also means the spiders are now banned from our sites, which defeats the purpose of using Cookiebot!
Also, CloudFlare's data shows that the spiders access the sites with an unknown browser and operating system. This makes the traffic look pretty suspicious and, more importantly, makes it difficult to identify and potentially whitelist.
Is there a reason your spiders use a fixed session ID? If not, please remove this so that I can unban your spiders from our sites!