In the Google SEO Office Hours last February 18, a user reported that his crawl stats had dropped sharply, from around 700 crawl requests per day to just 50 per day. He then asked whether there is a way to tell from the Search Console reports what could have caused such a drop in crawl rate.
According to John Mueller, a few things go into the amount of crawling that Google does. On the one hand, they try to figure out how much they need to crawl from a website to keep things fresh and useful in the search results, and that relies on understanding the quality of the site and how things change on it. This is what they call crawl demand.
On the other hand, there are also limits on the server side of the site and in the network infrastructure with regard to how much Google can crawl on the website. Google tries to balance these two sides as well.
The restrictions tend to be tied to two main things: the overall response time to requests to the website, and the number of errors, specifically server errors, encountered during crawling.
If Google sees a lot of server errors, or sees that the server is getting slower, they will slow down crawling because they do not want their crawling to cause more problems for the site.
Those are the two main factors that come into play.
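One rough way to check these two factors on your own side is to scan your server access logs for Googlebot requests and tally the server errors and response times. The sketch below is only illustrative: the log path, the assumed nginx-style log format with the request time as the last field, and the simple user-agent match are all assumptions you would adapt to your own setup.

```python
import re
from statistics import mean

# Assumed setup: an nginx-style access log where the request time is the
# last field on each line, e.g.
# 66.249.66.1 - - [18/Feb/2022:10:00:00 +0000] "GET /page HTTP/1.1" 200 5123 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)" 0.142
LOG_PATH = "access.log"  # adjust to your own log location
LINE_RE = re.compile(r'" (?P<status>\d{3}) \d+ ".*?" "(?P<agent>[^"]*)" (?P<rt>[\d.]+)$')

statuses, times = [], []
with open(LOG_PATH) as fh:
    for line in fh:
        match = LINE_RE.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue  # only keep requests that identify as Googlebot
        statuses.append(int(match.group("status")))
        times.append(float(match.group("rt")))

if statuses:
    errors = sum(1 for status in statuses if status >= 500)
    print(f"Googlebot requests seen:  {len(statuses)}")
    print(f"Server errors (5xx):      {errors} ({errors / len(statuses):.1%})")
    print(f"Average response time:    {mean(times):.3f}s")
else:
    print("No Googlebot requests found in the log.")
```

If the error rate or the average response time has climbed around the same time the crawl stats dropped, that is a likely place to start.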
When it comes to the speed aspect, Google has two different ways of looking at speed, and sometimes that gets confusing when you look at the crawl rate.
Specifically for the crawl rate, they just look at how quickly they can request a URL from the server.
Another aspect of speed that you can run into is around the Core Web Vitals and how quickly a page loads in a browser. The time it takes for a page to load in a browser tends not to be directly related to the time it takes for crawlers to fetch an individual URL on the site.
This is because in a browser you have to process the JavaScript, pull in all the external files, render the content, and so on, and all of that takes a different amount of time than just fetching the URL as a crawler would.
That’s one thing to watch out for if you are trying to diagnose a change in crawl rate: don’t look at how long it takes for a page to render, but instead look purely at how long it takes to fetch that URL from the server.
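If you want a quick way to measure that yourself, a minimal sketch along the following lines times the raw fetch of a URL, with no JavaScript execution or rendering involved. The example URL and the use of the Python requests library are assumptions for illustration, not something Mueller prescribed.

```python
import requests  # third-party library: pip install requests

# Placeholder URL: replace with a page on your own site.
url = "https://example.com/some-page"

response = requests.get(url, headers={"User-Agent": "fetch-timing-check"}, timeout=30)

# response.elapsed measures the time from sending the request until the
# response arrives, which is much closer to what crawl stats reflect than a
# browser-based metric such as Largest Contentful Paint.
print(f"Status code:      {response.status_code}")
print(f"Time to response: {response.elapsed.total_seconds():.3f}s")
print(f"Body size:        {len(response.content)} bytes")
```

Running a check like this from a few locations over time gives a rough picture of the server-side fetch speed that matters for crawling, as opposed to the rendering speed measured in the browser.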
Another thing that comes into play from time to time, depending on your site, is that Google tries to understand where the website is actually hosted. If they recognize that the website is moving from one server to another, changing hosting providers, or changing CDNs, Google’s systems will automatically fall back to a safe crawl rate where they know they are not going to cause problems.
This means that any time there is a big change in a site’s hosting, the crawl rate will drop and then normalize over the next couple of weeks.
Another thing is that, from time to time, the algorithms that classify websites and servers can update the crawl rate on their own. It can certainly happen that at some point, even if no changes were made to the hosting infrastructure, the algorithms re-evaluate things and make adjustments.
One thing that can be done in Search Console is to specify a crawl rate. Setting a crawl rate helps Google understand that you have specific settings for your website, and they will try to take that into account.
The difficulty with the crawl rate setting, however, is that it is a maximum. It is not a signal that Google should crawl as much as the set rate, but rather that it should crawl at most that much. This means the setting is more useful when you need to reduce the amount of crawling, not when you want to increase it.
If the issue with the crawl rate still persists, the last thing that can be done is to report a problem with Googlebot through the Search Console Help Center. If you notice that the crawling of your site is way out of range for what you would expect, you can file such a report. You just need to specify the IP addresses Googlebot uses when it tries to crawl your pages and provide some information on the type of issue you are encountering.
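Since the report asks for Googlebot’s IP addresses, it can help to first confirm that the IPs you see in your logs really belong to Googlebot. Google documents a reverse-and-forward DNS check for this; the sketch below is a minimal version of that check in Python, and the sample IP is only an example to replace with IPs from your own logs.

```python
import socket

def is_googlebot(ip: str) -> bool:
    """Reverse-and-forward DNS check for a suspected Googlebot IP."""
    try:
        host, _, _ = socket.gethostbyaddr(ip)  # reverse DNS lookup
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        forward_ips = socket.gethostbyname_ex(host)[2]  # forward DNS lookup
        return ip in forward_ips
    except (socket.herror, socket.gaierror):
        return False

# Example IP from a known Googlebot range; use the IPs from your own logs.
print(is_googlebot("66.249.66.1"))
```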
These reports go to the engineering team that works on Googlebot, and they go through them to figure out what tweaks are needed on their side to improve crawling.
For the most part, you will probably not get a reply to the request, but the reports are read by the team, who try to figure out whether there is something specific to the site or whether an overall system improvement is needed.
These are useful insights into the factors that come into play when it comes to crawling. Hopefully, Mueller’s answers have shed light on what could have caused the drops you’ve experienced, and what can be done about them.
Check out the SEO Office Hours episode at: