In the Google SEO Office Hours episode last February 18, a user reported that 97% of his site's crawls were refresh crawls and only 3% were discovery crawls. He then asked whether there is a way to optimize this so that Google discovers more pages.
According to John Mueller, there are no guidelines on the balance between refresh crawls and discovery crawls, or on how to tweak it. In general, it is normal for a site to have a lot of refresh crawls, especially an older and more established one. This is because the number of pages Google already knows about grows over time, while the number of new pages coming in tends to stay fairly stable. It is therefore common for established sites to have a refresh-to-discovery ratio around that level, since most of the crawling is on the refresh side and much less on the discovery side.
It would be different for sites with short-lived content, such as classified sites or local news sites, where a lot of new articles or listings come in and old content becomes irrelevant very quickly. On these types of sites, the focus would be more on discovery crawls.
E-commerce sites also tend to have more refresh crawls than discovery crawls, as the amount of content grows slowly and the existing content remains valid.
In short, the percentage of refresh crawls versus discovery crawls depends on your type of site and your content, and there is no direct way to optimize for a higher discovery crawl rate.
It is best to look at your site and work out what a normal ratio between discovery and refresh crawls would be, given your type of content and how often new content is published.
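If you want to track that ratio over time, the arithmetic is simple enough to script. The snippet below is only a minimal sketch: the function name `crawl_shares` and the request counts are made up for illustration, and it assumes you already have separate counts of refresh and discovery crawl requests from your own crawl statistics.

```python
def crawl_shares(refresh_crawls: int, discovery_crawls: int) -> dict:
    """Return the refresh and discovery shares of all crawls, in percent."""
    total = refresh_crawls + discovery_crawls
    if total == 0:
        raise ValueError("no crawl requests to compare")
    return {
        "refresh": round(100 * refresh_crawls / total, 1),
        "discovery": round(100 * discovery_crawls / total, 1),
    }


# Hypothetical counts chosen to match the 97% / 3% split from the question.
shares = crawl_shares(refresh_crawls=9700, discovery_crawls=300)
print(shares)  # {'refresh': 97.0, 'discovery': 3.0}
```

A number like this is only meaningful when compared against what is typical for your kind of site. A 97/3 split may be perfectly normal for an established site, while a news or classifieds site would usually expect a larger discovery share.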
If your site has a low discovery crawl rate and some pages remain unindexed, check out our article on Mueller’s tips on getting sites crawled and indexed.
Check out the SEO Office Hours episode here: