In the Google SEO Office Hours hangout held last January 7, John Mueller was asked about how often a site gets crawled. The site owner had noticed that when he published content regularly, his site was crawled daily, but when he published less often, crawling also became less frequent.
Mueller confirmed that this can happen, and explained that Google does not so much crawl a website as crawl the individual pages of a site.
When it comes to crawling, Google roughly has two types: a discovery crawl, where it tries to find new pages on a site, and a refresh crawl, where it updates pages it already knows about.
For the most part, Google would refresh crawl the homepage, for example, once a day or every couple of hours. If it finds new links on the homepage, it then goes off and crawls those new pages with the discovery crawl.
Because of that, you would see a mix of discovery and refresh crawling, and you'll see some baseline of crawling happening every day.
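To make the two crawl types concrete, here is a minimal sketch in Python. It is only an illustration of the idea Mueller described, not how Googlebot is actually implemented. The `SITE` dictionary, the `crawl_cycle` function, and the page URLs are all hypothetical names made up for this example.

```python
# Hypothetical in-memory "site": page URL -> links found on that page.
SITE = {
    "/": ["/about", "/post-1", "/post-2"],
    "/about": [],
    "/post-1": [],
    "/post-2": ["/post-1"],
}


def crawl_cycle(known, site, start="/"):
    """Refresh the start page, then discovery-crawl any links not seen before.

    `known` is the set of URLs the crawler already knows about; it is
    updated in place. Returns a list of (url, crawl_type) tuples in the
    order the pages were fetched.
    """
    log = [(start, "refresh")]
    known.add(start)
    # Only links the crawler has never seen trigger a discovery crawl.
    queue = [url for url in site.get(start, []) if url not in known]
    while queue:
        url = queue.pop(0)
        if url in known:
            continue
        known.add(url)
        log.append((url, "discovery"))
        queue.extend(link for link in site.get(url, []) if link not in known)
    return log


# The crawler already knows three pages; "/post-2" was just published.
known_pages = {"/", "/about", "/post-1"}
print(crawl_cycle(known_pages, SITE))
```

In this toy setup, the first cycle refreshes the homepage and discovers the one new post. A second cycle with no new links would only refresh the homepage, which matches the "baseline" crawling you still see on a quiet site.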
But if Google recognizes that individual pages change very rarely, it realizes it doesn't have to crawl them all the time.
As an example, he described a news website: if it is updated hourly, Google will learn that it needs to crawl it hourly. If the news site only updates once a month, Google will learn that it doesn't need to crawl every hour.
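The "learning" Mueller describes can be pictured as a schedule that adapts to how often a page actually changes. The sketch below is an assumption for illustration only: Google has not published its scheduling logic, and the halving and doubling rule, the `MIN_HOURS` and `MAX_HOURS` limits, and the function names are all invented here.

```python
MIN_HOURS = 1        # assumed floor: never recrawl more often than hourly
MAX_HOURS = 24 * 30  # assumed ceiling: recrawl at least about once a month


def next_interval(current_hours, changed):
    """Halve the wait when the page changed, double it when it did not."""
    proposed = current_hours / 2 if changed else current_hours * 2
    return max(MIN_HOURS, min(MAX_HOURS, proposed))


def simulate(observations, start_hours=24):
    """Return the recrawl interval after each visit.

    `observations` is a sequence of booleans: True if the page had
    changed since the previous visit, False otherwise.
    """
    interval = start_hours
    history = []
    for changed in observations:
        interval = next_interval(interval, changed)
        history.append(interval)
    return history


# A busy news homepage: changed on every visit, so the interval shrinks.
print(simulate([True] * 5))
# A rarely updated site: unchanged on every visit, so the interval grows.
print(simulate([False] * 6))
```

Starting from a daily crawl, the busy page settles at the hourly floor and the quiet page drifts toward the monthly ceiling. Nothing in this rule looks at quality or ranking, which is consistent with Mueller's point that crawl frequency is a purely technical matter.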
He also clarified that this is not a sign of quality or of ranking; it is purely a technical matter.
From his answer, it looks like how often content is published is connected to how often a site or page is crawled. Having fresh content on the homepage may also help new pages get discovered faster; at least, that is what I would assume from his answers.
However, how often a site is crawled is not a direct signal of quality or ranking and is purely technical, as he mentioned.
The question now is: how will you take advantage of this information when it comes to getting your pages ranked?