Google has released a podcast episode on the topic of “crawl budget” and the factors that affect how Google indexes information.
Martin Splitt, of Google Developer Relations, and Gary Illyes, a Google Webmaster Trends Analyst, both provided their perspectives on indexing the web from Google’s point of view.
According to Illyes, the idea of a crawl budget was developed by the search community independently of Google.
“Because people were talking about it, we tried to come up with something… at least, somehow defined. And then we worked with two or three or four teams– I don’t remember– where we tried to come up with at least a few internal metrics that could map together into something that people externally define as crawl budget,” Illyes said in the episode ‘Should I worry about crawl budget?’ published on the Google Search Central YouTube channel on August 25, 2022.
Illyes noted that practical factors, such as the number of URLs a server will permit Googlebot to crawl without being overwhelmed, form part of the calculation behind crawl budget.
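To make that factor concrete, here is a minimal sketch of the idea: a crawler that throttles itself per host based on how well the server is coping. This is not Google’s implementation; the class name, the signals used, and every threshold below are illustrative assumptions.

```python
# Illustrative sketch only -- not Googlebot's actual algorithm.
class HostCrawlBudget:
    """Adjusts a per-host crawl rate based on server health signals."""

    def __init__(self, requests_per_minute: float = 60.0):
        self.requests_per_minute = requests_per_minute

    def record_response(self, status_code: int, latency_seconds: float) -> None:
        """Back off when the server shows strain; cautiously speed up otherwise."""
        if status_code in (429, 500, 503) or latency_seconds > 2.0:
            # Overload signal: halve the crawl rate (floor of 1 request/minute).
            self.requests_per_minute = max(1.0, self.requests_per_minute / 2)
        else:
            # Healthy response: increase the rate slightly (cap at 600/minute).
            self.requests_per_minute = min(600.0, self.requests_per_minute * 1.1)

    def delay_between_requests(self) -> float:
        """Seconds to wait before the next fetch to this host."""
        return 60.0 / self.requests_per_minute


budget = HostCrawlBudget()
budget.record_response(status_code=503, latency_seconds=0.4)  # server struggling
print(budget.delay_between_requests())  # the crawler now waits longer per fetch
```

In this toy model, the “budget” is simply the number of requests per minute the server appears able to tolerate, which echoes Illyes’ description of server capacity as one input to the internal metrics.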
Illyes said that blogs in the search industry had, in the past, spread the idea that crawl budget was something to worry about, but he argued that for most site owners it was not.
“I think it’s partly a fear of something happening that they can’t control, that people can’t control, and the other thing is just misinformation. I think most people don’t have to worry about it, and when I say most, it’s probably over 90% of sites on the internet don’t have to worry about it,” Illyes said.
Illyes and Splitt pointed out that, because its capacity is finite, Google cannot index every page on the internet; since it cannot crawl everything, Google must be selective and index only the most relevant content.
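One simple way to picture that selectivity is a priority queue: with a fixed capacity, a crawler fetches only the highest-scoring URLs and defers the rest. The sketch below is a hypothetical illustration; the scoring heuristic (inbound links plus a freshness bonus) is an assumption for demonstration, not a description of how Google ranks crawl candidates.

```python
import heapq

def score(record: dict) -> float:
    """Toy priority: more inbound links and fresher content rank higher."""
    return record["inbound_links"] + (10.0 if record["recently_updated"] else 0.0)

def select_urls_to_crawl(candidates: list[dict], capacity: int) -> list[str]:
    """Pick the `capacity` highest-scoring URLs; everything else waits."""
    best = heapq.nlargest(capacity, candidates, key=score)
    return [record["url"] for record in best]

candidates = [
    {"url": "https://example.com/popular", "inbound_links": 120, "recently_updated": True},
    {"url": "https://example.com/archive", "inbound_links": 3, "recently_updated": False},
    {"url": "https://example.com/new-post", "inbound_links": 8, "recently_updated": True},
]
# With capacity for only two fetches, the archive page is skipped this cycle.
print(select_urls_to_crawl(candidates, capacity=2))
```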