“We Have Millions of On-Site Search Results Indexed. Should We Clean Them Up?”

Is the number of pages on your site inflated by indexed on-site searches? You may want to watch this.
SIA Team
October 7, 2021

Before I get into this, I just want to be sure we know what an on-site search result is.

Some sites have search boxes where you can do a search on the site. 

I’m sure you’ve seen these. It’s almost like doing a Google search, except it’s limited to the content available on one site. 

Anyway, some site owners allow the results pages of these searches to be indexed. A visitor runs a search and lands on the site’s search results page. That page’s URL usually carries a query parameter holding the search term (something like ?q=shoes, for example), which makes it different from most other URLs on the site, and it’s that parameterized URL that gets indexed. 
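To make that concrete, here is a minimal Python sketch of how you might spot these URLs in a list of your own pages. The parameter names in SEARCH_PARAMS are assumptions for illustration only; your site search may use a different one, so check your own URLs first.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical parameter names; check which one your own site search uses.
SEARCH_PARAMS = {"q", "s", "search", "query"}


def is_search_results_url(url: str) -> bool:
    """Return True if the URL carries one of the assumed search parameters."""
    query = urlsplit(url).query
    # keep_blank_values=True so that an empty search (?q=) still counts.
    params = parse_qs(query, keep_blank_values=True)
    return any(name in SEARCH_PARAMS for name in params)
```

For instance, is_search_results_url("https://example.com/search?q=red+shoes") returns True, while a regular product URL without a query string returns False.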

So, over time, depending on how much traffic a site gets, how big it is, and how active the users are, this can lead to the accumulation of many, many indexed search results URLs. 

Also, we know that Google tends to favor unique, high-quality pages. 

With that in mind, let’s now turn to one of the more recent Q&A sessions that Google often holds for webmasters and SEO professionals. 

During the English Google SEO office-hours from October 1, 2021, John Mueller, who’s a Google Search Advocate, was asked the following: 

“I had a question in regards to internal search pages. So, we’re allowing indexation of on-site searches; so sometimes, someone does a search on our site, we create a page for that and now that’s gone a bit out of control, so we have hundreds of millions of these pages. 

“So how would you recommend we sort that out and if there is actually any benefit to cleaning that up or if we shouldn’t worry about it.” 

The question was asked at roughly 52:26 in the video. John responded: 

“I think for the most part it does make sense to clean that up because it makes crawling a lot harder.

“So, that’s kind of the…direction I would look at there, is to think about…which pages you actually…want to have crawled and indexed and to help our systems to focus on that–not so much that like you should get rid of all internal search pages (some of these might be perfectly fine to show in search) but really try to avoid the situation where anyone can just go off and create a million new pages on your website by linking to random urls or words that you might have on your pages.”

John was then asked how to do this. He recommended simply adding noindex to the pages (URLs with search parameters) you don’t want indexed. 

But, he did say that there may be some of those pages you’d want to be indexed, so keep that in mind. 
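Here is a rough Python sketch of that idea: noindex internal search pages by default, but leave a short list of them indexable. John didn’t say which mechanism to use; this sketch assumes the X-Robots-Tag response header (a meta robots tag in the page would be the other common option). The parameter name q and the KEEP_INDEXED list are hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

SEARCH_PARAM = "q"  # assumed name of the site's search parameter
# Hypothetical searches you have decided are worth keeping in the index.
KEEP_INDEXED = {"running shoes", "winter jackets"}


def robots_headers(url: str) -> dict[str, str]:
    """Return extra response headers to send for the given URL."""
    params = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if SEARCH_PARAM not in params:
        return {}  # not a search results page, so leave it alone
    term = params[SEARCH_PARAM][0].strip().lower()
    if term in KEEP_INDEXED:
        return {}  # a search page you want to stay indexable
    return {"X-Robots-Tag": "noindex"}
```

With these assumptions, a random search such as ?q=asdf gets the noindex header, while ?q=Running+Shoes does not. Keep in mind that Google has to be able to crawl a page to see its noindex, so these URLs shouldn’t also be blocked in robots.txt.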

John also talked about quality, which makes sense. 

As you may know, a lot of these internal search results pages (URLs with search parameters) probably aren’t of high quality. And, these pages probably display content from other pages on your site, so, in a way, they’d be duplicate content. (Not that duplicate content is bad, it’s just…redundant.) 

Potential SEO Benefits of Removing Low-Quality Pages

John was then asked something that is of concern to those who work with businesses: “…businesses would argue, you know, ‘If we’re going to do this work as investment from our side, what is the return on this? Is there an SEO benefit to that?’”

John’s response was basically affirmative, though you might not see an SEO boost right away.

One aspect of this has to do with crawl budget (although I didn’t hear John specifically use that term). Basically, if Google didn’t have to crawl as many pages (specifically those URLs with search parameters), Googlebot would have more budget allotted to the pages you actually do want crawled and indexed. 

(One example John mentioned was of e-commerce sites that have price changes. If you publish a new price change, you want Google to know that as soon as possible.) 
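If you want a rough sense of how much crawling goes to search pages on your own site, a sketch like the following could help. It assumes you have already pulled the URLs Googlebot requested out of your server logs; the parameter name q and the sample URLs are made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

SEARCH_PARAM = "q"  # assumed name of the site's search parameter


def search_crawl_share(crawled_urls: list[str]) -> float:
    """Fraction of crawled URLs that are internal search results pages."""
    if not crawled_urls:
        return 0.0
    hits = sum(
        1
        for url in crawled_urls
        if SEARCH_PARAM in parse_qs(urlsplit(url).query, keep_blank_values=True)
    )
    return hits / len(crawled_urls)


# Made-up sample of URLs requested by Googlebot.
sample = [
    "/search?q=blue+widget",
    "/search?q=asdf",
    "/products/blue-widget",
    "/search?q=widgets&page=2",
]
print(f"{search_crawl_share(sample):.0%}")  # 3 of 4 URLs, prints 75%
```

A high share here would suggest that a lot of Googlebot’s activity is going to pages you may not care about, though what counts as “high” depends on your site.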

So, if a significant share of your indexed URLs are just these low-content search results pages, consider de-indexing them. Also, what I mentioned above may not apply equally to all sites, so take your specific situation into consideration. 

Source: Google Search Central YouTube channel