Episode 62 of Search Off the Record is all about what to do if your pages are not indexed. John Mueller started off by saying that there is a difference between pages that are not indexed and those that are just not ranking well.
Gary Illyes recommends going through Search Console to have a better idea of what is happening to your whole site and to see what pages are indexed, particularly in the page indexing report.
Another area in Search Console where you can check whether an individual page is indexed is the URL Inspection tool. Mueller mentioned that he tells people to look at individual pages in the inspection tool because a site-wide view can sometimes be misleading compared to looking at individual pages.
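For checking pages programmatically rather than one at a time in the UI, Search Console also exposes a URL Inspection API. The sketch below is not from the episode and makes a few assumptions: the google-api-python-client and google-auth packages are installed, a service account (with a hypothetical service-account.json key file) has been added as a user on the property, and example.com stands in for your verified property.

```python
from google.oauth2 import service_account
from googleapiclient.discovery import build

creds = service_account.Credentials.from_service_account_file(
    "service-account.json",  # hypothetical key file for a service account
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
)
search_console = build("searchconsole", "v1", credentials=creds)

# Inspect one URL within a verified Search Console property
result = search_console.urlInspection().index().inspect(body={
    "inspectionUrl": "https://example.com/some-page",  # page to check
    "siteUrl": "https://example.com/",  # the property as verified in Search Console
}).execute()

# coverageState is a human-readable verdict such as "Submitted and indexed"
print(result["inspectionResult"]["indexStatusResult"]["coverageState"])
```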
If a site is pretty new and none of the pages are being indexed, there might be something wrong at the site level rather than at the page level.
The usual situation when it comes to indexing is a site that has a lot of content indexed while some of its content is not, which is normal and is just the way it is.
If nothing is indexed, that usually points to other kinds of problems. For a new site, the homepage in particular should be simple to get indexed; if it doesn’t get indexed, that definitely points to a bigger problem.
Illyes mentioned that Google’s systems are built so that homepages like “domain.com” get crawled and indexed first. If the homepage is not even being crawled, which you can check in Search Console or in your server logs, that definitely points to a bigger technical problem, especially since it is the first page Google should crawl and they cannot yet assess the quality of your content. Once a few pages are indexed, they can assess the quality of the content and how the internet reacts to your site’s presence on the web, and then make better assumptions about the rest of the content. But when you start off with a brand-new site and the homepage cannot even get indexed, that is pretty much always a technical issue, for example a disallow rule in robots.txt or a noindex directive.
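To illustrate the kind of technical issue Illyes is describing, here is a minimal diagnostic sketch (not from the episode) that checks the two usual suspects, a robots.txt block and a noindex directive, for a single page. It assumes a hypothetical URL and uses only the Python standard library; a real audit would also look at redirects, the rendered HTML, and server logs.

```python
import urllib.request
import urllib.robotparser
from urllib.parse import urlparse

page = "https://example.com/"  # hypothetical URL, replace with your own homepage

# 1. Can Googlebot crawl the page at all, according to robots.txt?
parts = urlparse(page)
rp = urllib.robotparser.RobotFileParser(f"{parts.scheme}://{parts.netloc}/robots.txt")
rp.read()
print("Crawlable by Googlebot:", rp.can_fetch("Googlebot", page))

# 2. Does the page itself ask not to be indexed?
with urllib.request.urlopen(page, timeout=10) as resp:
    header = resp.headers.get("X-Robots-Tag", "")
    body = resp.read(200_000).decode("utf-8", errors="ignore").lower()

print("X-Robots-Tag header:", header or "(none)")
# Crude string check for a <meta name="robots" ... noindex ...> tag
print("Possible meta noindex:", '<meta name="robots"' in body and "noindex" in body)
```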
Mueller mentioned that a site could have been submitted for removal; they see people do it all the time, trying to fix a site by removing it. Another example is a site that previously hosted so much spam that the webspam team took the whole site out. Another is simply checking the wrong version: you could be looking at the www version of your site while the non-www version is the one that is indexed.
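The www versus non-www mix-up is easy to rule out. As a rough sketch (again assuming a hypothetical domain and using only the Python standard library), you can fetch both hostname variants and see where each one ends up after redirects, then make sure the version you are inspecting in Search Console is the one your site actually resolves to.

```python
import urllib.request

# Hypothetical domain; substitute your own. The final URL after redirects is a
# good hint about which variant you should be checking for indexing.
for url in ("https://example.com/", "https://www.example.com/"):
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            print(f"{url} -> {resp.geturl()} (HTTP {resp.status})")
    except OSError as exc:
        print(f"{url} -> error: {exc}")
```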
With regards to how long it takes for a new site to get indexed, Illyes answered that it depends on how well the site or the page is linked. It can get indexed within seconds; Google can index things very fast when it needs to, for example when it sees a spike in interest in something. Sometimes it can also take a while because it is just not obvious that something should be indexed.
As an example, a PhD publication might be sitting somewhere on a web server. No one is interested in it so why would Google index it? It might just sit there unindexed because no one is looking for it.
There are also things on the internet that should not get indexed.
According to Illyes, unless you are publishing something utterly unique that people are actually interested in, it’s pretty hard to get stuff indexed. This is because the internet has grown to a size where it’s basically not indexable in its entirety.
Back in the old days, you could find pretty much anything in the major search engines, be that Microsoft Live, Google, or even AltaVista. Nowadays, no reasonable amount of resources would be enough to get everything indexed, so search engines have to make a cut somewhere. That means site owners should lower their expectations of getting everything indexed, or else raise the quality of their content and how interesting and share-worthy their writing is.
There is no indexing quota per site, according to Illyes. You could have your new content and your old content on the same site and get it all indexed, but Google will still try to make sure there is space in the index for other kinds of content. This might mean dropping some content to make room for new things that other people might be interested in.
But if Google sees that people are continuously linking to the content and you are getting new organic (not paid) links, they may be more interested in keeping that content indexed than newer content.
If you have a good website, or a great website, you shouldn’t need to go into the URL Inspection tool and request indexing for pages individually. But if you need a page indexed fast, for example a breaking story that is unique to you, you could use it to get into the search index as quickly as possible.
Asked whether the “site:” operator is a good way to see what is indexed, Illyes answered with a resounding “No”. The reason is that the operator only shows some of the pages that are indexed, because it is not feasible to return them all.
When asked why Search Console can do it but “site:” cannot, Illyes simply answered that Search Console can give you the number and that it is a completely different beast.
You could use “site:” as a rough way to check whether anything is indexed at all, or to check whether keywords you would definitely not want on your site are showing up. But Illyes is not sure he would use “site:” for much else.
Bonus: According to John Mueller and Gary Illyes, the right term is indexing, not indexation.
Interesting insights on indexing (especially the search operator!) from the search team. Check out the episode here.