Myth Busting: A Page With LSI Keywords Outranks A Page With Higher Keyword Density

LSI vs Keyword Density
SIA Team
July 10, 2021

This test is based on the observation that ranking pages weren’t overtly repeating the keyword multiple times in the body copy. The prevailing idea from most SEO pros is to get away from using keywords and use more natural language. The idea is that Google is getting smarter and smarter, and is looking for content on pages that matches the search intent of the user, and not content that is based on keyword placement. 

The technical term for “natural language” is Latent Semantic Indexing, or LSI. LSI is often confused with “synonyms”; however, they are different. LSI terms are words that would naturally come up in a conversation about a particular topic. For example, if you are having a conversation about kitchens, it would be natural to mention words such as stove, sink, refrigerator, and pantry. Those are LSI terms. Of course, synonyms for target terms will also come up in natural conversations, so keyword variations were also used in this test, rather than just pure LSI terms.

In order to get our natural language/LSI, we had to depart from our normal “lorem ipsum” content and use an actual keyword that would produce usable LSI terms. To do this, we found a keyword phrase that had only a few ranking pages but could still produce LSI terms.

For this test, the keyword chosen was for a local service in a remote area (house demolition bunbury). This keyword, when searched with quotes, returned just 4 results. Both articles were published as Google Docs and made public. Test article 1 had 600 words, while test article 2 had 604. The difference between the articles was that, instead of repeating the keyword “house demolition”, the second article used the keyword one time and then used LSI keywords in the rest of the article. The LSI keywords were determined by searching “house demolition” in the keyword planner and then picking appropriate variations. Test article 1 repeated “house demolition” 7 times, whereas test article 2 used this keyword just once.
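To make the gap between the two test articles concrete, here is a minimal Python sketch of the density arithmetic. It assumes the simplest definition of keyword density (phrase occurrences divided by total word count); the write-up does not say which formula was used, and the function name `keyword_density` is purely illustrative.

```python
def keyword_density(occurrences: int, total_words: int) -> float:
    """Return keyword density as a percentage.

    Assumption: density = phrase occurrences / total words.
    Some tools also multiply by the number of words in the phrase.
    """
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return 100.0 * occurrences / total_words


# Occurrence and word counts taken from the test description.
article_1 = keyword_density(occurrences=7, total_words=600)
article_2 = keyword_density(occurrences=1, total_words=604)

print(f"Test article 1: {article_1:.2f}%")
print(f"Test article 2: {article_2:.2f}%")
```

Under this definition, test article 1 comes out at roughly 1.17% and test article 2 at roughly 0.17%.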

Results

The page that repeats the target keyword beat the page that uses the keyword 1 time and then uses LSI and keyword variations.

Conclusion

This myth is busted. The page that repeats the target keyword beat the page that uses the keyword 1 time and then uses LSI/keyword variations.

Keyword density is still an important ranking factor for an individual keyword. While LSI is definitely a ranking factor, it does not supersede the importance of getting the target keyword on the page. You will often hear pros talk about getting away from using keywords and just using LSI because of Hummingbird or RankBrain. What you don’t hear from those pros is that you still need to help Google understand what the page is about in the first place.

An important point to note is that this test does not show whether the page with LSI keywords would rank for many more keywords.

Clint’s Feedback

In this video, Clint talks about this test. He also talks about keyword density and the use of LSI, and shares some examples.

This is test number 25 – LSI vs Keywords

The concept of Latent Semantic Indexing is not new. It’s been around since way before 2016, when we did this test, but I think it was becoming more and more popular as more tools came out to address it. Text analysis tools like, let’s say, Ntopic would be a good example. There’s a tool inside of WebSite Auditor in SEO PowerSuite. There’s a bunch of them out there.

And basically what the concept is: let’s say you’re writing on a topic like wooden pipes. The tools will go out and look at the pages that are ranking for wooden pipes, make a count of all the words that are in there, and see which ones are used more often. And then, based on that, they would say to use these words in your content, because these words are often associated with the target word that you’re trying to rank for. And so LSI
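The counting approach described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page texts and the stop-word list are made up, the function names are hypothetical, and real tools such as the ones mentioned above may work quite differently.

```python
import re
from collections import Counter

# Hypothetical stand-ins for the text of pages ranking for "wooden pipes".
ranking_pages = [
    "Wooden pipes are carved from briar and finished with a polished stem.",
    "A briar pipe needs a good stem, a clean bowl and quality tobacco.",
    "Our wooden pipes ship with a spare stem and a pouch for tobacco.",
]

# A tiny illustrative stop-word list; real tools use much larger ones.
STOP_WORDS = {"a", "an", "and", "are", "for", "from", "our", "the", "with"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def related_terms(pages, target, top_n=5):
    """Count how many pages use each word, skipping stop words
    and the words of the target phrase itself."""
    skip = STOP_WORDS | set(tokenize(target))
    page_counts = Counter()
    for page in pages:
        # set() so that a word counts once per page.
        page_counts.update(set(tokenize(page)) - skip)
    return page_counts.most_common(top_n)


top_terms = related_terms(ranking_pages, "wooden pipes")
for term, pages_using in top_terms:
    print(f"{term}: used on {pages_using} of {len(ranking_pages)} pages")
```

Counting pages rather than raw occurrences is a design choice made here so that one very long page cannot dominate the list; a tool could just as reasonably count total occurrences.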

This test definitely has to get reviewed though, because now we have “super LSI”, for lack of a better word. We’ve got entities, and entities are creating quite a buzz within the SEO community and among the content people, machine learning, AI, etc. You got BERT, you got SMITH, and I’m sure OpenAI; you got all these other things that are developing and looking at entities in a whole new way. And entities, ultimately, if you look at a lot of lists, you’ll see that a lot of “LSI terms” were already entities: persons, places, things, and events.

We wanted to see if LSI, that is, forming the topic around a keyword using Latent Semantic Indexing (or “super LSI” entities), would beat out keyword density in a head-to-head. In other words, instead of repeating my keyword more times, I would say more by using LSI and keep my keyword density low.

And the result was a keyword density win. The page that was optimized for the keyword beat the LSI optimized page. 

Reduced keyword density with increased optimization for LSI does not beat keywords. A lower keyword density just doesn’t work. So you have to figure that out when you’re writing your content. There are a lot of different tools out there that look at keyword density, and they all do it from different perspectives.

I’m going to just turn on SERPworx here, and you can see a keyword density of .01 with the SERP average at .034.

Now, it’s not even entirely clear how they did it. Like, I asked them one time, and they’re looking at the entire page, breaking that out, etc. But if you come over here and just look at another keyword density tool, let’s just pick one, for example. Keyword density checker. Let’s just pick a random one. Actually, let’s pick some extras. I see Internet Marketing Ninjas. We’ll pick that one.

Alright, so we got this one page and copied the URL. The density in SERPworx said .034 or it’s actually one but I get keywords anywhere hiding it. Set .17 that’s the density .174 wooden pipes. Okay, so let’s look in here.

The top keywords frequency, and I don’t want frequency, I want density. Two words, wooden pipes. This tool didn’t find any, from what it looks like, and I’m getting really pissed off with the multiple ads here. So looking down, we don’t see anything.

The keyword density would be zero according to that tool right? So let’s go to this. According to that tool, keyword density would be zero. This tool only gives you single keywords.

So the number one page has no exact match. So you see why this needs to be retested. It needs to be looked at. I bet you I can run it on Cora and find something else. Let’s see that. Why not, we’re here, we might as well mess with some of my tools, right. 

Cora Lite. I don’t know if it’s gonna bring that up while we’re doing this. Wooden pipe, 2.96. Wooden pipes, there’s the plural. I would just add that together and say it’s 3%. So that was given, and then this tool said it was not, if I remember correctly. Let’s go Amazon and check that. Wooden pipes is not even there. It’s interesting. See. So it’s zero, according to that tool, three, zero.

Let’s see what this one says. Wooden pipes 1%.

Alright, so roughly, we probably want to be around 1.5 to 2% for wooden pipes. That’s what we’re looking for.

And then I’ll run it. And we’ll just get to that and see what we got going on there.

The test results say that the page that was optimized for the keyword beat the LSI page. So we gotta check that out and see if that is still the case, especially with the implementation of Google’s NLP API, and possibly BERT, etc., helping you analyze the pages and formulate topics, etc. So we got to figure that out and see if that’s still a thing, still a deal.

And also, with what I just did there, you can see that each tool will give you different results. I’m sure I’m going to get different results from the Cora tool here, once that’s done. So you have to take that into account and look around and find one that you trust, one that you know is going to give you the most accurate results.
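One plausible reason tools disagree is that “keyword density” has no single definition. The Python sketch below runs three assumed conventions over the same made-up sample text. None of these is confirmed to be the formula used by any tool named in this walkthrough, and the helper names are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def count_phrase(words, phrase):
    """Count exact, in-order occurrences of a phrase in a word list."""
    target = tokenize(phrase)
    n = len(target)
    return sum(words[i:i + n] == target for i in range(len(words) - n + 1))


# Hypothetical 20-word sample, not taken from any real page.
sample = (
    "Wooden pipes are popular gifts. A wooden pipe lasts for years, "
    "and wooden pipes are easy to clean and store."
)
words = tokenize(sample)

plural = count_phrase(words, "wooden pipes")
singular = count_phrase(words, "wooden pipe")

# Convention A (assumed): exact-phrase occurrences / total words.
per_occurrence = 100 * plural / len(words)
# Convention B (assumed): words covered by the phrase / total words.
per_word = 100 * plural * 2 / len(words)
# Convention C (assumed): singular and plural added together.
with_singular = 100 * (plural + singular) / len(words)

print(f"A: {per_occurrence:.1f}%  B: {per_word:.1f}%  C: {with_singular:.1f}%")
```

On this sample the three conventions give 10%, 20%, and 15% for the same text, which is why it matters to know how a given tool counts before comparing its number to another tool’s.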

Some things that mess that up are the header, the footer, the sidebars, and sometimes the ads, if there are text ads in there, that kind of stuff. Comments will mess up your keyword density checker if you don’t include them. Because Google will see those comments, and the more content you have, the more you will actually reduce your keyword density if your people are not commenting on that topic or using that keyword. So you’re reducing your overall keyword density for the entire page.

But that’s assuming that Google is combining all of that and doesn’t know where your article starts and where it finishes. Is Google taking the comments and, because they are typically loaded within the body tag, adding them into it? That’s an interesting question that could use some testing as well.
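The dilution effect is simple arithmetic, sketched below in Python. The article figures are borrowed from test article 1; the comment figures are hypothetical, and the sketch assumes the whole page is counted as one block of text, which is exactly the open question raised above.

```python
def density(occurrences, total_words):
    """Keyword density as a percentage (occurrences / total words)."""
    return 100.0 * occurrences / total_words


# Borrowed from test article 1: 600 words, keyword used 7 times.
article_words, article_hits = 600, 7
# Hypothetical comment section that never mentions the keyword.
comment_words, comment_hits = 400, 0

article_only = density(article_hits, article_words)
with_comments = density(
    article_hits + comment_hits,
    article_words + comment_words,
)

print(f"Article only:  {article_only:.2f}%")
print(f"With comments: {with_comments:.2f}%")
```

With these numbers the density drops from about 1.17% to 0.70% without a single word of the article changing.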

Cora Lite is still doing its thing. It might take a little bit because I’m recording at the same time, so we’re asking a lot of my computer at this moment. We’ll see, it’s almost done.

But basically what it’s doing, if you’ve never seen this before, is running through the top 100. It’s comparing. It’s looking at what the competitors are doing, and then it’s saying this is what you need to do to beat them. I could have given it a target or added a URL to compare against if I wanted to. And at the end, when it’s all said and done, it’s going to spit out this nice report that you can actually save as an HTML file for your reporting, or whatever you want to do. So we’re waiting for that. It’ll give us entities that we’ll use, and I believe there’s LSI reporting there. It’s a little bit different; it’s not like the main Cora one. The main Cora software does all the LSI like Ntopic would, etc. It does the counts and all that stuff. I’m not entirely sure if this one is doing it. But I know that this report is actually smaller. It’s not as built out as the main Cora one, because the main Cora output is an Excel spreadsheet.

So typically, that can actually have hundreds in it, but the benefit of both of these is that they reduce that large word count and say here are the ones that you probably should use the most. 

Also, they’re useful if you’re combining all those reports and let’s say you have 10 of them and you’re ranking for I don’t know – let’s go with city web design. So you have 10 cities and you can find the yellow signs that are common to all of those and those are real popular and add those into your template as you’re building that out to try to rank for more stuff. Just to get a little bit closer so you don’t have to reinvent the wheel and examine each one individually in the beginning. 
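Finding the terms shared by every report is a set intersection. The Python sketch below uses three made-up city reports with invented term lists. It does not parse any real Cora output, and the city names and terms are placeholders.

```python
# Hypothetical recommended-term lists, one per city report.
reports = {
    "springfield web design": {"responsive", "portfolio", "wordpress", "seo", "logo"},
    "riverton web design": {"responsive", "portfolio", "ecommerce", "seo", "hosting"},
    "lakeside web design": {"responsive", "portfolio", "seo", "branding", "wordpress"},
}

# Terms that appear in every single report: candidates for the shared template.
common_terms = set.intersection(*reports.values())

# Terms specific to one city, left for per-page tuning later.
city_specific = {
    keyword: terms - common_terms for keyword, terms in reports.items()
}

print("Template terms:", sorted(common_terms))
for keyword, extras in city_specific.items():
    print(f"{keyword}: {sorted(extras)}")
```

The shared terms go into the template once, and the leftovers per city show what still needs individual attention.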

We got to retest this one. We got to test the LSI without the entities in it intentionally, and then we have to test the LSI with the entities. And then I would say testing only entities, but that doesn’t make any sense because it’s still LSI. But we’ll knock all those out, and that will be a good set of retests for the members of SIA.

Curious about LSI? We have more tests on LSI, and keyword variations and matches, too! Check out our test articles and read up on them.