What is keyword stemming?

Google can read and analyze texts very well. It understands that walk, walks and walking all boil down to the same thing. Google also knows that baby is basically the same as babies. How? By using something called keyword stemming. This gives you the freedom to alternate between different word forms while writing your text. That’s why we introduced word form recognition in Yoast SEO Premium: you can optimize your post and we’ll recognize different word forms like walk, walks and walking. For long-tail keywords, we also recognize the words when you use them in a different order.

So, at Yoast, we talk about word forms, sometimes also about morphology recognition. At the same time, I hear the linguists at Yoast talking about keyword stemming too. And I noticed some SEOs talked about it as well. But what is keyword stemming? How does stemming relate to morphology recognition? And what does it have to do with SEO? I’ll explain all about it in this post.

What is keyword stemming?

Stemming, or keyword stemming, refers to Google’s ability to understand different word forms of a specific search query. The name comes from the word stem: the base or root form of a word. To give an example: if you use the word ‘buy’ in a sentence, a stemming algorithm will recognize the words ‘buys’, ‘buying’ and ‘bought’ as variations of the word ‘buy’ as well. Some SEOs also distinguish between stemming and lemmatization. Strictly speaking, stemming strips endings off a word, while lemmatization maps a word to its dictionary form, which is how an irregular form like ‘bought’ gets linked to ‘buy’.
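To make the idea a bit more concrete, here is a minimal Python sketch of a stemmer. It is a toy under stated assumptions, not how Google or Yoast SEO actually work: the function name `toy_stem`, the three suffix rules and the one-entry table of irregular forms are all made up for illustration.

```python
# Toy illustration only. The rules and names below are assumptions,
# not the algorithm used by Google or by Yoast SEO.

# Hypothetical lookup table for irregular forms that suffix rules cannot handle.
IRREGULAR_FORMS = {"bought": "buy"}

# (suffix, replacement) pairs, tried in this order.
SUFFIX_RULES = (("ies", "y"), ("ing", ""), ("s", ""))


def toy_stem(word):
    """Reduce a word to a crude base form."""
    word = word.lower()
    if word in IRREGULAR_FORMS:
        return IRREGULAR_FORMS[word]
    for suffix, replacement in SUFFIX_RULES:
        stem = word[: len(word) - len(suffix)]
        # Require at least three remaining letters so short words stay intact.
        if word.endswith(suffix) and len(stem) >= 3:
            return stem + replacement
    return word


for form in ("buy", "buys", "buying", "bought"):
    print(form, "->", toy_stem(form))
```

All four forms end up as ‘buy’, but only because ‘bought’ sits in the lookup table. Real stemmers and lemmatizers use far larger rule sets and dictionaries, and they differ per language.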

Google has used keyword stemming in its algorithms for a long time now. The first blog posts about it from SEO experts like Rand Fishkin and Bill Slawski go back as far as 10 years. For languages other than English, Google began recognizing word forms much later. In recent years, Google’s algorithm has become even more advanced, making exact match keyword optimization more and more outdated.

If you want to optimize your text for the term ballet shoes, for example, you should be able to use the term ballet shoe as well. Google understands that ballet shoes and ballet shoe are basically the same thing. Our Yoast SEO Premium plugin also recognizes these different word forms in the following languages: English, German, Dutch, Spanish, Polish, French, Russian, Italian, Indonesian, Arabic, Portuguese, Swedish, Hebrew, Norwegian, Turkish, Czech, Slovak, Greek and Japanese. We’re working hard on adding new languages to this list, so let us know which one you’re missing!
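Here is a rough Python sketch of what recognizing ‘ballet shoes’ in a sentence could look like, regardless of word form or word order. Everything in it is an assumption for illustration: the helper names `normalize` and `keyphrase_found` are invented, the plural rule is deliberately crude, and this is not the actual Yoast SEO analysis.

```python
import re


def normalize(word):
    # Hypothetical, very crude normalizer: lowercase and strip a plural "s".
    word = word.lower()
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def keyphrase_found(keyphrase, sentence):
    """Return True if every keyphrase word occurs in the sentence, in any form or order."""
    wanted = {normalize(word) for word in keyphrase.split()}
    present = {normalize(word) for word in re.findall(r"[a-z]+", sentence.lower())}
    return wanted <= present  # subset check, so word order does not matter


print(keyphrase_found("ballet shoes", "She picked up a new ballet shoe."))  # True
print(keyphrase_found("ballet shoes", "Shoes for ballet wear out fast."))   # True
print(keyphrase_found("ballet shoes", "Ballet is hard work."))              # False
```

Comparing sets of normalized words is what makes the check indifferent to word order. A real analysis would need proper per-language word form rules instead of the single plural rule used here.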

Stemming and word forms

When people talk about keyword stemming or a stemming algorithm, they mean that the algorithm is able to recognize different word forms of a certain keyword. That’s exactly what the word forms functionality in Yoast SEO does. As for synonyms: we don’t detect these automatically, but you can enter synonyms yourself, and we’ll take them into account in our SEO analysis.
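As a small illustration of how user-entered synonyms could feed into an analysis, here is a hedged Python sketch. The function `count_mentions` and its parameters are hypothetical and only handle single-word terms; it is not the plugin’s real implementation.

```python
import re


def count_mentions(text, keyword, synonyms=()):
    """Count words in the text that equal the keyword or a user-supplied synonym."""
    # Synonyms are not detected automatically: they must be passed in by the user.
    terms = {keyword.lower()} | {synonym.lower() for synonym in synonyms}
    words = re.findall(r"[a-z]+", text.lower())
    return sum(1 for word in words if word in terms)


text = "A good sofa lasts for years. Pick a couch that fits your room."
print(count_mentions(text, "sofa"))             # prints 1
print(count_mentions(text, "sofa", ["couch"]))  # prints 2
```

Without the synonym, the second sentence looks like it has nothing to do with the keyword. With it, both sentences count toward the same topic.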

Maybe we should have called our word forms functionality stemming. But it’s a difficult word to explain to people. So, we’ll stick with word forms.

Keyword stemming and SEO

Google has become very smart. It understands text. It understands context. In order to stand a chance in the search engines, you need to write awesome texts that show your authority on a certain subject. Content stuffed with keywords does not rank anymore. Google hates that and your users hate that.

You need to use synonyms and related keywords in your content to make it pleasant to read and to make it rank! You also need to use different word forms in order to write a post that is easy to read. Thanks to stemming, both Google and Yoast SEO can tell that those different word forms belong to the same keyword. Read more about it in our post on our word form analysis.

Conclusion

The SEO industry has been talking about stemming and lemmatization for over a decade. Our linguists talk about it too, and for good reason: keyword stemming is what makes it possible to recognize different word forms. This isn’t easy. At Yoast, we have an entire team of linguists working on our SEO and readability analyses. We’re now able to recognize different word forms properly in English, German, Dutch, Spanish, Polish, French, Russian, Italian, Indonesian, Arabic, Portuguese, Swedish, Hebrew, Norwegian, Turkish, Czech, Slovak, Greek, and Japanese. And we’re already working on new languages, so tell me: which language should we tackle next?


Discussion (32)

  • I just realized I’ve been using this plugin wrongly
    For synonyms I just put the next most used word in the article.
    Thank you for this article.

  • I’d like to vote for Portuguese as a future language. It’s the language of Brazil and of so many Brazilian immigrants in the southeastern US.

    • Hi Doug, that is definitely a good suggestion. Now I can’t make you any promises as to when, but Portuguese is on our list! :)

  • Dear Yoast,
    Can you please tell me what keyword stuffing is? I understood keyword stemming. Thanks for publishing quality content for us.
    Regards

  • Thanks, this article was very helpful. It improved my understanding of search engine algorithms a lot.

  • Hi guys, wondering if you consider keyword stemming to be the same as LSI, latent semantic index. In other words, do you think Google views keyword stemming as a way to show related keywords or is it simply looking at each iteration of the keywords as something separate and distinct. I always recommend clients focus on only 1-2 keywords per page. Do you think keyword stemming helps that by offering variations of the same keyword? Thanks.

  • I am a fan of the Yoast plugin. I just want to know: can I use it for one of my websites built with .NET technology?

    Thanks.

  • While interesting to learn what keyword stemming is, this post comes across as vague because it does not indicate what needs to be done on a web site to deliver optimal SEO.

  • Sanskrit and its English Forms. You know Padmasana and Lotus Poses are the same. For optimizing, I am in doubt, to which one I should go for. I suppose I should optimize for the keyword that has more search queries. Search intent is the same for both keywords. Does Google recognize it?

  • That’s right. Totally agree with this.

  • Sentences are built from words. A close analysis of a sentence shows that a third of its words are repeated. In today’s so-called SEO-optimized articles, writers mostly stuff keywords without admitting that it is stuffing. When a person is trying to find some information, the article takes them through the entire history of the topic. Even in the name of SEO, I don’t think that is advisable. What is your opinion? How can this be improved into a reader-friendly form of SEO?

  • Danish for stemming initiatives :)

  • Thank you! This post clarifies a few things for me. I really like your plugin.