How to semantically link entities to content

Search engines love entities. Entities can be people, places, things, concepts, or ideas, and they often appear in the Knowledge Graph. Many search terms can be entities, but a single search term can also have several meanings and thus refer to different entities. Take [Mars], for example: are you talking about the planet entity or the candy bar entity? The context you give these entities in your content determines how search engines see and file your content. Find out how to link entities to your content.

Let’s talk semantics

Semantics is the search for meaning in words. In theory, you could write an article about Mars without ever mentioning it directly. People would understand it if you provide enough context in the form of commonly used terms and phrases. To illustrate this, we’ll take the keyword [Mars]. Mars is a so-called entity, and search engines use these to determine the semantics of a search.

If you search for this term on Google, you’ll most likely get results about the planet Mars. But why? Why isn’t the Mars candy bar in the top listings? Or Mars the chocolate company? Or the discovery district MaRS in Toronto? Maybe the Japanese movie called Mars? Or one of the many Mars-related movies made over the years? This is because Google makes an educated guess based on search intent and your search history. It also uses co-occurring synonyms, keywords, and phrases to determine which pages are about which of these variations, and which ones to show.

Co-occurring terms and phrases

Co-occurring terms and phrases are those that are commonly used to describe an entity. These are the terms that are most likely to pop up in content about that entity. Content about the planet Mars will probably contain mentions of the following terms:

  • ‘red planet’
  • ‘northern hemisphere’
  • ‘low atmospheric pressure’
  • ‘martian craters’
  • ‘red-orange appearance’
  • ‘terrestrial planet’
  • ‘second-smallest planet in the Solar System’
  • etc.

Pages with Mars candy bar content might feature phrases like:

  • ‘chocolate candy bar’
  • ‘nougat and caramel covered in milk chocolate’
  • ‘limited-edition variants’
  • ‘ingredients’
  • ‘nutritional information’
  • etc.

Content about the 2016 Mars movie, meanwhile, will probably mention its main protagonists Rei Kashino and Makio Kirishima.

All these words are co-occurring keywords and phrases. They are content that is semantically related to the main keyword but doesn’t contain the keyword itself. This might include synonyms, but it often goes beyond them, because these terms clarify what the keyword means instead of saying the same thing differently. Search engine spiders scan your content for these related terms to paint a picture of the nature of your page. This way, they can correctly index the page, i.e., file it under [planet Mars], not [Mars the candy bar].
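To make this concrete, here is a minimal Python sketch of the idea. It is only an illustration, not how any search engine actually works: the phrase lists are taken from the examples above, and the names CO_OCCURRING, score_entities, and guess_entity are made up for this example. The sketch counts how many known co-occurring phrases of each entity show up in a text and picks the entity with the most matches.

```python
import re

# Hypothetical phrase lists, based on the example lists in this article.
CO_OCCURRING = {
    "planet Mars": [
        "red planet", "northern hemisphere", "low atmospheric pressure",
        "martian craters", "terrestrial planet",
    ],
    "Mars candy bar": [
        "chocolate candy bar", "nougat", "caramel", "milk chocolate",
        "ingredients", "nutritional information",
    ],
}


def normalize(text):
    """Lowercase the text and join its words with single spaces."""
    return " ".join(re.findall(r"[a-z0-9]+", text.lower()))


def score_entities(text):
    """Count how many co-occurring phrases of each entity appear in the text."""
    padded = f" {normalize(text)} "  # padding so only whole words match
    return {
        entity: sum(f" {normalize(phrase)} " in padded for phrase in phrases)
        for entity, phrases in CO_OCCURRING.items()
    }


def guess_entity(text):
    """Return the entity with the most matches, or None if nothing matches."""
    scores = score_entities(text)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Note that this text never uses the word "Mars".
page = (
    "Often called the Red Planet, it is a terrestrial planet with "
    "low atmospheric pressure and heavily cratered highlands."
)
print(score_entities(page))
print(guess_entity(page))
```

The sample text never mentions Mars by name, yet the context alone points to the planet. Real systems are far more sophisticated, but the principle is the same: the surrounding phrases carry the meaning.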

Optimize for phrase-based indexing

Over the years, Google has been awarded several patents that suggest the development of a phrase-based indexing system and of systems that use word co-occurrence to improve the clustering of topics. Such an information retrieval system uses phrases to index, retrieve, organize, and describe content. By analyzing the context surrounding an entity – meaning all the phrases that are commonly connected to an entity – Google can truly understand what a piece of content is about. That might sound complex, but it is something you can optimize for. And you are probably already doing that – to a certain extent. First, do keyword research. After that, provide context in your articles.
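If you’re wondering what indexing by phrases could look like in its simplest form, the following Python sketch may help. It is a toy model under our own assumptions, not the system described in Google’s patents: it treats every run of two or three consecutive words as a phrase and records which documents contain it. The function names and the two sample documents are invented for this example.

```python
import re
from collections import defaultdict


def phrases(text, sizes=(2, 3)):
    """Yield every run of 2 or 3 consecutive words in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    for n in sizes:
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def build_phrase_index(docs):
    """Map each phrase to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for phrase in phrases(text):
            index[phrase].add(doc_id)
    return dict(index)


def pages_with(index, phrase):
    """Return the sorted ids of the documents that contain the phrase."""
    return sorted(index.get(phrase.lower(), set()))


# Two invented sample documents about different Mars entities.
docs = {
    "planet": "Mars is the red planet, a terrestrial planet with a thin atmosphere.",
    "candy": "Mars is a chocolate candy bar with nougat and caramel.",
}
index = build_phrase_index(docs)
print(pages_with(index, "red planet"))
print(pages_with(index, "chocolate candy bar"))
print(pages_with(index, "mars is"))
```

The phrase ‘mars is’ appears in both documents and therefore says nothing about which Mars is meant, while ‘red planet’ and ‘chocolate candy bar’ each point to exactly one of them. That is why distinctive, entity-specific phrases give your content its context.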

When writing about an entity in your content, it makes a lot of sense to give search engines – and readers for that matter – as much context as possible. Use every meaningful sentence you can think of. This way, you can take away any doubt about the meaning of your content.

If your subject is the planet Mars, you need to take a look at the Knowledge Graph in Google. Scour Wikipedia. Find out what kind of common terms and phrases co-occur in search results and incorporate them into your content so you can give your term the right context. Also, run a search and open the sites of competitors that rank high for your search terms. What are they writing about and how do they describe the entity? What terms and phrases can you use in your content? By doing this, you’ll find out that there will be much overlap with what you had in mind, but there will be many new – and maybe better – nuggets for you to use.
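You could do part of this comparison with a small script. The Python sketch below is one possible approach under simple assumptions: you have already copied the text of a few competing pages into strings, and a phrase is just two consecutive words. It lists the phrases that at least two competitors use but your draft does not. The sample texts and the function names are made up for illustration.

```python
import re
from collections import Counter


def bigrams(text):
    """Return the set of two-word phrases found in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {" ".join(pair) for pair in zip(words, words[1:])}


def shared_phrases(pages, min_pages=2):
    """Keep the phrases that occur on at least min_pages different pages."""
    counts = Counter()
    for page in pages:
        counts.update(bigrams(page))  # each page counts a phrase only once
    return {phrase: n for phrase, n in counts.items() if n >= min_pages}


def missing_from_draft(draft, pages, min_pages=2):
    """List shared competitor phrases that the draft does not contain yet."""
    draft_phrases = bigrams(draft)
    shared = shared_phrases(pages, min_pages)
    return sorted(p for p in shared if p not in draft_phrases)


# Invented sample texts standing in for competitor pages and your own draft.
competitor_pages = [
    "Mars is the red planet with low atmospheric pressure.",
    "The red planet has a thin atmosphere and low atmospheric pressure.",
    "Known as the red planet, Mars has two small moons.",
]
draft = "Mars is often called the red planet."
print(missing_from_draft(draft, competitor_pages))
```

For these sample texts, the script reports ‘atmospheric pressure’ and ‘low atmospheric’ as phrases the draft is missing. On real pages you would get plenty of noise from filler phrases, so treat the output as a list of suggestions to judge by hand, not as a checklist to stuff into your text.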

One more thing: no LSI keywords

Recently, the term LSI keywords started to pop up again as a magical way to play into one of Google’s ranking factors. LSI keywords are no such thing. Yes, you have to provide search engines with context. No, latent semantic indexing has nothing to do with it. There’s no evidence whatsoever that search engines have ever used latent semantic indexing to determine rankings. LSI was a document analysis patent from the ’90s that only seemed to work on a limited set of documents, and it has no place in SEO.

Read more: Keyword research for SEO: the ultimate guide »

25 Responses to How to semantically link entities to your content

  1. nexvan
    nexvan  • 2 years ago

    Is linking a few words useful in all content?
    Please reply

  2. Rajesh Magar
    Rajesh Magar  • 2 years ago

    Thanks for another informative article. I do agree with your thoughts on the LSI keyword scenario; in fact, Bill has recently covered the same topic. Please allow me to share the link as it might help every reader.

    My personal take on LSI keywords is that they are just a blend of co-occurring keywords as well, so following them with good care wouldn’t hurt anyone’s SEO.

    • Edwin Toonen

      Hi Rajesh. Thanks for your comment. Indeed, Bill has been going on about this for quite some time. And he’s right, of course.

  3. sriya
    sriya  • 2 years ago

    This article clearly explains how we can semantically link entities to content. It is really informative. Thanks for sharing.

  4. Charles Lowe
    Charles Lowe  • 2 years ago

    Edwin – if LSI has no relevance (except enriching context) to any Google ranking factor, would you please explain why the terms “driving school”, “driving schools”, “driving lessons” & “driving instructors” are completely interchangeable as far as (my experience of) Google search is concerned?

  5. Saepih
    Saepih  • 2 years ago

    True true true! cannot agree more.

    Entities could be anything, and by defining each entity this way, search engines would better understand the content on that page.

  6. Noor Alam
    Noor Alam  • 2 years ago

    I think this is the first time I have heard of LSI keywords. A little more explanation would be helpful.


  7. Techioguy
    Techioguy  • 2 years ago

    Can I link to a site with non-relevant content?

    • Edwin Toonen

      Why would you want to link to a non-relevant site?

  8. nexvan
    nexvan  • 2 years ago

    Can a non-linked content link be useful?

  9. Paul Redfern
    Paul Redfern  • 2 years ago

    We have been using “phrase-based indexing” content audits on our sites for a while now and have seen traffic to audited pages increase by 30%. It may seem cooler to perform new keyword research to build out new categories on your sites, but auditing your existing content first can bring surprisingly quick results!

    • Edwin Toonen

      You’re right, Paul! There’s so much to gain from optimizing your current content. Providing relevant context to your keyword is another way to improve your content and to make it clearer for both users and search engines.

  10. Yuri Moreno
    Yuri Moreno  • 2 years ago

    Great article, Edwin. It’s also worth mentioning that different search intents will have different co-occurring phrases around the core entity. This is something worth exploring during the keyword research process.

    • Edwin Toonen

      Great addition, Yuri. Search intent determines everything.

  11. drone
    drone  • 2 years ago

    Thanks for the interesting article, it has given me new inspiration.

  12. קידום אתרים בגוגל
    קידום אתרים בגוגל  • 2 years ago


  13. titou
    titou  • 2 years ago

    The phrase-based indexing patents do seem to apply to incoming anchor text as well. It appears that if anchor text pointing to a page contains a “related phrase” (one that tends to co-occur on other pages that might rank well for the same query), then it might be given more weight than anchor text that doesn’t.

    • Edwin Toonen

      Yeah, it’s much bigger than just related phrases. Among other things, it seems that anchor texts and the distribution of related terms throughout the page count as well.

  14. jivansutra
    jivansutra  • 2 years ago

    Hi Edwin, thanks for letting us know about the importance of entities. But I think you have somewhat underestimated the potential of LSI keywords when it comes to top search rankings. I have seen some websites with poor content achieve top positions in search results with these keywords. I don’t know in detail how it actually happens, but I think it works, although not very often…

    • Edwin Toonen

      Hi, thanks for your comment. Sure, related keywords and phrases work, but not in regards to the latent semantic indexing context everybody’s talking about.

  15. Mark Koller
    Mark Koller  • 2 years ago

    Clearing and cleaning stuff up to 2018.
    A well-thought-out, well-written piece about content, context, and more. Thank you.

    • Edwin Toonen

      You’re welcome, Mark. Thanks for the compliments.

  16. Andrew Randazzo
    Andrew Randazzo  • 2 years ago

    From what I’ve read, I don’t understand the difference between an LSI keyword and a co-occurring phrase/term. Can you give an example of the difference?

    • Edwin Toonen

      Hi Andrew. Co-occurring phrases and terms are a great addition to your SEO toolkit, but LSI has nothing to do with it. That’s the gist of it.

  17. Phobiaanxiety
    Phobiaanxiety  • 2 years ago

    I am beginning to think this article will help me see SERPs in a whole new light!