How we built the inclusive language analysis in Yoast SEO

Yoast SEO now comes with a new analysis to make your content even more accessible to all: the inclusive language analysis. This analysis helps you write more inclusively, lowering the likelihood that you’ll exclude someone from your content. That means you can reach a wider audience. But how does this analysis work? And how was it developed?

Before we get started, let’s provide you with a proper definition of what inclusive language is. At its core, inclusive language avoids terms that could exclude marginalized groups of people. Typically, these are terms that perpetuate prejudice, stigma, or erasure. Inclusive language instead prefers alternatives that are less likely to be experienced as harmful or exclusionary. Want to learn more about inclusive language? Then check out this blog post on what inclusive language is.

Let’s meet the people involved

Most of the work on the inclusive language feature was done by a team of three people – two Yoast linguists/developers with a passion for inclusive language, and an external scientific advisor, Maxwell Hope. Maxwell is a Ph.D. student at the University of Delaware and his research focuses on the linguistic practices of non-binary people. For example, his research topics include the use of non-binary pronouns and gender perception in speech. They also have a deep interest in inclusive language and a lot of knowledge on this topic.

It all started with lots of research

Our first step was to compile a list of non-inclusive terms that our feature could highlight, and their inclusive alternatives. Fortunately, there are many sources available online that we could consult to help us create such a list. For example, the APA Style guide provides a lot of advice on how to write more inclusively. There are also many guides created by activists and community members. For example, the disability activist and writer Lydia X. Z. Brown created this incredibly helpful guide to avoiding ableist language.

However, compiling the list was not as simple as copy-pasting the terms we found in the guides. We took a critical approach and made our own changes, additions, and deletions. One of our guiding principles was that we always wanted to center the voices of people who belong to communities directly affected by the specific language. We used our own experience of belonging to certain communities, and/or did follow-up research, to ensure this.

For example, many inclusive language guides advise using person-first language (such as a person with a disability) instead of identity-first language (such as a disabled person) when talking about disability. However, many disabled people actually prefer identity-first language, and a lot of them strongly object to the advice that person-first language is better. For example, in an article titled Why Person-First Language Doesn’t Always Put the Person First, disability rights activist Emily Ladau argues that this advice is actually rooted in the stigma against disabilities. Needless to say, we didn’t want to include identity-first terms such as disabled person on our list of non-inclusive terms.

The next step was the implementation

The challenge of context-dependence

Once we had a list of non-inclusive terms and their alternatives, we moved on to our next challenge. We needed to make sure that the inclusive language check only asks you to use an alternative if you’re actually using non-inclusive language. If we just searched for the terms from our list in your text, and always asked you to replace them, that would lead to a lot of inaccurate feedback. That’s because language is highly context-dependent. There are many terms that are okay to use in certain contexts but become non-inclusive in others.

Some examples:

  • The term First World is not inclusive when referring to a country or region. But if you’re talking about the First World War, the words First World change their meaning and become completely unproblematic.
  • The word guru is not inclusive when used as a general synonym for expert or mentor. But if you’re referring to an “actual” guru – a spiritual guide in religions such as Buddhism or Hinduism – it is perfectly appropriate.
  • The pronouns he, his, him, and himself are not inclusive when referring to people in general (for example, “Everyone has his own preferences”). But of course, you can (and should!) use these pronouns when talking about a specific person who uses these pronouns.

For a human, it’s often clear at a glance whether a term is inclusive or not in a given context. But it can be a lot harder for a machine. And that was one of the challenges we were facing.
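To make the problem concrete, here is a minimal Python sketch of the naive approach described above: search the text for every term on the list and always flag it. The two-item term list and the function name `naive_flags` are assumptions made up for this illustration; this is not the code Yoast SEO actually runs.

```python
import re

# Hypothetical mini term list, for illustration only.
NON_INCLUSIVE_TERMS = ["First World", "guru"]


def naive_flags(text):
    """Return every listed term found in the text, ignoring context."""
    found = []
    for term in NON_INCLUSIVE_TERMS:
        # Whole-word, case-insensitive match.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, text, re.IGNORECASE):
            found.append(term)
    return found


# The sentence is about the First World War, yet the term is still flagged.
print(naive_flags("He fought in the First World War."))
```

Because the matcher never looks at the surrounding words, the sentence about the First World War is flagged just like a sentence about "First World countries" would be.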

Solutions to address context-dependence

In the end, we came up with two different solutions for addressing context dependence:

  • Some terms, such as First World, stop being non-inclusive when followed or preceded by specific words (such as War). In those cases, adding a simple rule to our algorithm was fortunately enough. For example, we tell our algorithm not to target First World when it is followed by War.
  • In the case of context-dependent terms for which adding such a rule is not possible, we had a different strategy. We always target those terms, but you will never get a red light when you use them. Instead, you will see an orange light and a feedback string that explains in which context the term is inclusive. For example, if you use the word guru, the feedback will say: “Be careful when using guru as it is potentially harmful. Consider using an alternative, such as mentor, doyen, coach, mastermind, virtuoso instead, unless you are referring to the culture in which this term originated.”

So, if you get this feedback and you are using the word guru to refer to the culture in which this term originated, you don’t have to do anything (remember, it is not necessary to make all your lights green!).
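The two strategies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Term` structure, the `not_followed_by` field, the `assess` function, and the choice of a red light for First World are all hypothetical and not taken from Yoast SEO's code. Only the guru alternatives come from the feedback string quoted above.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    phrase: str
    alternatives: tuple = ()
    # Assumption: "red" for terms targeted outright, "orange" for
    # context-dependent terms that only get a cautionary message.
    severity: str = "red"
    # Words that, when they directly follow the phrase, make it unproblematic.
    not_followed_by: tuple = ()


TERMS = [
    # Strategy 1: a simple exception rule (alternatives omitted here).
    Term("First World", not_followed_by=("war",)),
    # Strategy 2: always target, but never with a red light.
    Term(
        "guru",
        alternatives=("mentor", "doyen", "coach", "mastermind", "virtuoso"),
        severity="orange",
    ),
]


def assess(text):
    """Return (phrase, severity) pairs for each targeted occurrence."""
    results = []
    for term in TERMS:
        pattern = r"\b" + re.escape(term.phrase) + r"\b"
        for match in re.finditer(pattern, text, re.IGNORECASE):
            # Look at the word right after the match, without punctuation.
            rest = text[match.end():].split()
            next_word = rest[0].strip(".,;:!?").lower() if rest else ""
            if next_word in term.not_followed_by:
                continue  # the exception rule applies, so skip this match
            results.append((term.phrase, term.severity))
    return results


print(assess("He fought in the First World War."))
print(assess("She is a marketing guru."))
```

In this sketch the exception rule removes the false positive for First World War entirely, while guru is still reported but with the softer orange severity, leaving the final judgment to the writer.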

Remaining challenges

With these strategies, we manage to target a lot of non-inclusive terms. But there are still some remaining challenges. For example, the pronouns he, his, him, and himself are very common words, and they are most often used in an inclusive way (talking about a specific person who uses these pronouns). Sadly, there is no simple rule that a machine could use to tell apart the inclusive and non-inclusive uses. And while we could always target these pronouns with an orange light, this would lead to a lot of cases when people get feedback only to find out that they can ignore it. We thought this would be an annoying experience, and so for now, we don’t target these pronouns at all. Maybe it’s something our team will find a nice solution for in the future, though!

Try out the analysis in Yoast SEO

The inclusive language analysis is available in both Yoast SEO free and Premium. If you’re a user of our plugin, make sure to activate this analysis and give it a try! Here’s a preview of what it will look like while working on your content:

[Image: example of a check in the inclusive language analysis in Yoast SEO]

You can activate it by going to Yoast SEO > General > Features and toggling the inclusive language analysis switch. If you’re not using Yoast SEO currently, but do want to give our inclusive language analysis a try, get the Yoast SEO plugin now.
