DIY: Duplicate content check

June 22nd, 2016 – 10 Comments

Duplicate content might confuse Google. If your content is on multiple pages, on your own or on other websites, Google won’t know which page to rank first. Prevent duplicate content as much as possible. Perform a duplicate content check every now and then to find copied content.

In the Advanced > RSS section of our Yoast SEO plugin, we have predefined a snippet to add to your feed entry saying “This article first appeared on yourwebsite.com”. The link in this snippet makes sure that every scraper includes the link to the original article. Of course, this already helps to prevent duplicate content, as Google will find that backlink to your website.

Nevertheless, if you write awesome content, your content will be duplicated. And that copy won’t always include a link to your website. All the more reason to do a duplicate content check on a regular basis. In this article, I will show you quick ways to find duplicate content for your website.

CopyScape duplicate content checker

There are a lot of tools to find duplicate content. One of the best-known duplicate content checkers is probably CopyScape.com. This tool is pretty easy to use: insert a link and CopyScape tells you where else the content of that page can be found:

[Image: CopyScape: duplicate content checker]

That’s step one. It will return a number of results (9 in this case), presented like Google’s search result pages. Simply click one for more details.

[Image: duplicate content checker - CopyScape]

In this case, 2% of the Creativ Form page is copied from our website. CopyScape nicely highlights the text it found to be duplicate. That way, this duplicate content checker gives you an idea of how severe the copying is. If it’s just 2% of the page, like in this case, I wouldn’t worry. If it’s over 40%, that makes up quite a large part of the other page and I would simply email them and ask them to change the copied text.
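
If you want a rough feel for how such a percentage could be calculated, here is a minimal sketch in Python. This is not CopyScape's actual method, just an illustration. It assumes that a run of five identical words in a row counts as copied; the function names and the five-word window are made up for this example.

```python
import re


def words(text):
    """Lowercase the text and split it into plain words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def copied_percentage(original, other, n=5):
    """Percentage of words in `other` that sit inside an n-word run
    that also occurs in `original`."""
    source = words(original)
    target = words(other)
    if len(target) < n:
        return 0.0
    # Every run of n consecutive words in the original text.
    known = {tuple(source[i:i + n]) for i in range(len(source) - n + 1)}
    covered = [False] * len(target)
    for i in range(len(target) - n + 1):
        if tuple(target[i:i + n]) in known:
            # Mark all n words of the matching run as copied.
            for j in range(i, i + n):
                covered[j] = True
    return 100.0 * sum(covered) / len(target)
```

Feed it the text of your own page as `original` and the text of the other page as `other`. With the thresholds from above, a result around 2% is nothing to worry about, while anything over 40% is worth an email.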

By the way, dear Creativ Form. If you want to copy our content, please tailor it to your website. “In this article” makes absolutely no sense in this case :P

By the way, we frequently find that manufacturer descriptions used in online shops are duplicate. Usually, these are automatically imported into the shop’s content management system, and usually not just into yours. Be aware of this. I understand it’s quite the hassle to write unique product descriptions for every product, but at least start with your best-selling products and take it from there. Start now.

Use the CopyScape duplicate content checker to find copied content from your website on other websites. Again, it’s one of many tools but this one’s free and easy to use. If you want to dive a bit deeper into your duplicate content, CopyScape also offers a premium version for more insights at 5c per search.

Siteliner internal duplicate content check

Siteliner is CopyScape’s brother that searches for internal duplicate content. This duplicate content checker will find duplicate content on your own site. A very common example of this is when a WordPress blog doesn’t use excerpts but shows the entire blog post on the blog’s homepage. That simply means that the blog post is available on at least two pages: the homepage and the post itself. And probably on the category and tag overview pages as well. That’s four versions of the same article on your own website already.

The advantage of using excerpts is that the excerpt always has a proper link to the post. This link will tell Google that the original content is not on that blog/category/tag page but in the post itself. I think we recommend the use of excerpts in half of all the WordPress website reviews we do. That also means half of those websites actually have this internal duplicate content issue.

The Siteliner duplicate content check will show you a lot of things, but the free version is limited to 250 pages and 30 days. Again, there is a premium version, but the free one will already give you a good idea. Just do a search, find the overview page and click through to the details. Don’t get scared by high numbers of internal duplicate content, as this duplicate content check even tells you the excerpts are duplicate content:

[Image: SiteLiner: internal duplicate content check]

Percentages

Whereas Google understands what a sidebar is, CopyScape and Siteliner seem to include all text on a page in their percentage calculations. Please keep this in mind when you use one of these duplicate content checkers. The actual percentage of duplicate content, when just looking at the main content of a page, might be higher. Just a heads-up!

Am I worried? No. Simply click one of the links and check if it’s indeed the excerpt (it is). The total of the matched words is 223, but in fact, the ‘duplicate part’ is just 57 words of 1,086 words in total in the main content section of that article. And the excerpt obviously links to the post, so we’re covered.
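
For the record, this is the arithmetic behind that, as a small Python sketch. The function name is made up for illustration; the numbers are the ones from this example.

```python
def duplicate_share(duplicate_words, total_words):
    """Return the duplicate words as a percentage of the total word count."""
    return 100.0 * duplicate_words / total_words


# 57 duplicate words out of 1,086 words of main content.
main_content = duplicate_share(57, 1086)
print(f"{main_content:.1f}%")  # prints 5.2%
```

So the excerpt accounts for a little over 5% of the main content of that article.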

Manual duplicate content check

CopyScape and Siteliner are nice, easy-to-use duplicate content checkers. However, if you want to see what’s duplicate according to Google, you could also use Google itself.

If you have a certain page that you’d like to check, simply go to that page. Copy a text snippet, preferably from a section that you think might be attractive for others to copy. Insert the exact snippet in Google using double quotation marks like this:

[Image: Duplicate content check in Google]

“WordPress is one of the best, if not the best content management systems when it comes to SEO. That being said, spending time on your WordPress SEO might seem like a waste”. Limit that phrase to 32 words, as Google will only take the first 32 words into account. This search query returns ‘about 517 results’ according to Google, which is well over the 9 results CopyScape returned.
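
If you run this check often, a small script can build the query for you. The Python sketch below trims a snippet to 32 words, wraps it in double quotation marks and turns it into a search URL. The function names are hypothetical, and the URL format (https://www.google.com/search?q=...) is an assumption about how Google's search page accepts a query. It only builds the link; it doesn't fetch or parse any results.

```python
from urllib.parse import quote_plus

MAX_WORDS = 32  # the word limit mentioned in this article


def exact_match_query(snippet, max_words=MAX_WORDS):
    """Trim the snippet to max_words and wrap it in double quotation marks."""
    trimmed = snippet.split()[:max_words]
    return '"' + " ".join(trimmed) + '"'


def google_search_url(snippet, max_words=MAX_WORDS):
    """Build a search URL for the exact-match query (assumed URL format)."""
    return "https://www.google.com/search?q=" + quote_plus(
        exact_match_query(snippet, max_words)
    )


snippet = (
    "WordPress is one of the best, if not the best content management "
    "systems when it comes to SEO. That being said, spending time on your "
    "WordPress SEO might seem like a waste"
)
print(google_search_url(snippet))
```

Paste the printed link into your browser and compare the results with what CopyScape found.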

Check your own duplicate content

Use a duplicate content checker like CopyScape to find what has been copied from your site, and use Google to see where else on the internet this content ended up. These are simple tools that serve a higher goal: to prevent duplicate content. If you want to read more on duplicate content, start with our Duplicate content: causes and solutions article. Or visit our duplicate content tag page.

Read more: ‘rel=canonical: the ultimate guide’ »

 


10 Responses to DIY: Duplicate content check

  1. Allameh Tabesh
    By Allameh Tabesh on 4 July, 2016

    Oh! Copyscape is not 100% free; it has a premium version.
    Is there any 100% free portal like Copyscape?

  2. ahmet
    By ahmet on 26 June, 2016

    Thanks very useful article

  3. 250
    By 250 on 25 June, 2016

    Thank you for the post.
    There are many useful and interesting articles here.

  4. Runo
    By Runo on 24 June, 2016

    I have more than one domain, so I sometimes have to reuse an “old” text from one domain on another domain after a while. I delete the text on domain one and use it on domain two. The text will still show up in the online “Check for plagiarism”. Is that a problem in the long run?

    • Michiel Heijmans
      By Michiel Heijmans on 28 June, 2016

      I guess Google will figure out that the text is gone on domain one. Matter of time. So no real problem there, I guess!

  5. roshan sathe
    By roshan sathe on 24 June, 2016

    Can we check the content of our article before publishing?

    • Michiel Heijmans
      By Michiel Heijmans on 28 June, 2016

      It’s not duplicate as such until you publish it. But I bet you know if you copied content or not?

  6. Stuart
    By Stuart on 23 June, 2016

    “a snippet to add to your feed entry saying “This article first appeared on yourwebsite.com”. ”

    I cannot find this in your XML section of your plugin?

    Can you help?

    • Michiel Heijmans
      By Michiel Heijmans on 28 June, 2016

      Yes! I totally was wrong there, it’s in the Advanced > RSS section. My bad, very sorry for the inconvenience.

      Here it is:
      [Image: Advanced - RSS]
      And adjusted in the article accordingly. Thanks a lot for pointing that out, guys.

    • Marcus
      By Marcus on 27 June, 2016

      I cannot find it in the Sitemaps section either.

