Content Analysis with the WordPress SEO plugin

We’ve been rather busy with the WordPress SEO plugin the last few days. We did a release yesterday and a quick follow-up today to fix a few collisions with other plugins. There are loads of cool small fixes in there, but one in particular is worth highlighting, as it’s something other plugin developers might want to pick up on: a small but important change to our content analysis functionality.

For quite a while now, the WordPress SEO plugin has had a page analysis function baked in. The name is misleading, because it’s actually content analysis rather than page analysis, which is why I’ll be changing it soon. If you give it a focus keyword to test for, it analyses the content of your post and gives you hints and tips on how to improve it.

Every once in a while, we’ll get bug reports, on GitHub or through email, telling us that we’re wrong, and that we should analyse the entire page when doing the content analysis. I disagree, which is why we’re not doing it. Let me tell you why I disagree first.

Web Page Segmentation and Content Analysis

Search engines have been able to analyse the content of pages on a block level for quite a while now. Going into the specifics would take too much time here, but if you’re interested, read this post by Bill Slawski from 2009 or even this one, about a Google patent from 2006. Basically, search engines are able to tell what the content bit of a page is, what the sidebar is, what the footer is, etc. Using that segmentation, they judge your page by judging just the content section of it.

Building block level recognition like that into my content analysis function would be… undoable. It’s also unnecessary: we already know what the content is, so we can just take that and ignore all the other bits. Oh, and I’m not even halfway smart enough to do the kind of segmentation search engines do and keep your WordPress site running smoothly.

So the content analysis just fetches the post’s or page’s content and runs its analysis on that. It’s clean, it’s simple and it’s rather fast.

The Issue with Focussing on Post Content Analysis

There’s one issue with this approach: WordPress is being used more and more as a CMS. People are adding different blocks of content to pages in more and more ways. Plugins like Pods and Advanced Custom Fields are allowing people to be more flexible with their content blocks. We had to come up with something for that.

Another issue was that we didn’t parse shortcodes when doing the content analysis, so we didn’t recognise galleries correctly, whether that’s the native gallery or galleries added with, for instance, Next Gen Gallery. This meant we didn’t properly recognise all the images in a post and thus couldn’t output them in XML sitemaps and OpenGraph tags.

Now you might remember from installing the plugin, if you’re a user, that we ask permission to anonymously track data about your site. We collect that data specifically for these kinds of problems. Through this tracking database, which currently tracks about 650,000 sites, we looked at how big this particular issue was. About half of the sites we track run our WordPress SEO plugin, and of those, 10% also run Next Gen Gallery. Pods and Advanced Custom Fields aren’t as popular, but they are both growing rapidly. So it’s a serious and growing problem. Time to fix it.

The Solution

Yesterday, in 1.4.14, we had a first patch that tried to parse shortcodes to discover images for use in our OpenGraph tags. The results were painful. Apparently, loads of plugin developers don’t really understand how a shortcode should work according to its API, so it broke, on loads of sites, horribly. Several plugins suddenly failed, simply because we were doing a do_shortcode outside of the main body and the shortcodes were echoing instead of returning their content or doing rather ugly things to the post_content attribute of the post global. I have to say: that shouldn’t happen. But it did.
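For plugin developers wondering what “returning instead of echoing” looks like in practice, here’s a rough sketch. The shortcode tag example_gallery, the function name and the markup are made up for illustration; add_shortcode() is the standard WordPress function for registering a shortcode. I’ve kept the body in plain PHP so it runs on its own, but inside WordPress you’d normally reach for shortcode_atts(), esc_url() and esc_attr() instead.

```php
<?php
/**
 * Hypothetical shortcode callback; tag and function names are illustrative only.
 *
 * A shortcode callback should build a string and return it. If it echoes
 * instead, its markup is printed wherever do_shortcode() happens to be
 * called, which may well be outside the main body of the page.
 */
function example_gallery_shortcode( $atts ) {
	// Merge the user's attributes with defaults.
	// (Inside WordPress, shortcode_atts() is the usual way to do this.)
	$atts = array_merge( array( 'src' => '', 'alt' => '' ), (array) $atts );

	// Build the markup as a string, escaping the attribute values.
	// (Inside WordPress, esc_url() and esc_attr() are the usual choices.)
	$html = sprintf(
		'<img src="%s" alt="%s" />',
		htmlspecialchars( $atts['src'], ENT_QUOTES ),
		htmlspecialchars( $atts['alt'], ENT_QUOTES )
	);

	// Return the string. Don't echo it, and don't modify the global $post.
	return $html;
}

// Register the shortcode only when WordPress is loaded.
if ( function_exists( 'add_shortcode' ) ) {
	add_shortcode( 'example_gallery', 'example_gallery_shortcode' );
}
```

A callback written like this produces the same output no matter where or how often do_shortcode is run on the content, which is exactly what our first patch was relying on.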

So we released 1.4.15 today, which reverted that code. And now we’re left with only one option: providing plugin developers out there with a simple filter. This filter is called wpseo_pre_analysis_post_content and takes one argument: a string containing the post’s content. It’s used in several spots within the WordPress SEO plugin, with more to come, and it allows plugin developers to add their custom fields’ content to the content the plugin analyses by just adding on to that string.
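To give you an idea, hooking into the filter could look something like this. Treat it as a sketch under a few assumptions rather than a drop-in solution: the meta key example_extra_content and both function names are made up, it assumes your extra content is stored in a regular post meta field, and it assumes the global post is set at the point where the filter runs. add_filter(), get_post_meta() and get_the_ID() are standard WordPress functions.

```php
<?php
/**
 * Append extra pieces of text to the content string.
 * Hypothetical helper name, for illustration only.
 */
function example_append_to_content( $content, array $extra ) {
	foreach ( $extra as $piece ) {
		// Skip anything that isn't a non-empty string (e.g. a missing field).
		if ( is_string( $piece ) && '' !== trim( $piece ) ) {
			$content .= ' ' . $piece;
		}
	}
	return $content;
}

/**
 * Filter callback: receives the post's content as a string and returns it
 * with the custom field's content added on.
 */
function example_wpseo_add_custom_field( $content ) {
	// Assumption: the global post is available when the filter runs.
	$post_id = get_the_ID();
	if ( ! $post_id ) {
		return $content;
	}

	// 'example_extra_content' is a made-up meta key; use your own field here.
	$extra = get_post_meta( $post_id, 'example_extra_content', true );

	return example_append_to_content( $content, array( $extra ) );
}

// Only hook in when WordPress is loaded.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'wpseo_pre_analysis_post_content', 'example_wpseo_add_custom_field' );
}
```

The important bit is that the callback always returns a string, with or without anything added, so the analysis still has the original content to work with when your field is empty.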

It’s a simple enough change for us to make, but it opens up a world of possibilities. I hope people will use it and I’d love for you to tell us in the comments if you do!


67 Responses

  1. By Yousaf on 22 August, 2013

    Good work!
    Will have to test this on our site as we use ACF heavily.

  2. By Phil Singleton on 22 August, 2013

    “to fix a few collisions with other plugins”…thanks so much, was pulling my hair out installing WPSEO on some new client sites. Never had a single issue before the last couple weeks. Two slider meltdowns and once theme conflict.

  3. By Andre on 23 August, 2013

    Sometimes it pays not to update too soon ;-)

    And! I did not know Pods. I have tried it and i like it. Tnx for the “recommendation”! It is perfect for database-content inside a blog.

    Regards from Germany,
    Andre

  4. By Mark D. Hulett on 23 August, 2013

    Good stuff as usual Joost. Totally agree that the focus should remain on the “content” area and not the entire page.

    Keep up the great work!

  5. By Jon Brown on 23 August, 2013

    I’ve seen a bunch of plugins basically write their own short code API (using the term generously) that do horrific things to the post content. Most recently I think it was JW Player, but I’ve seen share buttons do similar bad things. Doesn’t surprise me things broke.

    It does surprise me, again, that this broke in release rather than any sort of beta/testing. That really needs to change and I don’t just mean for the upcoming pro version. Get a lot of knowledgable people running your beta code and test it. Stop testing on the main user base in production. Please!

    • By Joost de Valk on 23 August, 2013

      Jon, you don’t know of which you speak. I’ve had 10 or so reports of breakage, which I consider a lot, but by that time the plugin was already running on tens of thousands of sites. Most of the plugins that broke were obscure slider plugins and obscure themes, there was one “big” plugin that broke in one particular configuration. Exactly what in that plugin broke is something I’ll research today and file a bug report with the authors on.

      I test in loads of environments and with loads of plugins but you can’t account for having a million+ users and all their different configs.

      • By Jon Brown on 26 August, 2013

        Yoast – LOL… I was basing my comment of your report in this blog post that it broke. To quote you “…so it broke, on loads of sites, horribly.” If I don’t know of what I speak, it’s because YOU are misinforming people.

        You can’t say it broke loads of sites horribly and then in the comments I only had 10 or so reports of breakage. I mean I guess you can, because you did, but WTH?

        • By Joost de Valk on 26 August, 2013

          It broke on more sites than 10, I just didn’t get more reports. But I wouldn’t have rushed a fix out if I didn’t think it was serious.

          The question is: could better testing have prevented this? And to be honest… I doubt it.

  6. By Armin on 23 August, 2013

    Vs. shortcode echos:
    Wouldn’t it have been possible to replicate the_content and parse the returned string in a regexp to get the images? This way you circumvent the shortcodes.

    I would hook this to the save post event and store the result in a meta, so you don’t do this on every page display (since it might be a bit intensive).

    • By Joost de Valk on 23 August, 2013

      Well the things I’ve seen yesterday strongly make me think that’d crash the save process…

      • By Mark D. Hulett on 24 August, 2013

        Can confirm that if hooked into the save post event does causes issues… :-(

  7. By Jon on 23 August, 2013

    Thanks for the update. I have used wordpress seo on a number of sites and I find it very intuitive to use.

    I hadn’t appreciated that the page analysis feature was available until now – I tend to use the page grader from Moz – but this is a really helpful feature when developing content.

  8. By Hassan on 23 August, 2013

    Thankfully, I did not have anything break after updating to v1.4.14, but nonetheless I updated to v1.4.15 just in case.

    Off-topic: Joost, I recently sent you an email to joost @ this domain titled “Migrating translations to Transifex”. I’m not sure if you were able to read it, but I’d like to know your response if possible. Thanks.

  9. By Sue on 23 August, 2013

    A little off topic but I just wanted to say a massive thank you for producing the SEO plugin. I’m trying to get my head around SEO and I found your plug in really easy to use. I look forward to seeing the results.

  10. By Andrea Cimatti on 23 August, 2013

    The wpseo_pre_analysis_post_content filter seems a very flexible solution. One question comes to mind: why doesn’t the WPSEO consider the featured image as part of the content by default? It is certainly page specific and standard.

  11. By Dealstan on 23 August, 2013

    Great work Joost, as always!!

    Really, content will be the focus, and at dealstan, we are focusing more on contents.

    Until now, Google is loving us.

  12. By Jennifer Ashton on 24 August, 2013

    Hey Joost,
    Purchase your WooCommerce SEO extension yesterday and I had say “I just love getting those little green lights to come up on my products”. We have a lot of work to do with 200+ Organic Products but wanted to share our excitement and thank you for a wonderful, easy to use wordpress plug-in. Can’t wait to start seeing the traffic results! Jen,

  13. By Jan Merrifield on 24 August, 2013

    Thanks for this information and keep up the good work in this website. Cheers!

  14. By Mathias Foster on 24 August, 2013

    Keep up the great work Joost! Your plugin is definitely the best SEO plugin out there.

  15. By Ruairi Phelan on 24 August, 2013

    The wpseo_pre_analysis_post_content() filter is an elegant solution. Your post’s Joost are a fantastic resource for learning WP plugin dev, in addition to all things SEO, as they are very informative and honest. Thanks again!

  16. By Matthew Jones on 25 August, 2013

    >trying to optimize content
    >2013

    Pick one.

    Synonyms and the world of co-occurrence is upon us.

    • By Joost de Valk on 25 August, 2013

      Matthew, you’re vastly over-estimating their impact if you think that removes the need for content optimization.

  17. By Stephen Brian on 26 August, 2013

    I am using your SEO plugin for at least 80 percent of my sites and I am loving it so much for its useful features. Good to see the change in the functionality of content analysis as well.
    In every post, I am individually copying the Post Title in the Meta Title box but while indexing it is showing as “Post Title+Site name” in SERP. Any idea Joost why site name is also indexing besides post Title? Thank you very much.

  18. By Rana Irfan on 26 August, 2013

    Hi Joost here is an other great post. I really enjoy it good work. I have a problem.
    I am using SEO plugin by yoast. In Google Webmaster tool I saw duplicate meta description and duplicate title tags. Please Dear I am so worry about that. Please suggest me about seo plugin setting and how to remove duplicate meta description. I think it is my mistake about title tags. because i indexed tags. Because of this. Duplicate title tags happened. I will fix them but please i don’t know about Duplicate meta description. My 90% traffic decrease because of duplicate. Before I were receive 2500 visitor/ day but now only 250 visitor/day
    Dear Please I am waiting your reply i am so worry about that. I am thankful to you for this kindness. Thanks in Advance.

  19. By James Roberts on 26 August, 2013

    I have been using Yoast WP plugin for the last few months, and have seen a great increase in my search engine ranking on Google. It has made my life so much easier when it comes to effectively optimizing my web designs.My question is, are there any quality video tutorials I can recommend to my design clients so they can easily use Yoast? Thank you again for a great plugin.

  20. By Yoyo Subagyo on 27 August, 2013

    very very good!! thanks

  21. By Kiesha Canori on 27 August, 2013

    Your all plugins are awesome and in almost all my sites I have used your plugins. Your content analysis integration to WordPress SEO plugin really helps a lot to analysis content in my blogs as I am allowing the guest blogger on my site. Thanks for creating such a wonderful plugin.

  22. By Juliet on 27 August, 2013

    Thanks for this information and keep up the good work in this website. Cheers!

  23. By Jesin on 27 August, 2013

    A note on focus keyword and content analysis.
    The plugin checks for an exact match of the focus keyword with the content while search engines understand better.

    Example, searching Google for “Create gmail account” also highlights results with “creating a gmail account”

    The content analyzer also takes into account punctuations. So a focus keyword “troubleshooting windows 8” doesn’t match “Troubleshooting – Windows 8”

    • By Joost de Valk on 27 August, 2013

      I know it does, I’d like to fix that at some point but it’s a lot harder than you’d think ;)

      • By Shane Jennings on 13 September, 2013

        Would a short term fix for advanced users be possible… maybe the option to add our own regex? (with a use at your own peril disclaimer).

  24. By manu on 27 August, 2013

    This is very nice post. Useful information in this post.

  25. By Cristiano on 28 August, 2013

    Hi Joost ,

    When will the launch premium service?

    Is there any possibility for you to create an addon for BuddyPress.

    Thanks

  26. By Thomas P on 28 August, 2013

    I entered my details in your Blog Post sign up at the bottom of this blog post, hit enter and received this message:

    An error occurred…
    You have selected a list that does not allow duplicates. This e-mail address is already in the system. You can, however, edit the existing subscriber.

    Please fix!

  27. By Manpreet Kashyap on 28 August, 2013

    Hi Thanks for all your support with wordpress seo plugin..

    i want to set a category base to each of my category separately, is there any way to do this in your plugin.
    if it is currently not supported then please try to consider it in later release or suggest any other plugin that can help me.

    Great Sharing..!

  28. By Brandon on 28 August, 2013

    I’m using the latest version of the plugin on the Striking Theme. WP version 3.6. The page analysis is saying “no images found” on the home page. There are about 6 in the Layer Slider, and 6 other images inside Striking’s image shortcode [image][/image] So, the plugin is definitely not recognizing images inside shortcodes…. :(

    • By Joost de Valk on 28 August, 2013

      Ehm no, it doesn’t. That’s what the post is completely about :-)

  29. By Vance Hallman on 29 August, 2013

    Friggin AWESOME plugins Joost! I have the Video SEO as well and after ONE WEEK of using the Video SEO plugin my ratings SKYROCKETED! Anyway, I have an easy solution for your post “content” analysis. Your solution lies in how you fixed the same problem with the Video SEO plugin by allowing the blog owner to type in a custom field that holds the video URL. You could easily do the same for the SEO for WordPress plugin. The only difference would be that you would have to have a +(PLUS) sign to the right of the first entry so that the blog owner could add multiple custom fields from mutiple other plugins.

  30. By Vance Hallman on 29 August, 2013

    One other question. How do we get a video plugin added to the supported plugins? I finally was able to get the one I use working with your Video SEO plugin.

    • By Joost de Valk on 29 August, 2013

      Sent us a patch through plugin support and we’ll very happily take a look!

  31. By Jason Donnini on 29 August, 2013

    So the filter needs to be added to the Advanced Custom Fields plugin in order to work?

  32. By Mark Lewis on 30 August, 2013

    Excellent information, keep up the great stuff. It’s been a while since we’ve last heard from you…LOL

  33. By Dee on 31 August, 2013

    Great job – I’ve been using this plugin almost for as long as I’ve been using WordPress and it just keeps getting better. What would really make it a complete all-in-one solution for me would be the inclusion of elements like Dublin Core and Rich Snippets – maybe as a separate tab next to the content analysis?

  34. By Ashok Jaria on 1 September, 2013

    Very good plugin.I really appreciate…
    Thanks,
    Ashok Jaria

  35. By Jack on 1 September, 2013

    Hmm…I just noticed that my Post Title Template (set like this: %%title%% – %%sitename%% – Special Needs Apps – Autism Apps) is only showing the Title and Site Name … not my verbiage of “Special Needs Apps – Autism Apps”. Any idea if this is related to one of the updates? My pages are set the same and are working fine. Thanks!

  36. By marylruby on 3 September, 2013

    Excellent plugin , it’s very useful in WordPress projects .

  37. By Ali on 3 September, 2013

    Exactly what the plugin needed. Great job, especially with increasing demand for flexibility when using WordPress blocks.

  38. By Pete on 4 September, 2013

    Yoast – do you know if there’s a conflict with the Types-Views plugin? In my archive pages, when I embed a video embed code, it mysteriously disappears.

    Oh, I’m running your video seo and also wordpress seo with types-views.

  39. By Alfred Beiley on 5 September, 2013

    Nowadays, content is the most important for website visibility and search engine ranking. Your content analysis plugin really help to analyze content in blog and website. Thanks for this great plugin.

  40. By kishore on 5 September, 2013

    Awesome and marvelous explanation

  41. By Posicionamiento Web on 6 September, 2013

    Great work Joost, as always!!
    Thanks for the update. I have used wordpress seo on a number of sites and I find it very intuitive to use.
    Greetings from Argentina .

  42. By Rob on 6 September, 2013

    I love the Yoast SEO plugin, I use it on all my WP sites, with the added advantage of the content analysis filter.

    Allowing you to easily refine your content to target kewords, after a few adjustments.

  43. By marcel on 7 September, 2013

    Hi excellent post

    tell me its is a possibilité to Automatiquelymy FOCUS KEY WORD : with the title like %%title%%
    I found no place to write it downs

    i have so many enterprise to file out manually that would help me alot if a certain base is done at first

    Tx for your answer
    ( DID i have a way to field automaticly my focus key word with %%title%% )
    tx

  44. By Nyssa on 8 September, 2013

    My apologies if this is a duplicate post; nothing showed up the first time I tried to post this.

    I’d like to suggest an additional feature for the XML Sitemap: One thing I really miss from the Google XML Sitemaps plugin is telling it how often I want each part of my blog indexed: ie, front page daily, categories monthly, archives yearly, that sort of thing.

  45. By Josh Trenser on 9 September, 2013

    Wow, you are making this plugin better and better!!! Thanks a lot!!!

  46. By Robbie on 10 September, 2013

    I recently migrated from AIOSEO to Yoast. Whilst I enjoy the content analysis feature on pages and posts, I have one problem. Previously when I published a post, it was indexed by Google within an hour. Since migrating to Yoast, it is taking Google closer to 24 hours to index my posts. Is this a known issue with Yoast? The only update services I ping are http://rpc.pingomatic.com
    http://blogsearch.google.com/ping/RPC2
    http://ping.myblog.jp.

    • By Joost de Valk on 10 September, 2013

      Not really a known issue, our posts are indexed within seconds :)

      • By Robbie on 10 September, 2013

        Thanks Joost, this one is baffling me actually – I migrated to Yoast last week and published a post which was indexed within 5 minutes.

        Last night I published a post at 11pm and Google still hasn’t indexed it as of yet!

        I even followed your content analysis to a tee and got green lights all over the place (bar the flesch reading ease test) so am eagerly awaiting the almighty Google!!

        Thanks for your reply.

  47. By J P Nayak on 11 September, 2013

    Really I have no idea about this Plugin, but thinking about trying this.
    Does it analyze the keyword density, weight etc. ?

  48. By Ryan J Cropper on 12 September, 2013

    Hey love the Post great Article, A lot of interesting Content witch I can relate to when it comes to writing Blogs, Keep posting and I’ll keep visiting 

  49. By Ryan J Cropper on 12 September, 2013

    Hey love the Post great Article, A lot of interesting Content witch I can relate to when it comes to writing Blogs, The ONE Main thing I like About your Blog…and I’m sure others would agree is Your animations Around your page. lol, Classy LIGIT!

  50. By SEO services on 13 September, 2013

    Great post. Awesome strategy, I know it works, after all i am on your site due to that reason, Great content! I haven’t use this but now thinking to try it.

  51. By nur on 13 September, 2013

    Keywords always need to be exact match with url, title and content. But with my blog type i always getting trouble to match keywords with content.

  52. By Josh on 20 September, 2013

    Joost, thanks so much for your awesome plugins. Just brilliant! And I especially love seeing a clever online business that has built an audience through giving away something for free (your plugins) and using that to sell related and useful courses, services and affiliates resources. Well done!

  53. By Paul Zen on 21 September, 2013

    Hi
    Am new to WordPress, though have been using joomla cms, a friend introduced me to WordPress and ever since I have been using it to create sites, I learnt your company
    developed the famous WP SEO PLUGIN, pls is there any documentation for this plugin as I will like to install and experiment with it. likewise how do I get access to your other plug ins. thanks