Why relative URLs should be forbidden for web developers
Many web developers still use relative URLs in their CMS. A relative URL is a URL that is not complete. Usually it’s just the last part (the path) of a URL, which means the domain name is left out. Web developers often use them because they come in handy when moving content from a test or staging environment to a live environment. However, relative URLs have so many disadvantages for SEO that I strongly recommend against them.
All sorts of SEO problems on the web are caused by the use of relative URLs in links, canonicals and more. We find issues with them in our website reviews on a regular basis, but as you can see bigger sites like Twitter also have massive issues because of them. I’ll try to explain why you shouldn’t use them and what you could do instead, as it might be simple things like this that hold you back from performing well with your website.
What are relative URLs?
Relative URLs are all URLs that do not contain a fully qualified domain name and path, but instead just the path or a portion of the path. So when your website is example.com, you could be linking to your contact page from your homepage like this:
And back to your homepage from your contact page like this:
The / refers to the directory / on the domain. So even when you’re three levels deep in a directory structure, linking to / would link to the frontpage. Lastly, when you’re on the corporate page of your about section, for instance example.com/about/corporate.html, you could link to your contact page like this:
All the resulting URLs are calculated by your browser based on the base URL. By default, this is the current URL that’s in your location bar, but using the base element, you could set it to something else, like this:
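Doing this changes how path-relative links resolve: if the base is set to http://www.example.com/subdirectory/, a link to contact/ resolves to http://www.example.com/subdirectory/contact/. A root-relative link such as / still resolves to the root of the host, http://www.example.com/.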
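The resolution rules described above can be tried out with Python's standard `urllib.parse.urljoin`, which follows the same reference-resolution rules browsers use for ordinary links. This is only a minimal sketch: the page URLs and link targets are hypothetical stand-ins on the article's example.com site, not the markup from the original examples.

```python
from urllib.parse import urljoin

# Hypothetical pages on the example.com site used in this article.
HOMEPAGE = "http://www.example.com/"
CORPORATE_PAGE = "http://www.example.com/about/corporate.html"
# Hypothetical value for a <base href="..."> element.
BASE_ELEMENT = "http://www.example.com/subdirectory/"


def resolve(base_url: str, href: str) -> str:
    """Resolve a link target against a base URL, as a browser would."""
    return urljoin(base_url, href)


# Root-relative link from the homepage to the contact page.
print(resolve(HOMEPAGE, "/contact/"))
# A link to / from a deeper page leads back to the front page.
print(resolve(CORPORATE_PAGE, "/"))
# A dot-relative link climbs out of /about/ first.
print(resolve(CORPORATE_PAGE, "../contact/"))
# A path-relative link is resolved against the base element when one is set.
print(resolve(BASE_ELEMENT, "contact.html"))
```

The same link target can end up at very different places depending on the base it is resolved against, which is exactly what makes relative URLs fragile.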
This was all fine when HTML was invented and websites consisted of real static HTML pages in directory structures. Now though, most of the web is built with content management systems, changing URLs is easier and some URLs might behave differently than what you’d expect. Because of that, relative URLs can cause a few different types of issues, all of which can be pretty detrimental for your SEO and your server performance.
Why are relative URLs still being used?
Relative URLs are often used because developers have a test environment on another hostname and it makes it easy for them to move stuff between their test environment and their live environment. Other reasons include that it’s “just easier in website maintenance”. They’re also, in my opinion falsely, promoted by some websites about site speed because they’re “shorter” and thus “faster”.
In reality, all of these reasons are false when you look at the bigger picture. The few minutes a developer might save by using relative URLs are offset by countless hours an SEO might be spending to solve the issues caused.
Some of the problems caused by relative URLs
Issues caused by the use of relative URLs are vast and plentiful, and any seasoned SEO can probably give you a few examples of clients that have had huge losses because of them. Let me show you a couple of them:
A completely indexed test environment
When you have a menu structure that relies on relative URLs, one wrong link in your content that points to your test environment can cause the entire test environment to be spidered and indexed, causing massive duplicate content issues. This happens more often than you think. In fact, have you checked whether the test environments you used for your last few development projects are indexed by Google? I bet some of you will now find out that they are.
Spider traps
Most of the “spider traps” I’ve found were caused by wrongly used relative URLs. Let me show you an example: a site linking to ./example/ instead of ../example/ from the /contact/ page. A link to ./ means you’re linking to the current directory. When the current URL ends in /contact/, a link to ./example/ resolves to /contact/example/, so clicking that link would take me to http://www.example.com/contact/example/. If your CMS is set up to serve the same page for /contact/example/ as it serves for /contact/, which is a very common case, you now have a spider trap: that /contact/example/ page also links to ./example/, which now resolves to /contact/example/example/, which in turn links to ./example/ again and thus to /contact/example/example/example/, and so on. You probably get the issue, and I hope you also understand why this could be very detrimental for your search engine rankings.
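To make the recursion concrete, here is a small Python sketch that follows the same link a few times, the way a crawler would. It assumes, as in the scenario above, that the server returns the same page (with the same link in it) for every URL it is asked for; the function name `follow_link` is made up for this illustration.

```python
from typing import List
from urllib.parse import urljoin


def follow_link(start_url: str, href: str, hops: int) -> List[str]:
    """Follow the same href repeatedly, assuming every page served
    contains that href again (the misconfigured CMS case)."""
    visited = []
    current = start_url
    for _ in range(hops):
        current = urljoin(current, href)
        visited.append(current)
    return visited


# The wrong link: every hop produces a new, deeper URL.
trap = follow_link("http://www.example.com/contact/", "./example/", 3)
# The intended link: every hop lands on the same URL.
safe = follow_link("http://www.example.com/contact/", "../example/", 3)

for url in trap:
    print("trap:", url)
for url in safe:
    print("safe:", url)
```

With the wrong link the crawler never runs out of new URLs to fetch, while the intended link settles on a single URL after the first hop.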
These kinds of issues are very easily found using a tool like Screaming Frog, which I think every webmaster should have in their arsenal.
Relative canonical URLs
Issues can also be caused by using relative canonical URLs. A canonical URL is supposed to link to the “perfect” URL for a piece of content on your website. If you use a relative link and also have a subdomain or test environment that’s indexed, you suddenly have several versions of a piece of content that all proclaim themselves as the canonical version of that piece of content… You can understand a search engine having a hard time dealing with this.
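A short Python sketch shows why this happens. The hostnames and the post path are assumptions made up for the illustration; the point is only that one and the same relative canonical resolves to a different absolute URL on every host that serves the page.

```python
from urllib.parse import urljoin

# Hypothetical relative canonical, as it would appear in the page's head.
RELATIVE_CANONICAL = "/blog/my-post/"

# Hypothetical hosts that all serve the same content.
HOSTS = [
    "http://www.example.com",
    "http://test.example.com",
]


def resolved_canonicals(hosts, canonical_href):
    """Return the absolute canonical each host ends up declaring."""
    return [urljoin(host + "/blog/my-post/", canonical_href) for host in hosts]


canonicals = resolved_canonicals(HOSTS, RELATIVE_CANONICAL)
for url in canonicals:
    print(url)
```

Each host declares itself canonical, so the element no longer tells a search engine which version to prefer.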
A little knowledge is a dangerous thing…
At Twitter, they figured out that they shouldn’t use relative canonicals. So a developer there thought he was smart and probably defined the domain part of the canonical URL using the HOST header information. This causes the very issue that I talked about in the introduction above, because now the IP result in the screenshot above has a canonical URL pointing to itself, causing Google to show Twitter’s IPs in search results everywhere instead of the proper domain…
WordPress core has this problem solved in a very nice way, using a couple of techniques:
Absolute URLs everywhere
Whenever WordPress outputs a URL, it’s always a full, absolute URL. For the domain name part of that, it uses the domain you set in the General settings. This is the type of solution everyone should use: the domain name should be in a configuration file. That still allows you to migrate easily between development environment and live environment, simply by using different configuration files.
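As a rough illustration of that idea (not WordPress's actual implementation), the sketch below builds absolute URLs from a per-environment setting. The `CONFIG` dictionary, the environment names and the `absolute_url` helper are all hypothetical; in a real project the setting would live in whatever configuration file your CMS uses.

```python
from urllib.parse import urljoin

# Hypothetical per-environment configuration; only this changes
# when the site moves from staging to live.
CONFIG = {
    "live": {"site_url": "http://www.example.com"},
    "staging": {"site_url": "http://staging.example.com"},
}


def absolute_url(path: str, environment: str) -> str:
    """Build an absolute URL from the configured domain.

    The function never looks at the request's Host header, so a visit
    via an IP address or a stray hostname cannot leak into the output.
    """
    base = CONFIG[environment]["site_url"].rstrip("/") + "/"
    return urljoin(base, path.lstrip("/"))


print(absolute_url("/contact/", "live"))
print(absolute_url("/contact/", "staging"))
```

Templates call the helper with a path, and the environment decides which domain ends up in the markup.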
Whenever WordPress detects that you are on a specific article but are not using the proper “canonical” URL, it’ll try to 301 redirect you to the correct version. For the cases when it doesn’t detect this (it for instance ignores query parameters added to the URL), there is:
The canonical link URL element
When you’re on a single post or page, WordPress puts out a canonical link element, based on what the URL of the current article should be, irregardless of what’s in your browser’s location bar. Our WordPress SEO plugin extends this functionality to display canonical link elements just about everywhere within WordPress, and you should do this in your CMS too.
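The two mechanisms described above, the canonical redirect and the canonical link element, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not WordPress's real logic: the function names are invented, and the comparison deliberately ignores the query string to mirror the case where added query parameters go undetected and the link element has to do the work.

```python
from html import escape
from typing import Optional, Tuple
from urllib.parse import urlsplit


def canonical_redirect(requested_url: str, canonical_url: str) -> Optional[Tuple[int, str]]:
    """Return (301, canonical_url) when scheme, host or path differ,
    or None when no redirect is needed. Query strings are ignored."""
    requested = urlsplit(requested_url)
    canonical = urlsplit(canonical_url)
    same_location = (
        requested.scheme == canonical.scheme
        and requested.netloc.lower() == canonical.netloc.lower()
        and requested.path == canonical.path
    )
    return None if same_location else (301, canonical_url)


def canonical_link_element(canonical_url: str) -> str:
    """Render the canonical link element with an absolute URL."""
    return '<link rel="canonical" href="{}" />'.format(escape(canonical_url, quote=True))


CANONICAL = "http://www.example.com/my-post/"  # hypothetical article URL

# Wrong host: redirect to the canonical version.
print(canonical_redirect("http://example.com/my-post/", CANONICAL))
# Only a query parameter differs: no redirect, the link element covers it.
print(canonical_redirect("http://www.example.com/my-post/?ref=feed", CANONICAL))
print(canonical_link_element(CANONICAL))
```

Because the canonical URL is computed from what the article's address should be, both functions give the same answer no matter which hostname or IP address the visitor used.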
Twitter’s issue could be rather easily resolved, as we’ve discussed, by using proper absolute URLs everywhere in their code. There are no really good arguments against doing that. While Twitter is not a direct e-commerce site and might not have the biggest of issues with losing a bit of traffic, I’ve had issues with relative URLs and relative canonicals at clients that have cost those clients upwards of a hundred thousand euros. The very small gain in web development time, if any, is never, ever, worth that.
So you should use absolute URLs at all times, use canonical redirects when possible, and ideally put canonical link elements on every page you serve. After all, when you’re building a brand, do you really want to lose that brand in the search result pages? I think that’s a waste and I’m guessing you do too.
60 Responses to Why relative URLs should be forbidden for web developers
Great post. Interesting info about Twitter.
Since WordPress adds the full URL to href and img src, and I typically build sites on a development subdomain, I just add this to .htaccess a few days post launch (after propagation):
This redirects the dev domain to the production domain.
Then, using phpMyAdmin, I run a script to replace dev.server.com with the production domain.
IMO: Fully qualified URLs just cause different problems. Root relative URLs should be your go to default until you identify one of the problems that only fully qualified URLs can solve (canonicalization, etc.)
Thanks for a definitive explanation I can point to, the next time someone sings the praises of relative URLs. Since I work primarily with small WordPress sites, and can use WordPress plugins like Search and Replace, there’s never a reason for them. Absolute references all the way! Enjoyed the comments – you certainly attract smart discussion.
I suppose there are pros and cons of both. Having a full URL would mean slower loading times because every page loaded has to requery the DNS, where I believe local relative URLs/page names don’t.
Google likes fast sites…
For some reason, I always thought that relative URLs were a better way, but when I researched it, I never found any supporting arguments for that. I just thought that using absolute URLs was putting a load on the server. I am all for using absolute URLs as it reduces all potential problems. Great post.
Didn’t know much more about it. Tnx mate for discussing it..
Thanks for the useful tip.. i used to prefer relative URLs, but the idea of having main domain name in configuration file solves that problem.
This article highlights a pet hate of mine, working on a website after another dev who doesn’t use fully qualified names! Great article as usual.
Great post @Joost and some really insightful comments to make it a practical guide for rookies
I’m sorry, I agree with Shawn K. Hall above.
1st, I have worked for some rather large companies like ETrade and doing what you say simply is not possible. In my day there, we had 5 different environments, more if you count each developer’s version on their dev machine. In a home grown application like that relative links are the only way to make the website work. Not every website is as simple as wordpress where it can build the full URL through the config.
2nd, I have a few years of SEO experience as well and you can overcome all of your problems with one simple rule: “Always use root relative links” as Shawn mentions above.
I have been using them for years on our websites and clients and have never had a problem with them resolving.
Never use ./ or worse ../ if you can avoid it, but I see template developers use it all the time for parts.
Ironically, this article is unreadable on an iPad–the left side of the page gets truncated, unless you happen to scroll all the way to the bottom and try to post a comment, at which point Safari realizes something is wrong and adjusts.
You should be more careful with your statements and assumptions sir.
Telling us that all relative urls should be forbidden because they produce not expected results when used inappropriately is not a proof.
Not sure if I can trust someone who uses ‘Irregardless’…
I hope you are not referring to CSS background urls. That is a nightmare to maintain when you take into account that when development starts assets aren’t yet on production + we are dealing with prod, staging, qa, and development urls.
On the plugins site
16 people say it works.
12 people say it’s broken.
I noted my problems here http://wordpress.org/support/topic/plugin-wordpress-seo-by-yoast-a-lot-of-bugs-in-newest-version?replies=2
And I say it again, a plugin should be easy to use (Plug and play) and with less problems from a version to another.
In my opinion you are partially wrong and partially right. Drupal, for example, uses relative URLs, even in the canonical tag.
But I think a base tag should be specified in the head, and this base tag should have an absolute URL.
P.S.: Your WordPress SEO plugin is a nightmare from version 1.2 until now. Hope you will fix it.
“Drupal uses it” so it must be good? Wrong.
As for WP SEO, it’s working perfectly fine on most sites, so I wouldn’t call it a nightmare.
Thanks for the post Yoast.
I would like to also voice my like for ScreamingFrog – it’s an invaluable tool to have in the kit and I’ve been using it happily for years.
Great for doing reconnaissance on client sites that have ‘issues’.
I had no idea that this was such an issue. I’m going to go back and fix my links now. Thank you kindly.
I’m kinda a beginner to WordPress, but it appears that the default menu feature (Appearance > Menus) outputs relative URLs. Is this what you’re talking about as a no-no? Is there any way I can change this default behavior so all URLs are outputted in full? (i.e. http://www.stridepestcontrol.com/services/ instead of /services/) Thanks for the great article.
Joost, well argued, regardless of whether Twitter itself is correctly configured. The prior argument of which I knew against relative URLs was that they made it easier for a scraper to steal your pages and just slap them up on another domain with minimal editing. That never seemed to be a very convincing argument, but it was out there regardless. Thank you for adding some better arguments!
(That Reply link doesn’t seem to do much in Google Chrome)
Fixed, that was a bug from me messing around with Bill’s stuff :)
I’m not seeing a wp_config.php option to store relative URLs in the database. WP_SITEURL and WP_HOME might be what you were thinking of, but image and widget URLs, et al, don’t seem likely to be affected.
Which is why the “scrubbing” feature in DesktopServer is a nice convenience feature. However, when coding with WordPress’ API, one should really leverage the get_bloginfo(‘url’) function.
Yeah I’ve got DesktopServer on my Desktop to test it out :)
Gonna have to disagree in part.
1st, when you say there is no good reason for Twitter not to use absolute URLs you are missing something that an SEO may not be focused on: the cost of bandwidth. For 99.9% of websites it’s not a valid concern, but for the volume of HTTP traffic Twitter processes, removing root domains from URLs can have a sizable impact on the cost of their bandwidth.
2nd, one of your justifications for why relative URLs are bad is that they can result in Google indexing a test environment. As Gigi said, it can also be a problem if you forget to update one of your test URLs and thus allow Google to index it. But adding to that, when we build a site we add a sniffer for Googlebot and similar bots to serve a 403 Forbidden when we are running a site that is not a production site. Problem solved.
3rd, another of your justifications for why relative URLs are bad is the recursion problem, which I definitely agree with, but that does not affect root-relative URLs, so it’s a bit disingenuous to use that as a justification against all relative URLs.
On the con side, the http vs https is definitely one of the bigger concerns.
As for me, while I don’t mind absolute URLs I prefer root-relative in most cases. Unfortunately with WordPress I can’t easily get that. But other than the http vs. https issue which isn’t always a concern, I don’t see other real downsides to root-relative URLs. Am I missing something?
For those who want to do a “search and replace” of URLs because you are developing on a test, I have a PHP script here that we developed at Choice OMG. It basically replaces every instance of an old URL with a new one.
Let’s not forget we all should be utilizing a CDN, which would make this a moot point. You need the absolute URL to use a CDN.
I disagree. You’re making assumptions that simply aren’t true. For example, you claim that using relative URLs is the cause of the problem in this case. That’s not true, and in fact, is a saving grace for Twitter in several other ways.
First and foremost, you must understand *why* Twitter (or any other site) would do this. The answer is to be more universally accessible. Twitter has fought hard to preserve its availability in countries where sharing opinions can be a crime. In China, India, Hong Kong, Germany, and other countries there are literally thought crimes for voicing dissent or posting information contrary to the govt approved position. In the past, countries have prevented access to sites (including Twitter and Facebook) simply as a matter of preventing discussion or even citizen awareness of events like the Wang Lijun situation earlier this year. To prevent this censorship, which is usually effected by a DNS filter, Twitter is available via its IP addresses directly. However, in doing so, it’s important that the site remain navigable, which is only possible via relative URLs *or* if the absolute URL outputted to the client includes the IP address instead of the domain name. Either option is acceptable and fully functional, but the absolute URL method requires more server resources.
The issue here isn’t a matter of the relative URLs used being the problem, but rather that the canonical URL that Twitter is pushing *does* include the absolute URL, including the active domain or IP address instead of ‘twitter.com’. This is bad code, plain and simple. The problem you’re complaining about isn’t being caused by relative URLs, but invalid canonical links.
Even so, I’ve always believed it’s bad form to use relative URLs with dot-syntax, and encourage developers to use root-relative URLs or absolute URLs whenever possible. As you aptly described above, way too much can go wrong with the dot-syntax to rely on it for either proper linking or bots, which typically can’t be trusted to parse the URLs correctly.
I am glad to be using WordPress. When I started reading this, I thought: Oh God! but then everything is ok, I am not using test environment right now, so, this is really ok. But it did clear a lot of doubts I had concerning this, thanx
Great Article Joost..Thanks for sharing..:)
Lastly, never heard of Screaming Frog ,Looking forward to use this tool.
The less you let the spider guess, the better and quicker the result you’re going to get. Seems logical.
I always use relative links to the root to avoid spider traps. I see that you mention this is also problematic, so to clarify, is an automated system also prone to user error (one that would add the server name in each relative link)? so are you advocating using some automated script that will replace a test domain to the main domain while publishing?
This is a server configuration issue, not a development issue. Relative URLs, including protocol-relative, are just fine. Better, in fact, due to the dev/staging server situation mentioned in this very article (but note that common practice is to use “/contact”, not “../contact”, so all URLs are relative to the site root — plain static HTML pages excepted).
The key is to be sure your dev/staging environments are not public. And that’s for all sorts of reasons, search engines being the least of them. If your test site is open to the public, you have bigger problems than SEO.
WordPress does handle this well though if you configure WP_SITEURL and WP_HOME in your config file instead of relying on the info in the database. It’s easier to move WP between these different setups than many other CMSs. But that only works if you rely on WP to generate all your internal links. You want to avoid this when you’re building a custom WP site, and you also need to avoid hard-coding your public domain name everywhere, so relative URLs are the way to go.
In the part about why relative URLs are still used, I think you missed the real argument on why they are “faster”. It doesn’t have to do with them being shorter, it is about DNS lookups on the client browser. Permalinks/absolute URLs in the past required a DNS lookup for each one, slowing page loading by a little bit (depending on the speed of the end user’s DNS). Relative URLs didn’t (in the past) cause this.
I believe most browsers cache DNS results now, so it’s a moot point. And even if they don’t, the small difference does not make up for the fact that relative URLs cause all sorts of issues.
Resourceful post, thanks Joost for sharing, from now we’ll take care while creating new pages.
Agree with you, we’ve faced 404 error due to this already.
For a canonical or base href, I’d always recommend a full URL. But if those are correct I actually think it’s quite difficult for even a semi-competent developer to screw up standard internal relative links, even on a large site.
The issue isn’t relative links themselves; it’s people using them incorrectly. So I don’t agree usage should be stopped and I can’t see support for relative links being dropped by engines etc any time soon.
I see plenty of sites with bad header redirects or poorly implemented URL rewriting; that doesn’t mean header redirects and URL rewriting should be forbidden!
Just my opinion though; if I’d seen as many sites as you probably have with this issue, I might change my mind :)
Thank you for posting this! I have argued with developers for years about the issues with taking a website live with relative links. Now I have some authority to back me up! What is the suggestion for PHP includes though? We use them for headers, footers and a number of other functions; is there a proper way of making sure the includes don’t break when calling them in PHP?
Do not take this as the whole story. A lot of professional software engineers will disagree with this very strongly, and for very good system and design reasons.
Relative is on the whole a more ‘ideal’ design than using absolute.
Any half decent spider will not be confused by the use of relative paths.
Yoast…good post. This is a battle I have been fighting with developers for years. Yes, WP has solved this problem and because of that, I think when most development groups work on a WP site, it’s fine.
The truth is, not every site is developed in WP and there is where the battles happen.
Just to make clear something: are you talking just about relative URLs in HTML or in CSS too? (not clear because of “Protocol-relative URLs” paragraph)
“Whenever WordPress outputs a URL, it’s always a full, absolute URL.”
Yeah, that would all be fine if WordPress did really use a “configuration file”. Instead it stores the current URL in the database, so when you later change the configuration there are still old URLs being emitted.
You can hardcode it in wp-config.php using a define; check the codex :-)
No, he’s right. Check your post content in the database. All urls are absolute, referencing the host in which they were created. If you authored on staging, the absolute URL will be staging, regardless of what the constant is for the current environment. Why WP doesn’t calculate the absolute urls at run time is beyond me.
If you ever migrate content you’ll either need to find/replace on a sqldump file, or run replace queries. So if you move content from a dev to production environment with any frequency, WP actually makes you more susceptible to indexing a dev environment than were you to use relative URLs.
Elliot is right about the DB from WordPress. We had an outside development firm work on our site, install it, and then when I got free enough to take a look, there were about 100 references to their dev environment in the database even though the URL was updated in the config.
Not hard to fix, but lots of folks leave that step out when transferring. WordPress recommends http://interconnectit.com/124/search-and-replace-for-wordpress-databases/ when transferring sites to update domains in the DB. Simple, and works really well.
Here’s the link from the WP Codex that recommends the above tool.
I too prefer libraries that extrapolate most all URL handling into a config file…makes life simple.
Another helpful and informative tip. I never knew the difference between absolute and relative before, but the good thing is I am already practicing the so-called absolute URLs.
Thanks again Yoast !
Thanks for the tip Yoast…I’m not a Developer but I believed that relative URL will put less load on Server… Thanks for clarification and it will make a huge difference here..
@Roy – yeah – we never make mistakes when launching a new website.
Ok, can you then remove the robots.txt again, please?
I’m not really sure this is the case for all the options in the post.
The issues you talk about in “A completely indexed test environment” and “Spider traps” are issues that can happen with relative or absolute URLs. The issue is connected to an error, and an error can be made in both cases.
If you can miss a dot in “./contact” you can also miss a dot in “http://wwwexample.com/”, can’t you?
I agree regarding the canonical URL. If we identify the perfect page where the canonical should point, we need to use the complete path, and not a relative URL, BUT I don’t agree on the solution you propose for it.
Making a 301 redirect for every canonical URL is going to be a problem in all the cases where your content is pretty much the same BUT something is different: take as an example the famous e-commerce website with the same shoes and the same content, but in different colours. If I apply a 301 redirect to my “perfect page”, a visitor who tries to see the blue-shoes page (same content and linked with a canonical to the perfect page) will be redirected to the canonical…
Am I completely wrong or I got some points?
useful stuff…will practice using absolute urls instead of the relative urls :)
Indeed it does. And when they promise that the dev environment never gets past the firewall and proxy, how come ww2. and dev. are indexed?
The hurting comes full circle when developers want to implement the canonical URL in a relative fashion.
Yesterday I discovered KLM used this as well.
Klm.com vs klmpinata.com :-)
It’s just a matter of time now before their search results get screwed up.
Very good post to clear it up once and for all. With the simple examples everybody can understand it, from novice to the content scanning pros :)
Nice article Joost. I’m wondering, if you’ve got a link within a post or page of WordPress, how would one use an absolute link if they’re going to be moving the site back and forth from a staging site? This is the main problem we’ve had with absolute URLs. WordPress does a really good job of managing the template URLs.
The way I do it is have a base-url constant somewhere (in WordPress I usually place it in the config file) and use that in all my links. You can either change that constant manually on each server or do it automagically (via $_SERVER e.g.)
I’m a web developer and I know what you mean. Thanks for sharing useful information. I will use your tips in my work on new projects.
Awesome WordPress Plugins, by the way… Thanks Yoast :)
I can’t remember how many times I have recommended absolute URLs and some developer has come back saying “it doesn’t make a difference”. My response: yes it freaking does, even if you have a base tag.
Everything else aside, you are introducing another layer of processing for spiders when you use relative URLs. The spider has to figure out the absolute URL. If you give it absolute URLs in the first place then it doesn’t have to do another task.
404 issues, migration issues, canonicalization issues stem from relative URLs.
Thanks for the brilliant post!