What is canonicalization, that is the question. But in what context?
Dictionary meanings vary, but include this broad definition which begins to clear the waters:
“Canonicalization is a process for converting data that has more than one possible representation into a ‘standard’, ‘normal’, or canonical form.”
If a mathematician asks what canonicalization is, they may be looking for “lexicographical order”. But we are in the SEO game, and when we talk about canonical URLs we are talking about a way to identify the one true URL of a page.
I’m surprised at a few of the answers given when people ask SEO experts “what is canonicalization”, and it seems there is a bit of misunderstanding in the industry. But the concept is very simple, and so are the reasons behind the introduction of the canonical tag.
Websites respond to requests for thedomain.com as well as www.thedomain.com, even though, strictly speaking, the “www” is a sub-domain of thedomain.com. This leaves us in a position where two URLs on our website serve completely identical pages.
While most search engines are very good at identifying this common problem, there are other problems that give search engines a lot more trouble.
For example, you can add any number of variables to the end of a URL (for example thedomain.com?random=variable) and each variation can be treated as a unique page. In truth, on most sites you can add any variable you want and the same page will still load.
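To make the duplicate problem concrete, here is a minimal Python sketch of how these variants could be collapsed into one form. The function name normalize_url and the choice of the www host as the preferred one are assumptions made for illustration. It also assumes that no query variable ever changes the page, which will not hold for every site.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for this sketch: the site prefers the www form of its host,
# and query-string variables never change what the page shows.
BARE_HOST = "thedomain.com"
PREFERRED_HOST = "www.thedomain.com"

def normalize_url(url: str) -> str:
    """Collapse host and query-string variants of a URL into one form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Treat the bare domain and the www sub-domain as the same site.
    if host == BARE_HOST:
        host = PREFERRED_HOST
    # An empty path and "/" both mean the home page.
    path = parts.path or "/"
    # Drop the query string and the fragment entirely.
    return urlunsplit((parts.scheme, host, path, "", ""))

variants = [
    "http://thedomain.com?random=variable",
    "http://www.thedomain.com/",
    "http://WWW.thedomain.com/?another=thing",
]
# All three variants reduce to a single URL.
unique = {normalize_url(u) for u in variants}
```

Run over a crawl of your own site, a check like this gives a rough count of how many distinct addresses point at the same page.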
In some cases you may have an article in multiple categories, but it only has one true home. For example, thedomain.com/category1/thearticle might be the same article as thedomain.com/category2/thearticle.
SEO best practice is to have only a single URL for each page on your website, and to achieve this we use the canonical tag (a link element with rel="canonical", placed in the head of the page).
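As a rough illustration of the multiple-category case, the Python sketch below builds the tag for whichever URL was requested. The CANONICAL_HOME lookup and the canonical_link_tag function are hypothetical names, and picking category1 as the article's true home is an assumption. A real site would pull this mapping from its CMS or database.

```python
from html import escape

# Hypothetical lookup: each article slug mapped to its one true home.
CANONICAL_HOME = {
    "thearticle": "http://www.thedomain.com/category1/thearticle",
}

def canonical_link_tag(requested_path: str) -> str:
    """Return the canonical tag for the article found at requested_path."""
    # The slug is the last path segment, whichever category it was reached through.
    slug = requested_path.rstrip("/").rsplit("/", 1)[-1]
    url = CANONICAL_HOME[slug]
    # Escape the URL so it is safe inside an HTML attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'

tag_from_category1 = canonical_link_tag("/category1/thearticle")
tag_from_category2 = canonical_link_tag("/category2/thearticle")
```

Both category URLs end up with the same tag in their head, so a search engine that lands on either one is pointed at the same home.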
So to review, what is canonicalization?
A way to identify the one true URL for a page which can be accessed at multiple URLs.
And why do we care?
Because this is SEO, and any tiny little advantage we can get stacks up.
Interestingly the first email I ever sent to my wife had the title “Canonicalization blurg” and included just a single line of text…
<link rel="canonical" href="http://www.thedomain.com/blahblahblah" />
Can you guess what she asked? “What is canonicalization and what is the canonicalization tag” – true story!