The tag is supported by all major search engines. They do say: "It's a hint that we honor strongly" (Google). They call it a hint because they check that the content is the same, or roughly the same (minor differences only); otherwise the tag could be abused.
@bibosk8 what are the issues you have with this? The same content appearing in many pages, and the canonical tag pointing to the source page, is pretty much the standard way of handling content these days, and it works very well in my experience.
Yes, I believe that is what would happen. Whether the permalink (i.e. the canonical page URL) for page 1 using the page number suffix (.../p1) is the desired page for indexing or not, I guess is another issue. Should we index .../p1, .../p2, .../p3 etc. or just lose "p1" and retain p# where # >1? I don't know.
The p# is the page number, so they do not contain the same content. p1 may have comments 1 to 20, and p2 would have comments 21 to 40, and so on. To get a whole discussion into a search engine, all the pages would need to be indexed.
Comments
<link rel="canonical"
href="http://vanillaforums.org/discussion/13863/vanilla-2.0.16-released/p1" />
That tells the search engines that the first page is just another view of the second page you have listed. The search engines know they are the same page, and so do not treat them as duplicated content. Only the second page will be indexed.
But I don't think this method is good.
Where can I change this permalink to the original?
Thanks!
I use it, for example, on jobs sites, where the same job will appear on many pages, under many categories and index pages, but ultimately all those pages point to one source page or "master page" where the job can be read and indexed. Google honours the canonical tag as expected, and only those main pages appear in Google's index. Putting the canonical links into the sitemap file (e.g. http://www.iema.net/jobs/sitemap.xml) also helps direct the search engines.
The use of rel="nofollow", I think, is not appropriate in this case. The nofollow attribute is really there to say, "hey search engines, whatever this links to has no association with this site".
I think I may be confused. Please correct me if I'm wrong. Are you saying that links in Vanilla using the standard default theme will have P1 indexed by search engines? I would prefer that links without P# at the end be indexed.
One thing is for certain: when a discussion is more than one page long, it will have multiple pages (all different), and it does make sense for the first page to have a slightly simpler URL for neatness.
Also, if there is no other link to the "p1" page, then the search engines may not actually be able to find it. Just because it is defined in the canonical tag, it does not mean the search engines will go there. I'd personally be inclined to drop the page number for page 1 every time, no matter where it is referenced.
this original
http://www.example.com/this-is-the-one.html
and not
http://www.example.com/this-is-the-one.html/p1
http://www.example.com/this-is-the-one.html/p2/#another-referece-123
Hopefully someone can clarify for the default theme included with 2.0.16
p1 and p2 may be identical on this forum, because the page length seems to be set to some enormous value. This discussion, for example, has 70 comments, and it is still all on one page.
I agree that "p1" is a little redundant, and so could always be left out (i.e. assumed) but the remaining pages would need to be available for indexing under their own unique URLs.
The targets (or fragments, i.e. #whatever) are probably irrelevant in this case, since targets are never sent to the server in a page request and search engines ignore them completely (at least, they do for now).
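As a rough illustration of the two points above (assume "p1", ignore the fragment), here is a minimal Python sketch. It is not Vanilla's code; the function name `normalize_page_url` is made up for this example, and it assumes page suffixes always look like `/p<number>` at the end of the path.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def normalize_page_url(url: str) -> str:
    """Drop the fragment and a redundant trailing /p1 from a paginated URL.

    Hypothetical helper: assumes pages are addressed as .../p1, .../p2, etc.
    """
    parts = urlsplit(url)
    # Remove only a trailing "/p1" (with or without a final slash);
    # /p2, /p3, /p10 and so on are left alone so they keep unique URLs.
    path = re.sub(r"/p1/?$", "", parts.path)
    # Fragments are never sent to the server, so leave them out entirely.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))
```

With this, `http://www.example.com/this-is-the-one.html/p1` comes back without the `/p1`, while a page 2 URL keeps its page number and only loses the `#...` part.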
Just out of interest, the canonical for this page is:
http://vanillaforums.org/discussion/14064/dupe-content-for-the-search-engines-with-the-permalink/p1
and that applies even if you are viewing a specific comment, for example, this comment:
http://vanillaforums.org/discussion/comment/133428#Comment_133428
http://vanillaforums.org/discussions
which is great. But click on page 1 of the pager at the bottom of the screen, and you get this:
http://vanillaforums.org/discussions/p1
which has a redundant page number on the end. This is a danger, in my experience, with piecing together URLs all over the place. Different bits of code and plugins apply slightly different rules when constructing URLs, and they tend to fall out of step. It means that no matter how many places you clean up the "p1" suffixes, something somewhere will be overlooked and will create them, so they are always there to be contended with by the search engines.
An approach I've used successfully on other CMSs is to pass all the URL data into a single core function, and have the URLs constructed there. Plugins and modules can then offer services to that central function or method to apply rules in constructing the URL path. Any parameters that don't fall into a rule then just get added as named GET parameters.
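To make that idea concrete, here is a small Python sketch of a central URL builder. It is only an illustration of the approach, not how Vanilla or any particular CMS does it; `register_rule`, `build_url` and `page_rule` are hypothetical names, and the "drop p1" rule is just one example of a rule a plugin might register.

```python
from typing import Callable, Dict, List, Tuple
from urllib.parse import urlencode

# A rule receives the parameters not yet consumed and returns
# (path segments to append, parameters it did not consume).
Rule = Callable[[Dict[str, object]], Tuple[List[str], Dict[str, object]]]

_rules: List[Rule] = []


def register_rule(rule: Rule) -> None:
    """Let a plugin or module offer a path-building rule to the core."""
    _rules.append(rule)


def build_url(base: str, params: Dict[str, object]) -> str:
    """The single place where URLs are constructed."""
    remaining = dict(params)
    segments: List[str] = []
    for rule in _rules:
        new_segments, remaining = rule(remaining)
        segments.extend(new_segments)
    url = "/".join([base.rstrip("/")] + segments)
    # Anything no rule claimed becomes a named GET parameter.
    if remaining:
        url += "?" + urlencode(sorted(remaining.items()))
    return url


def page_rule(params: Dict[str, object]) -> Tuple[List[str], Dict[str, object]]:
    """Example rule: page 1 gets no suffix, later pages get /p<number>."""
    rest = dict(params)
    page = int(rest.pop("page", 1))
    return ([] if page <= 1 else [f"p{page}"]), rest


register_rule(page_rule)
```

Because every caller goes through `build_url`, the pager, the canonical tag and any plugin would all produce the same URL for page 1, rather than each applying its own slightly different rule.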
I have a problem with canonical URLs.
I put the URL http://mydomain.com/categories/features in the browser and check the canonical tag, and it shows http://mydomain.com/categories, without the category name on the end. Any reason why? It happens for all categories. I'm using the latest version of Vanilla.
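For anyone wanting to repeat this kind of check without reading the page source by hand, here is a minimal Python sketch that pulls the canonical URL out of a page's HTML using only the standard library. The names `CanonicalFinder` and `find_canonical` are made up for this example, and it assumes you already have the HTML as a string (fetching it is left out).

```python
from html.parser import HTMLParser
from typing import Optional


class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<link ... />) are routed here as well.
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html: str) -> Optional[str]:
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

Running `find_canonical` on the source of each category page and comparing the result with the URL you requested would show quickly whether every category reports the same canonical.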