Duplicate Content and Dynamic Pages
Darren said: "Google comes along and indexes every URL it can find, even temporary ones. They usually place these in their "Supplemental Results".
Lately they're punishing duplicate pages even more than usual, so if you have a dynamic site, you want to clean up the URLs.
My plan with this forum software is to do the following:
1) Disallow Google from crawling all dynamic pages
2) Rewrite the forums and the threads into pages that look like static HTML
This should distill the essence of this website down to what's actually important: the information in posts. I'll exclude all the normal "gook" associated with message board systems.
I'll keep this thread updated so we can see how this approach works."
AngelaCea said: "A better place to post this would be the VB forum, so other forum operators can discuss it with you, BK.
They will have much more to offer than we will."
edwin said: "Have you checked the logs yet? Does Googlebot like those rules?"
Darksat said: "Quick question, what would be more important to you, mod rewrite URLs or RSS syndication feeds?"
Darren said: "In the last six months I've had really good success with rewriting URLs.
I don't think duplicate content was as big a deal then as it is now. In the last few Google updates they've been more aggressive in detecting duplicate content.
The problem was that the PhpNuke rewrite module I had was a bit off, like most of their "hacks". The duplicate problem comes up because all of these are the same:
/post1222.html -> rewritten URL
/modules.php?forums=p=1222
/modules.php?forums=p&lastread=now
etc. Those are all actually the same page. The first one gets picked up, and the others are duplicates.
If I force Google to skip the [url=http://www.google.com/webmasters/faq.html]dynamic pages[/url], then only the rewritten ones should get indexed.
I think this will take a few months to accomplish."
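A rough way to sanity-check this idea before waiting months on Googlebot is to run the URLs through Python's standard urllib.robotparser. This is only a sketch: the robots.txt below is an assumption (the thread never shows the actual file), and the stdlib parser does plain prefix matching, so it approximates a crawler's decision rather than reproducing Google's.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt: the thread does not show the real file.
# A plain prefix rule on /modules.php covers every query-string variant.
ROBOTS_TXT = """\
User-agent: *
Disallow: /modules.php
"""

# The three example URLs from the post above.
URLS = [
    "/post1222.html",
    "/modules.php?forums=p=1222",
    "/modules.php?forums=p&lastread=now",
]


def build_parser(robots_txt):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def crawlable(parser, urls, agent="Googlebot"):
    """Map each URL to True if the agent may fetch it, False otherwise."""
    return {url: parser.can_fetch(agent, url) for url in urls}


if __name__ == "__main__":
    results = crawlable(build_parser(ROBOTS_TXT), URLS)
    for url, allowed in results.items():
        print(("allow " if allowed else "block ") + url)
```

With that assumed rule, only the rewritten .html address stays open to the crawler, which is the outcome the plan is after.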
Darksat said: "I also noticed you had several rewritten pages with the same content,
e.g.
post1222.html
post1342.html
and
post1172.html
would have the same content for some reason.
PS: those aren't actual URLs, just examples.
It probably had to do with your mod rewrite modifying
/modules.php?forums=p=1222
/modules.php?forums=p=1222&start=0
as 2 different static URLs"
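One way to picture the fix for that is to normalize each dynamic URL before handing out a static name, so variants that only differ by a no-op parameter land in the same bucket. The sketch below is an assumption about how that could look, not how the PhpNuke module actually works: the IGNORED and DEFAULTS tables are hypothetical and would need to match whatever the forum software really does with lastread and start.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical tables: which parameters never change the page content,
# and which values are just the default (so they can be dropped).
IGNORED = {"lastread"}
DEFAULTS = {"start": "0"}


def canonical_key(url):
    """Reduce a dynamic URL to the parameters that actually select content."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    kept = [
        (name, value)
        for name, value in pairs
        if name not in IGNORED and DEFAULTS.get(name) != value
    ]
    return tuple(sorted(kept))


def group_duplicates(urls):
    """Group URLs that reduce to the same canonical key."""
    groups = {}
    for url in urls:
        groups.setdefault(canonical_key(url), []).append(url)
    return groups


if __name__ == "__main__":
    urls = [
        "/modules.php?forums=p=1222",
        "/modules.php?forums=p=1222&start=0",
        "/modules.php?forums=p=1222&start=20",
    ]
    for key, members in group_duplicates(urls).items():
        print(key, members)
```

If the rewrite layer assigned one static URL per key instead of one per raw query string, the first two addresses would share a page name while start=20 would still get its own.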
Darren said: "That's why this should be the easy solution.
GoogleBot comes, checks robots.txt, sees it's not supposed to take dynamic pages, and [b]won't index any page with a ? in it.[/b]
Really, Google moves very slowly. It shows pages for months after they're removed. They do not have a very up-to-date view of the Internet."
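For anyone wanting to see what a "block anything with a ? in it" rule would catch, here is a small sketch. It assumes a rule like Disallow: /*? (the thread never quotes the actual line) and relies on the * and $ wildcard extensions that Google documents for robots.txt. The matcher is a simplified approximation written for illustration, not Googlebot's real logic.

```python
import re


def rule_to_regex(rule):
    """Translate a robots.txt path rule using * and a trailing $ into a regex."""
    anchored_end = rule.endswith("$")
    body = rule[:-1] if anchored_end else rule
    # Each * matches any run of characters; everything else is literal.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored_end else ""))


def is_blocked(rule, url):
    """True if the URL path (with query string) matches the rule from its start."""
    return rule_to_regex(rule).match(url) is not None


if __name__ == "__main__":
    rule = "/*?"  # assumed rule: any URL containing a question mark
    for url in ["/post1222.html", "/modules.php?forums=p=1222"]:
        print(("block " if is_blocked(rule, url) else "allow ") + url)
```

Under that assumed rule the rewritten .html pages pass and every query-string URL is caught, whatever script it points at.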