Duplicate Content
Duplicate Content, 2:00 - 3:30 EST
Shari Thurow, Grantastic Designs
Example referencing PRweb.com and cancer.org (National Cancer Institute): is a press release considered duplicate content? No, because the linkage properties are so different on the two sites.
What counts as duplicate:
- Content properties
- Linkage properties
- Host name resolution
- Domain name: bmw.com
- IP address: 192.109.63.43
- Host name: origin.bmw.com
Some duplicate content is considered spam:
- WebMD made a lot of different sites (for example, depression-101.com). Host name resolution was the same, and they were downgraded to Supplemental Results (look for Supplemental Results)
- If you know that your CMS is delivering duplicate content, use the robots exclusion protocol and 301 redirects
- Use web analytics software
- Don’t exploit search engines
Jake Baillie, President, TrueLocal
Jake, as always, is a wildly funny presenter. His presentation concept was Dr. Phil, since he is a straight shooter, but he did disclaim that Dr. Phil in no way endorsed the presentation.
Top 6 Duplicate Content Mistakes
- Circular Navigation
- Print Friendly Pages
- Inconsistent Linking
- Uh-oh, the screen went by too fast
Cause: Multiple paths through a website
brand-category-product - first path
category-brand-product - second path
brand-color-product - yet another path
What’s the fix?
Define a consistent method of addressing a page of content, irrespective of navigation path. Track paths through cookies.
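A minimal sketch of that fix, under my own assumptions: the last path segment is taken to be the product slug, and the `/product/` prefix and `shop.example` host are hypothetical, not anything shown in the session. The idea is only that every navigation path collapses to one address per product.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_product_url(url: str) -> str:
    """Collapse any navigation path to a single address per product.

    Assumption: the last non-empty path segment identifies the product.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        # Nothing to canonicalize (for example, the homepage).
        return url
    product = segments[-1]
    # Drop query and fragment so the path taken leaves no trace in the URL.
    return urlunsplit((parts.scheme, parts.netloc, f"/product/{product}", "", ""))


# Three navigation paths to the same (hypothetical) product.
paths = [
    "http://shop.example/acme/chairs/oak-chair",   # brand-category-product
    "http://shop.example/chairs/acme/oak-chair",   # category-brand-product
    "http://shop.example/acme/brown/oak-chair",    # brand-color-product
]
canonical = {canonical_product_url(p) for p in paths}
```

All three paths end up at one URL; the path the visitor actually took would live in a cookie rather than in the address.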
Print friendly pages aren’t so friendly
Fixes - block search engines from print friendly pages
Use CSS/JS to restyle a page on the fly to make it print friendly
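One way to try out the first fix is to check a robots.txt rule with Python's standard `urllib.robotparser` before deploying it. This is a sketch only: the `/print/` directory and the `www.example.com` URLs are hypothetical stand-ins for wherever a site keeps its print friendly pages.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps all crawlers out of /print/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The regular page stays crawlable; the print friendly copy does not.
can_crawl_article = parser.can_fetch("*", "http://www.example.com/articles/widgets.html")
can_crawl_print = parser.can_fetch("*", "http://www.example.com/print/widgets.html")
```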
Your link isn’t working for you anymore
Cause:
Calling directory index pages (including the homepage) in an inconsistent manner
/directory
/directory/
/directory/link.asp
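A rough sketch of how the three forms could be folded into one before links are written out. The set of index file names is my assumption (each server configures its own), and so is the choice of the trailing-slash form as the one to keep.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed index file names; adjust to whatever the server actually serves.
INDEX_NAMES = {"index.html", "index.asp", "default.asp"}


def normalize_directory_url(url: str, index_names=INDEX_NAMES) -> str:
    """Rewrite /directory and /directory/<index file> to /directory/."""
    parts = urlsplit(url)
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last in index_names:
        # Explicit index page: keep only the directory.
        path = head + "/"
    elif "." not in last and not path.endswith("/"):
        # Looks like a directory without its trailing slash.
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


variants = [
    "http://www.example.com/directory",
    "http://www.example.com/directory/",
    "http://www.example.com/directory/index.asp",
]
normalized = {normalize_directory_url(v) for v in variants}
```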
Looks Don’t Count:
Product pages with nothing differentiating one product page from another except a lone SKU
Fix: Add content
It’s not always good to be transparent
Cause:
Badly implemented mod_rewrite code. DNS errors with multiple domains
Fixes:
Additional domains should be redirected (301) to the main site, not set up as DNS aliases
Pick a canonical form to access content and stay with it
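To make the redirect rule concrete, here is a small sketch of the decision a server could make on each request. `CANONICAL_HOST` and the example domains are hypothetical; a real setup would send the returned URL in a `Location` header with a 301 status.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Hypothetical canonical host; every other domain should 301 to it.
CANONICAL_HOST = "www.example.com"


def redirect_target(url: str) -> Optional[str]:
    """Return the URL to 301-redirect to, or None if already canonical."""
    parts = urlsplit(url)
    if parts.netloc.lower() == CANONICAL_HOST:
        return None
    # Keep path and query so each page maps to its exact counterpart.
    return urlunsplit((parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, ""))


bare_domain = redirect_target("http://example.com/a?b=1")
alias_domain = redirect_target("http://example-alias.net")
already_canonical = redirect_target("http://www.example.com/a")
```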
If the Suit Doesn’t Fit, Don’t Wear It:
Cause: Poorly written (or implemented) cloaking scripts serve the same doorway page over and over again.
Fix: Don’t use cloaking scripts you didn’t write. Make sure your cloaking script is returning separate content for each URL being cloaked.
Golden Rule by Jake or Dr. Phil:
The same content should never be accessible from different URLs, ever!
Rajat Mukherjee - Yahoo on Duplicate Content
Matt Cutts - Google on Dup Content
- Some site owners are not really guilty of serious dup content issues. Site owners who have 2500 different top level domains, now that’s an opportunity for a duplicate content penalty.
- Sites that serve different regions (google.com and google.fr) are not guilty of dup content.
- What about www.version.com versus http://version.com? Absolute links are better, in part to guard against people scraping sites.
- RSS content or content of your site being stolen: add a copyright statement to the bottom of the blog.
- The Big Daddy infrastructure will let site owners indicate whether they want www.version.com or http://version.com.
- Another tool to be aware of is the Google Sitemap tool. Take robots.txt files out for a test drive on the new Google tool. It shows you how Googlebot would respond to your site and its particular pages.
Look up Shingling Techniques
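For reference, one common formulation of shingling compares two documents by their overlapping word sequences. The sketch below uses three-word shingles and Jaccard similarity; the shingle size and the sample sentences are my own choices, and nothing here is claimed to be what any search engine actually runs.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (overlapping word sequences)."""
    words = text.lower().split()
    if len(words) < k:
        # Too short for a full shingle: treat the whole text as one.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Share of shingles the two sets have in common (0.0 to 1.0)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


page_a = "the quick brown fox jumps"
page_b = "the quick brown fox leaps"
similarity = jaccard(shingles(page_a), shingles(page_b))  # 2 shared of 4 total
```

A score near 1.0 suggests near-duplicate pages; where to draw the line is a judgment call.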
Copying, Duplicating, Scraping - Define
RSS Scraping - once you syndicate content, someone can easily take advantage of your feed. Scraping is time-sensitive: a scraper could create syndicated copies of an RSS data feed in a matter of minutes.
Two ecommerce sites have the same SKU? An attendee asked if that is a "tip-off". A small amount of shared information, like a matching SKU, is not enough to trigger duplicate content. Yahoo now has Site Explorer for webmasters to provide dynamic content, similar to Google Sitemap.
I begin to wonder if this is an issue for one of our ecommerce sites, who has two home decor sites, one of which offers one type of home decor, and the other site offers that home decor plus much more. The second site is not doing well in the Google organic listings. It’s doing great in Froogle but not in the Google organic listings. I wonder if we compared each page on the two sites, and saw which ones are actually ranking okay, if it has anything to do with duplicate content issues.
Example: Craigslist - make sure there is real content that is unique.
Question about hidden links: if content is useful for users, it should be visible. Using JavaScript to redirect or to hide content is not a good thing.

