Duplicate Content

Duplicate Content, 2:00 - 3:30 EST

Shari Thurow, Grantastic Designs

Example of duplicate content, using PRweb.com and cancer.org (National Cancer Institute): is a press release considered duplicate content?  No, because the linkage properties of the two sites are so different.

What is duplicate:

  • Also look at other domain/server properties
  • Shingle comparison (Andrei Broder; read anything by him regarding shingles). I had never heard of him or of shingles before; shingles are essentially word sets
  • Some duplicate content is considered spam:

    Jake Baillie, President, TrueLocal

    Jake, as always, is a wildly funny presenter.  His presentation concept was Dr. Phil, since he is a straight shooter, though he did disclaim that Dr. Phil in no way endorsed the presentation.

    Top 6 Duplicate Content Mistakes

    Cause: Multiple paths through a website

    brand-category-product - first path

    category-brand-product - second path

    brand-color-product - yet another path

    What’s the fix?

    Define a consistent method of addressing a page of content irrespective of navigation path.  Track paths through cookies.
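The fix above can be sketched in a few lines of Python. This is only an illustration, not something shown in the session: the `/product/` prefix, the function name, and the assumption that the last path segment identifies the product are all hypothetical.

```python
from typing import Optional


def canonical_product_url(path: str) -> Optional[str]:
    """Map any navigation path to one canonical product URL.

    Assumption (hypothetical): the last non-empty path segment is the
    product identifier, whatever brand/category/color came before it.
    """
    segments = [s for s in path.split("/") if s]
    if not segments:
        return None  # no product in the path
    product = segments[-1].lower()
    # The navigation path itself is discarded here; per the notes it could
    # be tracked in a cookie rather than encoded in the URL.
    return f"/product/{product}"
```

With this approach, brand-category-product, category-brand-product, and brand-color-product links all resolve to the same address, so only one URL per product is exposed to a crawler.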

    Print friendly pages aren’t so friendly

    Fixes - block search engines from print friendly pages

    Use CSS/JS to restyle a page on the fly to make it print friendly

    Your link isn’t working for you anymore

    Cause:

    Calling directory index pages (including the homepage) in an inconsistent manner

    /directory

    /directory/

    /directory/link.asp
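One way to keep directory links consistent is to run every internal link through a single normalizing function before it is written into a page. The sketch below is an assumption-laden example: the set of index file names is hypothetical and would need to match whatever the server actually treats as its directory index.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of file names the server serves as a directory index.
INDEX_FILES = {"index.html", "index.asp", "default.asp"}


def normalize_directory_url(url: str) -> str:
    """Return one consistent form for a directory URL (trailing slash)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last.lower() in INDEX_FILES:
        # /directory/index.asp -> /directory/
        path = head + "/"
    elif "." not in last and not path.endswith("/"):
        # /directory -> /directory/ (segments with a dot are treated as files)
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))
```

The choice of the trailing-slash form is arbitrary; what matters is that every link on the site uses the same one.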

    Looks Don’t Count:

    Product pages with nothing differentiating them except a lone SKU

    Fix:  Add content

    It’s not always good to be transparent

    Cause:

    Badly implemented mod_rewrite code. DNS errors with multiple domains

    Fixes:

    Domains should be redirected (301) to the main site, not DNS aliased

    Pick a canonical form to access content and stay with it
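As a rough illustration of the 301 fix, the function below decides whether a request should be permanently redirected to the main site. The host name `www.example.com` and the function name are placeholders, and in practice this logic would usually live in the web server configuration rather than in application code.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "www.example.com"  # hypothetical main site


def redirect_for(url: str) -> Optional[Tuple[int, str]]:
    """Return (301, target) if the URL is on a non-canonical host, else None."""
    parts = urlsplit(url)
    if parts.netloc.lower() == CANONICAL_HOST:
        return None  # already canonical, serve the page normally
    target = urlunsplit(
        (parts.scheme, CANONICAL_HOST, parts.path or "/", parts.query, "")
    )
    return (301, target)
```

Unlike a DNS alias, which serves the same pages under every domain, the redirect leaves a single host that actually returns content.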

    If the Suit Doesn’t Fit, Don’t Wear It:

    Cause: Poorly written (or implemented) cloaking scripts serve the same doorway page over and over again.

    Fix: Don’t use cloaking scripts you didn’t write.  Make sure your cloaking script is returning separate content for each URL being cloaked.

    Golden Rule by Jake or Dr. Phil:

    The same content should never be accessible from different URLs, ever!

    Rajat Mukherjee - Yahoo on Duplicate Content

    Matt Cutts - Google on Dup Content - Some site owners are not really guilty of serious dup content issues.  Site owners who have 2500 different top level domains, on the other hand, are candidates for a duplicate content penalty.  Sites that serve different regions (google.com and google.fr) are not guilty of dup content.  What about www.version.com versus http://version.com?  Absolute links are better, in part to guard against people scraping sites.  If your RSS content or site content is being stolen, add a copyright statement to the bottom of the blog.  The Big Daddy infrastructure will let site owners choose between www.version.com and http://version.com.  Another tool to be aware of is the Google Sitemap tool.  Take your robots.txt file out for a test drive on the new Google tool: it shows you how Googlebot would respond to your site and its particular pages.
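A robots.txt file can also be given a quick local test drive with Python's standard `urllib.robotparser` module. This is only an approximation under stated assumptions: it shows how Python's parser reads the rules, not how Googlebot itself behaves, and the `/print/` directory and example URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of print friendly pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether the given user agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling `is_allowed(ROBOTS_TXT, "Googlebot", url)` for a handful of important URLs is a cheap sanity check before the file goes live.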

    Look up Shingling Techniques
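For anyone else who had not met shingles before, here is a minimal sketch of the idea: break each document into overlapping runs of words and measure how many runs two documents share. The shingle size of 3 and the function names are my own choices for illustration; production systems typically hash and sample the shingles rather than compare full sets, and nothing here reflects what any search engine actually runs.

```python
from typing import Set, Tuple


def shingles(text: str, size: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of overlapping word runs ("shingles") in the text."""
    words = text.lower().split()
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def resemblance(a: str, b: str, size: int = 3) -> float:
    """Shared shingles divided by all distinct shingles (Jaccard similarity)."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty documents are trivially identical
    return len(sa & sb) / len(sa | sb)
```

A score near 1.0 means the two pages are nearly the same text, while pages that share only a small detail such as a SKU will score close to 0.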

    Copying, Duplicating, Scraping - Define

    RSS Scraping - once you syndicate your content, someone can easily take advantage of your feed.  Scraping an RSS feed is also fast: a scraper could create syndicated content in a matter of minutes.

    An attendee asked whether two ecommerce sites having the same SKU is a "tip-off".  A small amount of shared information, like a matching SKU, is not enough to be treated as duplicate content.  Yahoo now has Site Explorer for webmasters to provide dynamic content, similar to Google Sitemap.

    I begin to wonder if this is an issue for one of our ecommerce clients, who has two home decor sites: one offers a single type of home decor, and the other offers that home decor plus much more.  The second site is doing great in Froogle but not in the Google organic listings.  I wonder whether comparing each page on the two sites, and seeing which ones are actually ranking okay, would show that duplicate content is part of the problem.

    Example: Craig’s List - make sure there is real content that is unique.

    Question about hidden links: if content is useful for users, it should be visible.  If JavaScript is used to redirect or to hide content, that is not a good thing.