Duplicate Content: What, When & How


Questions about duplicate content tend to reflect the confusion and rumors that circulate among webmasters. The topic was aptly dealt with at a recent conference in Chicago. Here are a few points that throw light on it:

Definition of duplicate content: Duplicate content refers to chunks of content, within a domain or across domains, that either completely match other content or are very similar to it. In most cases this is unintentional and not malicious in origin. Examples include forums that generate both regular pages and stripped-down targeted pages, and store items that are shown or linked through multiple distinct web addresses. In some cases, however, content is deliberately duplicated across domains to improve search engine rankings or to gain more traffic.

Content that is not considered duplicate: The same article written in two different languages, such as English and German, is not considered duplicate content. Occasional quotes are not flagged as duplicate content either.

Why search engines take duplication seriously: The purpose of a search engine is to return unique and diverse content when a user searches for information. Duplication of content across the web compromises that purpose, and if it were left uncontrolled, search engines would eventually lose their usefulness.

How search engines resolve this: Search engines (SEs) try hard to distinguish and index information that is unique. For example, if your site has articles in both regular and printer versions, and neither version is blocked with a noindex meta tag or robots.txt, the SE chooses only one version to list. In the few cases where an SE perceives that duplicate content is being shown with the intention of manipulating rankings and deceiving its users, it makes the necessary adjustments to the ranking and indexing of the sites involved. SEs nevertheless concentrate more on filtering than on ranking adjustments, so in most cases the worst a webmaster with duplicate content will see is a less preferred version of a page appearing in the index. Only where duplication looks like deliberate manipulation does a site risk dropping in the rankings or being removed from the index.
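The blocking mentioned above can be checked locally before a crawler ever visits. The following is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt rules, the /print/ path and the example.com URLs are hypothetical, chosen only to illustrate a site that keeps its printer versions out of the index. The sketch covers robots.txt only, not the noindex meta tag.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block only the printer-friendly copies.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""

def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Hypothetical URLs for the regular and printer versions of one article.
regular = "http://example.com/articles/duplicate-content"
printer = "http://example.com/print/articles/duplicate-content"

for url in (regular, printer):
    status = "crawlable" if is_crawlable(ROBOTS_TXT, url) else "blocked"
    print(f"{url}: {status}")
```

With rules like these, only the regular version remains open to crawlers, so the choice of which version gets listed is not left to the SE.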

How to proactively deal with duplicate content
Block appropriately: Instead of letting SE algorithms decide which version to select, it's better for webmasters to show the SE which one to select. Just disallow the versions that you don’t want the SE to index.
Make use of 301s: If your site has been restructured, make good use of 301 redirects in your .htaccess file to redirect users, SE bots and other spiders to the right pages.
Internal Linking consistency: Endeavor to keep your internal linking consistent; don’t link to /page/ and /page and /page/index.htm.
Make use of TLDs: Use top-level domains wherever possible to handle country-specific content. This helps SEs serve the right version of your information. An SE is more likely to know that .us indicates US-focused content, for instance, than /US or US.example.com.
Syndicate carefully: If you put your content on other sites, ensure that they link back to your original article. Even then, note that an SE always shows the (unblocked) version that it thinks is most appropriate for users in each given search, which may or may not be the version you’d prefer.
Use the preferred domain feature of webmaster tools: If other sites link to yours using both the www and non-www versions of your URLs, you can let the SE know which way you prefer your site to be indexed.
Minimize boilerplate repetition: For example, instead of including lengthy copyright text at the bottom of every page, include a very brief summary and then link to a page with more details.
Avoid publishing stubs: Users don’t like seeing “empty” pages, so avoid placeholders where possible. This means not publishing (or at least blocking) pages with zero reviews, no product listings, etc., so users (and bots) aren’t subjected to a zillion instances of “Below you’ll find a superb list of all the great Products in [insert productname]…” with no actual listings.
Understand your CMS: Ensure you know how the content is displayed on your Web site, particularly if it includes a blog, a forum, or related system that often shows the same content in multiple formats.
Don’t worry, be happy: Don’t fret too much about sites that scrape (misappropriate and republish) your content. Though annoying, it’s highly unlikely that such sites can negatively impact your site’s presence in Google. If you do spot a case that’s particularly frustrating, you are welcome to file a DMCA request to claim ownership of the content and have the SE deal with the rogue site.
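The tips on consistent internal linking and a preferred domain both come down to picking one form of each URL and using it everywhere. Here is a minimal sketch of that idea in Python, assuming a hypothetical site at example.com that prefers the www host and the trailing-slash form of directory URLs. The names PREFERRED_HOST, HOST_ALIASES, INDEX_FILES and canonical_url are illustrative, not part of any tool.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration: the site prefers the www host and
# directory-style URLs that end in a slash.
PREFERRED_HOST = "www.example.com"
HOST_ALIASES = {"example.com", "www.example.com"}
INDEX_FILES = ("index.htm", "index.html")

def canonical_url(url: str) -> str:
    """Map the variants of an internal URL to one consistent form."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host in HOST_ALIASES:
        host = PREFERRED_HOST

    path = parts.path or "/"
    head, _, tail = path.rpartition("/")
    if tail in INDEX_FILES:
        # /page/index.htm -> /page/
        path = head + "/"
    elif tail and "." not in tail:
        # /page -> /page/ (file names such as style.css are left alone)
        path += "/"

    # The fragment is dropped because it does not identify a separate page.
    return urlunsplit((parts.scheme, host, path, parts.query, ""))

variants = [
    "http://example.com/page",
    "http://example.com/page/",
    "http://www.example.com/page/index.htm",
]
for variant in variants:
    print(variant, "->", canonical_url(variant))
```

A helper like this could be run over a site's internal links to spot the ones that do not already use the chosen form. It does not replace 301 redirects for links that other sites have already made to the old variants.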

To conclude, awareness of duplicate content issues and a little time spent on maintenance will help SEs provide a better user experience with unique and relevant content, and will help your site do better in rankings and listings.



© 2010 Zakpack.com. All Rights Reserved. - Powered by Start a website - Designed by Imran Ali