Duplicate content is content that appears on the internet in more than one location. There are millions of web pages on the internet, and when you search with a search engine such as Google, it tries to find the most relevant results for your query.
Google has an algorithm (a set of rules to be followed) to weed out duplicate content from the pages shown in search results. The part of the algorithm responsible for the quality of content is named “Panda”. Websites with unique writing, little or no duplicated content and high visitor engagement (low bounce rate, very informative, provide answers to the search phrase, etc.) are promoted in Google’s top search results (rankings).
The risk of having duplicate content on your website is that Google can penalize it. Pages with duplicated content (and eventually your entire website) will show up in Google search results less and less, which leads to decreased traffic. Eventually the site can be considered spam and removed from the Google index entirely (a site-wide penalty).
Ways to avoid duplicated content include not posting the same information or article on more than one of your own websites, on social media platforms, or on video sharing sites like YouTube; an even more common case is reusing the same content on sites such as LinkedIn. Never copy and paste content onto a website or page where the majority of it will be duplicated. It is okay to quote some content, but the bulk must be uniquely written. If you know that there is text on your website that has been copied and pasted from another of your websites or anywhere else, it must be rewritten or deleted. Thin content, meaning extremely short blogs or articles with uninteresting information, will affect your ranking as well; it must be rewritten and/or beefed up with engaging content.
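As a rough way to check whether "the majority" of a page is duplicated, you can compare it against the text it may have been copied from. The sketch below is only an illustration and is not how Google measures duplication: it splits both texts into overlapping five-word sequences and reports what share of the candidate's sequences also appear in the source. The function names and the five-word window are assumptions made for this example.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return set()
    if len(words) < size:
        # Text shorter than one window: treat the whole text as one shingle.
        return {tuple(words)}
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def duplicated_share(candidate: str, source: str, size: int = 5) -> float:
    """Fraction of the candidate's shingles that also appear in source."""
    cand = shingles(candidate, size)
    if not cand:
        return 0.0
    return len(cand & shingles(source, size)) / len(cand)


original = "Duplicate content is content that appears on the internet in more than one location."
copied = original  # a straight copy and paste
rewritten = "When the same text shows up at several addresses, search engines have to pick one."

print(duplicated_share(copied, original))     # 1.0
print(duplicated_share(rewritten, original))  # 0.0
```

A share close to 1.0 suggests the page is mostly copied and should be rewritten; a short quotation inside an otherwise original article produces a much lower share.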
A few examples of how duplicate content can arise:
URL Parameters – when analytics are tracked through URLs, the same page may be accessible via several different URLs, and that is when duplicated content can emerge. Having multiple URLs can diminish a link’s popularity, because inbound links are split across the variants. Long URLs with tracking numbers may also decrease the chances of a user selecting that result.
Printer-only Versions – if a website has a regular and a printer-friendly page for each article and neither of these pages has a noindex meta tag, the search engine will not know which one to show, and eventually one of the pages can be filtered out as duplicated content.
Session IDs – when tracking visitors on your site, sometimes a “session” is created so a brief history of what the visitor did between pages can be stored. Each visitor gets a session ID, and when that ID is added to the URL, every page the visitor opens gets its own URL variant. If you have 20 pages on your website and 20 visitors a day, your website now has 400 indexable URLs.
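The URL-parameter and session-ID cases above can be illustrated with a short sketch. It assumes the tracking data lives in query-string parameters; the utm_* names follow the common analytics convention, while "sid" and "sessionid" are hypothetical names chosen for this example. The function strips those parameters so that every variant of a page collapses back to one URL, and the last lines reproduce the 20 pages × 20 visitors arithmetic.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameters that track visitors without changing the page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "sid"}


def normalize_url(url: str) -> str:
    """Drop tracking and session parameters so equivalent URLs compare equal."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    kept.sort()  # parameter order should not create a "new" URL either
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# 20 pages, each seen with 20 different session IDs in the URL.
pages = [f"http://www.example.com/page{n}" for n in range(1, 21)]
crawlable = {f"{page}?sid={visitor}" for page in pages for visitor in range(1, 21)}
unique = {normalize_url(url) for url in crawlable}

print(len(crawlable), len(unique))  # 400 20
```

The same normalized URL is a natural choice for the preferred address you tell search engines about.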
There are a few different ways that content can be duplicated and affect the traffic and analytics of a website, but there are also solutions. Google’s algorithms will most often choose only one URL to show in the results, so if you have a preference, let Google know by using:
301 Redirects – if your website is reachable at more than one address, for example http://example.com and http://www.example.com, visitors perceive them as the same site but search engines treat them as two different websites. A 301 (permanent) redirect sends visitors and search engines from one address to the other. When you do this, the pages are no longer competing with each other but working together to build stronger relevancy and popularity.
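Rel="canonical" – you can insert a canonical link as a quick fix for duplicate content. It acts like a soft redirect for search engines only: visitors stay on the page they requested, while the search engine is told which URL you prefer. It can be slower to take effect than a 301 redirect. It is usually used when you have a duplicate version of a page under a different URL. This tag groups the duplicates together.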
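Both fixes can be sketched in a few lines. This is a simplified illustration, not a server configuration: it assumes www.example.com is the hostname you want indexed and that example.com is an alternate address for the same site. In practice the redirect is normally set up in the web server or hosting control panel; the helper names here are hypothetical.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

PREFERRED_HOST = "www.example.com"  # assumption: the hostname you want indexed


def redirect_for(url: str):
    """Return (301, target) if url is on a non-preferred host, else None."""
    parts = urlsplit(url)
    if parts.netloc.lower() == PREFERRED_HOST:
        return None  # already the preferred address, serve the page normally
    target = urlunsplit((parts.scheme, PREFERRED_HOST, parts.path, parts.query, ""))
    return 301, target


def canonical_tag(preferred_url: str) -> str:
    """Build the <link> element that goes in the <head> of a duplicate page."""
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


print(redirect_for("http://example.com/about"))
# (301, 'http://www.example.com/about')
print(canonical_tag("http://www.example.com/article"))
# <link rel="canonical" href="http://www.example.com/article">
```

The redirect suits cases where the duplicate address should disappear for everyone, such as the non-www hostname. The canonical tag suits cases where visitors still need the duplicate, such as a printer-friendly page.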
Duplicate content happens often, and among the millions of pages on the internet there is bound to be similar information and sentence structure across a few thousand. By using the fixes mentioned above and a tool such as Google Webmaster Tools (now Google Search Console) to spot problems, you can get rid of duplicated content and give your website a better chance of ranking well.