Protecting your Website from Content Duplication Woes
Websites, or Internet properties as they are often called, have grown steadily in popularity and significance because they have become the primary source of income for millions of e-commerce entrepreneurs and their employees around the world. Internet investments tend to appreciate in value and, depending on the industry, visitor traffic and annual sales volume, a website can become a highly lucrative financial asset when the time comes to sell it. The secondary market for websites remains active, with hundreds of thousands of websites changing hands each year. If you own an e-commerce website, information portal or online content network, you should consider spending as much time protecting your online asset as you do promoting it.
Website Components that call for Protection
There are essentially four components you need to protect with regard to your website:
- Your domain name
- Your graphics
- Your IP address, if you have a dedicated one
- Your text content
Protecting your domain is easy. Register it as a private domain by paying your domain registrar an additional annual fee, which keeps your contact details and other registration information out of public view. As long as you do not share your IP address with other websites (and features in this newsletter have strongly recommended that you not share it), your IP address is safe. If you use a search engine optimization company or have taken your SEO in house, ensure that whoever handles your SEO does not engage in black hat practices. This further protects your IP address and keeps your website in the good books of the search engines.
Protecting your website content, which includes both written content and graphics, is a different story. According to a recently published estimate, there are about one trillion web pages on the Internet today, and only 250 billion of them (about 25%) have been indexed by Google and the other search engines. In other words, 75% of the Internet remains unindexed, and a copy of your content sitting on one of those pages will not surface in a search. Cyber miscreants therefore have plenty of opportunities to copy your content and use it for their own gain. There are a few things you can do to protect your investment, which we have briefly outlined here.
Validating your Content and its Uniqueness
Several tools can help you check that your content is unique to your website. They come in several versions and have different features. Copyscape, for instance, is available in a free web-based version at www.copyscape.com and can verify the uniqueness of online content as long as it has already been published on a website. Once a URL is submitted, the program compares it against millions of web pages and displays any pages that resemble the submitted page. The paid version of Copyscape, on the other hand, can process text even if it has not yet been published on the web. Dupe Free Pro, which is currently available as a free download from www.dupefreepro.com, is a dedicated desktop application. Submit a page of content to Dupe Free Pro and the software will look for duplicate content on the Internet. We recommend that you consider using both programs to improve your chances of getting reliable results.
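To give a feel for what "resembling" a page can mean, here is a minimal sketch in Python that compares two pieces of text by the overlapping three-word sequences (shingles) they share. It is an illustration under our own assumptions, not how Copyscape or Dupe Free Pro actually work; the function names and the shingle size of 3 are hypothetical choices.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences (shingles) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


# Example: compare a paragraph from your page with a suspected copy.
original = "Our handmade leather wallets are stitched by hand in small batches."
suspect = "Our handmade leather wallets are stitched by hand and shipped worldwide."
print(round(similarity(original, suspect), 2))
```

A score near 1.0 suggests the two texts are almost identical, while a score near 0.0 suggests they share little wording. Where to draw the line is a judgment call, and a dedicated tool is still needed to find candidate pages in the first place.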
Checking for Duplicate Graphics
Like textual content, graphical content in the form of images, pictures, logos, designs and sketches is copied freely on the Internet. Once again, there is no foolproof method of locating the pirates. However, the image search features of the major search engines are beginning to help. Click the “similar image” link when you search for your image on the Internet and the major search engines will show you web pages that display similar images, which may include pages where your graphical content has been copied without your permission. To use this mode of detection, the images on your website must have been indexed by the search engines. We recommend that you consider adding alt text to all your images so that the search engines can index them easily. This is a worthwhile SEO strategy as well.
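The alt text recommendation above is easy to check automatically. The following Python sketch uses the standard library's html.parser module to list the images on a page that have no alt text. The class and function names are our own, and treating an empty alt value as missing is an assumption; an empty alt can be legitimate for purely decorative images.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> tag that has no usable alt text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # Assumption: an absent, empty or whitespace-only alt counts as missing.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src") or "(no src)")


def images_missing_alt(html):
    """Return the src values of images in the HTML that lack alt text."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing


# Example with a small, made-up page fragment.
page = '<img src="logo.png" alt="Company logo"><img src="banner.jpg">'
print(images_missing_alt(page))
```

Run it against the saved HTML of each page on your website and add alt text to whatever it reports.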
Reverse Image Search
TinEye, a website set up exclusively to detect duplicate image content, was launched recently. Located at www.TinEye.com, this reverse image search engine can usually find your images on websites not authorized to display them. The service is relatively new, and as its database grows the results should become even more reliable.
Combating Duplicate Content on the Internet the Sure Shot Way
If you happen to locate duplicate content on the Internet that has originated from your website:
- Write to the webmaster of the offending website and ask him or her to remove the content immediately.
- If you are unsuccessful, write to the hosting company and the domain registrar, whose details you can find by conducting a WHOIS lookup. To obtain this information, visit www.whois.org and type in the domain name of the offending website.
In most cases, your content will be removed immediately and you will be able to enjoy the benefits of your labor without experiencing any infringement.
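If you prefer to look up the registrar from your own computer, the lookup described above can be sketched in Python. WHOIS is a plain-text protocol over TCP port 43 (RFC 3912). The server name whois.iana.org, the function names and the field prefixes searched for are assumptions made for illustration; WHOIS servers format their responses differently, and you may need to repeat the query against the server named in a "refer" line to reach the registrar's details.

```python
import socket


def whois_query(domain, server="whois.iana.org", timeout=10):
    """Send one WHOIS query and return the raw text response."""
    with socket.create_connection((server, 43), timeout=timeout) as conn:
        # RFC 3912: the query is a single line terminated by CRLF.
        conn.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def find_fields(whois_text, wanted=("registrar", "refer")):
    """Collect 'key: value' lines whose key starts with a wanted prefix.

    The prefixes are assumptions; adjust them to the response you receive.
    """
    found = []
    for line in whois_text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if sep and value and key.startswith(wanted):
            found.append((key, value))
    return found


# Example (requires network access, so it is left commented out):
# print(find_fields(whois_query("example.com")))
```

The registrar and any abuse contact listed in the response are the parties to write to when the webmaster does not respond.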