How to slow down content thieves

Posted February 20th, 2010 by Michael Janzen and filed in How To

I really hate it when people steal content, especially when they hotlink to my images and eat up my server’s bandwidth too. This is a real form of theft and ends up costing me money because my biggest blog sometimes costs me overage charges.

I found this great solution at DavidAirey.com (Thanks David!) that replaces stolen images with an advertisement for your site. It’s still an image hosted on your website but the smaller file-size image is not as bandwidth intensive and the value of stealing your content is greatly reduced.

The solution is simple: edit the .htaccess file with a few lines of code and upload a replacement image to your website. Here’s the code you need to add to your .htaccess file:

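RewriteEngine On
# Let through requests that come from your own pages (http or https)
RewriteCond %{HTTP_REFERER} !^https?://(.+\.)?mywebsite\.com/ [NC]
# Let through requests that send no referer at all
RewriteCond %{HTTP_REFERER} !^$
# Any other request for an image gets the replacement instead
RewriteRule .*\.(jpe?g|gif|bmp|png)$ /assets/nohotlink.jpe [L]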

Be sure to replace mywebsite with your domain name, and give your replacement image the .jpe extension: the rule redirects all the common image file types, so the replacement needs an extension the rule does not match. In this example the image is stored in a folder called assets.
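If you want to check the patterns before touching a live .htaccess file, here is a minimal Python sketch that mimics the decision those rules make for each request. It is only a simulation, not how Apache works internally. The function name rewrite_target is made up for this example, mywebsite.com is a placeholder, and the referer pattern is written to accept both http and https.

```python
import re

# Patterns modeled on the .htaccess rules (mywebsite.com is a placeholder).
OWN_SITE = re.compile(r"^https?://(.+\.)?mywebsite\.com/", re.IGNORECASE)
IMAGE = re.compile(r".*\.(jpe?g|gif|bmp|png)$")
REPLACEMENT = "/assets/nohotlink.jpe"


def rewrite_target(path, referer):
    """Return the path that would be served for a request (simulation only)."""
    if not IMAGE.match(path):
        return path          # not one of the blocked image types
    if referer == "":
        return path          # no referer sent, so the request is let through
    if OWN_SITE.match(referer):
        return path          # request came from one of your own pages
    return REPLACEMENT       # image requested from someone else's page


if __name__ == "__main__":
    print(rewrite_target("/photos/house.jpg", "http://www.mywebsite.com/post/"))
    print(rewrite_target("/photos/house.jpg", "http://scraper.example/page"))
    print(rewrite_target("/assets/nohotlink.jpe", "http://scraper.example/page"))
```

The last call shows why the .jpe extension matters: the replacement image does not match the image pattern, so it is served as-is instead of being rewritten again.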

The only problem with this solution is that it also blocks services you may want hotlinking to your images, like FeedBurner, Facebook, and affiliates who are advertising for you. So another way to stop the bad guys is to block only the offenders. Here’s sample code for doing that:

RewriteCond %{HTTP_REFERER} ^https?://(.+\.)?domain-to-block\.com/ [NC]
RewriteRule .*\.(jpe?g|gif|bmp|png)$ /blog-assets/nohotlink.jpe [L]
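Once you have more than one offender, writing these lines by hand gets tedious. Here is a rough Python sketch that builds the rules from a list of domains, joining the conditions with mod_rewrite’s OR flag so that a referer matching any one of them triggers the swap. The function name blocklist_rules, the second domain, and the default replacement path are placeholders for illustration, and the referer pattern accepts both http and https.

```python
def blocklist_rules(domains, replacement="/assets/nohotlink.jpe"):
    """Build .htaccess lines that swap images requested from the given domains."""
    if not domains:
        # With no conditions the rule would replace images for every visitor.
        raise ValueError("need at least one domain to block")
    lines = ["RewriteEngine On"]
    for i, domain in enumerate(domains):
        # Every condition except the last is joined to the next one with OR.
        flags = "[NC]" if i == len(domains) - 1 else "[NC,OR]"
        escaped = domain.replace(".", r"\.")  # dots are literal in a host name
        pattern = r"^https?://(.+\.)?" + escaped + "/"
        lines.append("RewriteCond %{HTTP_REFERER} " + pattern + " " + flags)
    lines.append(r"RewriteRule .*\.(jpe?g|gif|bmp|png)$ " + replacement + " [L]")
    return "\n".join(lines)


if __name__ == "__main__":
    print(blocklist_rules(["domain-to-block.com", "other-scraper.example"]))
```

Paste the printed lines into your .htaccess file and review them before uploading.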

Here’s a screenshot of a recent abuser’s site with my replacement image proudly displayed. From now on, every site that tries to post my content will be blocked this way.

I found many other stolen articles and hosted images from my tiny house design blog, as well as from many other sites. My best guess is that this blog belongs to someone trying to make money with Google AdSense by building up huge quantities of stolen content, which is a direct violation of the Google AdSense Terms and Conditions.

How to stop your content from being stolen

I find content/image thieves by setting up Google News alerts for my name, website name, and other keywords. I also review my server logs from time to time and look for big bandwidth hogs like out-of-control bots and spiders.
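Reading raw logs by eye gets old fast, so here is a minimal Python sketch of the kind of check I mean: it adds up the bytes served for image requests, grouped by the site that referred them, and skips your own pages. It assumes your server writes the common Apache "combined" log format. The function name bytes_by_referer and the sample domains are made up for the example.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Request, status, size, and referer fields of an Apache "combined" log line.
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
    r'(?P<size>\d+|-) "(?P<referer>[^"]*)"'
)
IMAGE = re.compile(r"\.(jpe?g|gif|bmp|png)$", re.IGNORECASE)


def bytes_by_referer(lines, own_domain):
    """Total image bytes served per outside referring host, biggest first."""
    totals = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        if not m or not IMAGE.search(m["path"].split("?")[0]):
            continue  # unparseable line, or not an image request
        host = urlparse(m["referer"]).hostname or ""
        if not host or host == own_domain or host.endswith("." + own_domain):
            continue  # no referer, or the request came from your own site
        totals[host] += 0 if m["size"] == "-" else int(m["size"])
    return totals.most_common()
```

To try it on a real log, pass in the open file, for example bytes_by_referer(open("access.log"), "mywebsite.com"), where access.log stands in for wherever your host keeps its logs. The hosts at the top of the list are the ones worth a closer look.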

To help control scammer traffic you can also edit your .htaccess file with some simple commands. You can and should use a robots.txt file too, but it will not stop the scammers… they just ignore it. Here’s some code you can add to block specific abusers. The one listed here is just an example of a site I have personally caught scraping content from one of my blogs.

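<Files *>
# Allow everyone, then deny the offenders listed below
order allow,deny
allow from all
# Block by IP address
deny from 97.74.183.1
# Block by host name; this takes a plain (partial) domain name, not a regex
deny from eggtea.com
</Files>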

Report them and end their gravy train

Another thing you can try is to contact the offending site’s web host. Reputable web hosts will often shut down spam sites and blog farms when tipped off to the illegal activity.

To find the web host you can often do a DNS lookup on the domain name. This is not 100% foolproof, but it often points you directly to the host. Just do a search using a tool like DNS Watch and find the name servers listed in the DNS record. These usually belong to the web host. Then email the host at abuse@their-domain-name.com and report the content theft.

If the offender is a Google AdSense publisher you can report them for terms of use violations. Just view the page source and grab their google_ad_client number, then visit Google AdSense and report them. If they are posting your stolen content on Facebook you can report them for copyright infringement.
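If you would rather not hunt through the page source by hand, here is a small Python sketch that pulls the google_ad_client value out of a saved copy of the page. It assumes the ad code assigns the ID as a quoted value, as in google_ad_client = "pub-…". Other ad snippets may be laid out differently, and the function name find_ad_clients is made up for the example.

```python
import re

# Matches google_ad_client = "..." or google_ad_client: '...' in page source.
AD_CLIENT = re.compile(r"""google_ad_client\s*[=:]\s*["']([A-Za-z0-9-]+)["']""")


def find_ad_clients(page_source):
    """Return the distinct google_ad_client values in the HTML, in page order."""
    found = []
    for value in AD_CLIENT.findall(page_source):
        if value not in found:
            found.append(value)
    return found


if __name__ == "__main__":
    # Placeholder HTML; use the source you saved from the offending page.
    html = '<script>google_ad_client = "pub-0000000000000000";</script>'
    print(find_ad_clients(html))
```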

Is it worth doing all this?

The image blocking trick is definitely worthwhile because it makes you less of a target. The rest of it is just a variety of countermeasures you can launch against these attackers. The trouble is that it all takes time and is probably bad for your karma… or at the very least a waste of your time. But if you can keep the really bad abusers at bay and slow them down as they pick your pockets, it can literally pay off by avoiding bandwidth and server load overage charges.

One Response to “How to slow down content thieves”

  1. Eric says:

    Great article
