As a blogger, I’ve long had my eye on content scrapers: people who steal content from your website and, with little or no change, put it on theirs.

Search engines love blogs, so content scrapers, in an effort to get the fresh content people are searching for, take the work someone else put into writing and use it for their own gain.

I used to pay attention to this when I was a daily blogger but have since stopped, in small part because it made me mad that someone had copied my work, but in larger part because I realized the people doing it were desperate and probably not going to get very far. I have bigger fish to fry.

Matt noticed a few weeks ago that some websites, when he copied a snippet from them, added a URL back to the original post. Of course, I hadn’t noticed this, so I immediately looked into it.



Here’s an example of what I mean.

Let’s say I go to Time’s website and find this article:

[Image: the article on Time’s website]

And in this article I find something I want to save somewhere. I highlight and copy the text:

[Image: the article with the quote highlighted]

When I go to paste that text, here is what happens:

If you want some insight into why the Department of Justice put a gate hold on the merger between American Airlines and US Airways, here’s a number to ponder: 13 million seats—gone. That’s how many airplane seats have disappeared over the past year—seats removed from the system by the airlines as they reduce capacity.

Read more: http://content.time.com/time/magazine/article/0,9171,2150624,00.html#ixzz2dw7wHcn4




As you can see, the quote is preserved, but an attribution link with a tracking code is automatically added.
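For the curious, here is a rough sketch of how a script could build that pasted text. This is not the actual code behind the Time example; the function name `withAttribution`, its parameters, and the minimum-length rule are all assumptions made up for illustration. Only the "Read more:" label and the URL-plus-tracking-code format come from the paste above.

```typescript
// Hypothetical helper: builds the text that would end up on the clipboard.
// The "Read more:" label mirrors the pasted example; everything else
// (names, parameters, the minimum-length rule) is an assumption.
function withAttribution(
  selectedText: string,
  pageUrl: string,
  trackingId: string,
  minLength: number = 20
): string {
  const text = selectedText.trim();

  // Assumed rule: leave very short selections alone,
  // for example when someone copies a single name or phrase.
  if (text.length < minLength) {
    return text;
  }

  // Drop any fragment already on the URL so the tracking code is the only one.
  const baseUrl = pageUrl.split("#")[0];

  return `${text}\n\nRead more: ${baseUrl}#${trackingId}`;
}

// Example using the article URL and tracking code from the paste above.
const pasted = withAttribution(
  "That's how many airplane seats have disappeared over the past year.",
  "http://content.time.com/time/magazine/article/0,9171,2150624,00.html",
  "ixzz2dw7wHcn4"
);
console.log(pasted);
```

In a browser, a function like this would typically be called from a `copy` event listener, which reads the selection, calls `event.clipboardData.setData("text/plain", ...)` with the new text, and then calls `event.preventDefault()` so the browser does not overwrite it. The unique tracking code at the end of the URL is what lets a report count visits that arrive through a pasted link.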

I thought this was kind of neat, so I installed Tynt, the service that adds these links, on my site.

Something I didn’t think of when I did this was that Tynt could now track when content left my site:

[Image: Tynt report on content copying]

Now, as you can see, most people who copied text from this site (26 this week) just deleted the tracking link when they pasted… but 6 didn’t, and people got to my site from those links. Interesting.

What’s my most copied article, you ask? Here it is: https://breakingeveninc.com/the-pros-and-cons-of-google-apps/ (At least 2-3 copies a week. Who knew it was that good?)

So what’s my point in all this? If you write online, there is a pretty good chance someone is using your content (at best, getting information and attributing it; at worst, stealing it). But there are tools that let you not only measure this but also make copying a little more annoying.

Have you created a PDF or some other piece of content people can download? Put your watermark on the bottom, then have some fun like my friend Peter did:

[Image: someone copying Peter’s work]

If you really want to go after an offender, this blog post is a pretty detailed how-to: http://blog.kissmetrics.com/content-scrapers/

This blog post, however, is more to tell you that it can be an amusing pastime to watch where your ideas go… and that I’m going to keep my eye on Tynt for a while longer.


