
Tim Berry

We Need Wikipedia Scanner For Mayfloggers


Like so many others, I'm gloating over Virgil Griffith's new Wikipedia scanner application, which unmasks the Wikipedia whitewashing done by corporate, government, and PR interests. Bloggers and journalists are having a field day with it. Technorati shows 762 blog posts on it as I write this. Google shows more than 320,000 web pages. And the WikiScanner home page has an honor roll of mainstream news organizations picking up the story.

Go Virgil, well done. Take a deep breath and a well-earned pause. Then go after mayflogging.

What's mayflogging? I was hoping you'd ask. It's what Wikipedia calls blog scraping. Copy a blog post, put it onto a temporary blog immersed in ads, and put up a flock of one-day parasite blogs to link to it and move it up in the web searches. Catch some web searchers, get some clicks, make some money, and disappear. I say we should call it mayflogging. That's "mayfl" for mayfly, and "ogging" for blogging, and the additional "flogging" in the middle because it sounds painful and it should be. The mayfly is an irritating bug that lives for just a day or so and spoils an occasional warm May afternoon by a lake. I'm glad they don't bite, but they do fly around your head and irritate the bejesus out of you, which of course is what these mayfloggers do. The illustration here, from Wikipedia, is a gang of mayflies covering a truck. Click on the image. It doesn't show well at this small size. It's ugly. So it's a lot like mayfloggers with a post.

[Image: 2007-08-26-mayflogging.jpg]

It can get ugly very quickly. For example, look at the search results here to the right, from a Google blog search done today. Every one of the references you see takes a column first published at entrepreneur.com and repeats it word for word, without permission, as bait to get people to go to a temporary blog and click on some advertisements. It's like radar jamming: dozens of listings with those search terms embedded in the copy but no content except clicks for ads, or some article stolen from some other website and repeated ad nauseam to generate click-throughs.
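For anybody who wants to check whether a page is repeating a column word for word, here is a minimal sketch in Python. It uses word shingles (overlapping runs of words) and measures how much of the original shows up inside the suspect page. The function names, the five-word shingle size, and the 0.8 threshold are all my own assumptions for illustration, not anything a search engine is known to use.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole thing as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def containment(original, candidate, size=5):
    """Fraction of the original's shingles that also appear in the candidate.

    Containment (rather than plain Jaccard similarity) is used because a
    scraped copy is usually the original plus a lot of extra ad text.
    """
    source = shingles(original, size)
    if not source:
        return 0.0
    return len(source & shingles(candidate, size)) / len(source)


def looks_scraped(original, candidate, threshold=0.8, size=5):
    """Hypothetical rule of thumb: flag pages that contain most of the original."""
    return containment(original, candidate, size) >= threshold
```

A page that wraps the full article in ad copy would score close to 1.0, while an unrelated post that merely shares the search terms would score near 0.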

Why do you care? First, because this is commercial pollution: it fouls blog searches with sludge. Think of the swamp gunk fouling up those Internet tubes. It eats up infrastructure and interferes with searching. Second, because they're stealing your time. Third, if you're posting at all, they are stealing your content. Yes, as you've probably already guessed, I wrote the entrepreneur.com article they're stealing in my example here.

Here's how it works: You write a piece and publish it or have it published somewhere. It looks like it will attract people looking for that topic in the search engines. These blog scrapers pick it up and copy it onto a blog, surrounding it with ads. Then they create, using software designed to make that simple, dozens of instant blogs using a free blogging tool like Google's Blogger. Those new blogs link to the target ad-soaked blog and push it up in the searches. Then when normal people search for "sales forecast" they find the ad-soaked blog, which is using your work without permission and, in many cases, against your interest. You probably don't like to see your soulful post surrounded by trashy ads. And when it comes right down to it, nobody asked you, but it's your work they're using.
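The flock of instant blogs pointing at one target leaves a pattern that could, in principle, be measured. The Python sketch below is only an illustration under my own assumptions: it supposes you already have a list of inbound links with the creation date of each linking blog, and the seven-day age limit, the ten-link minimum, and the 0.8 share are made-up numbers. Nothing here describes how any real search engine works.

```python
from datetime import date


def young_link_share(inbound, today, max_age_days=7):
    """Fraction of inbound links that come from blogs created very recently.

    `inbound` is a list of (blog_url, created_date) pairs; this data shape
    is hypothetical.
    """
    if not inbound:
        return 0.0
    young = sum(
        1 for _, created in inbound
        if (today - created).days <= max_age_days
    )
    return young / len(inbound)


def looks_like_link_flock(inbound, today, min_links=10, min_share=0.8):
    """Flag a target whose links come mostly from brand-new blogs."""
    return (
        len(inbound) >= min_links
        and young_link_share(inbound, today) >= min_share
    )
```

A target with a dozen links, all from blogs created yesterday, would be flagged; a target linked from a dozen blogs that have been around for years would not.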

I've seen my work surrounded by all kinds of creepy ads. Usually it's ads for canned, rehashed business plans, just a useless twist on sample plans; but last week I saw my sales forecast article surrounded by ads for singles, supposedly available women in my home town. That's a strange piece of sludge marketing, to be sure. I'd like to object, but, like the mayfly that lives annoyingly for just a single day, these blogs are gone two days later. Maybe Virgil can make the mayfloggers accountable.

CNET has a piece this month on how blogger Lorelle VanFossen reacts to mayflogging. They call it "please don't steal this web content."

Lorelle VanFossen is passionate. An author, travel writer and nature photographer, she also has a popular blog about, well, blogging. Her pet peeve is online plagiarism, which she encounters nearly every day.

"It's one of my favorite subjects," she said. "I make my living from my writing, and when people take it because they are ignorant of copyright laws--or think that because it's on the Internet, it's free--it makes me really mad. It's stealing content, in my mind."

What's to do about it? Governments obviously can't help. Google is apparently working on it, but it's a tough case because Google's free Blogger facility is one of the frequently used tools of the mayfloggers. And how do they distinguish between an interesting new blog and a mayflog? Should the Google blog search algorithms discount something for just-created blogs with only a couple of posts? The Technorati blog search does something like that with what it calls an authority rating for a blog, but the mayfloggers get into Technorati too.
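To make the idea of discounting just-created blogs concrete, here is a rough Python sketch. It is not Google's algorithm or Technorati's authority rating; the function names, the 90-day and 20-post thresholds, and the floor value are assumptions I picked for illustration. The floor is there so that an interesting new blog gets pushed down rather than wiped out.

```python
def new_blog_weight(age_days, post_count, full_trust_age=90, full_trust_posts=20):
    """Weight between 0 and 1 that grows with a blog's age and post count."""
    age_factor = min(age_days / full_trust_age, 1.0)
    post_factor = min(post_count / full_trust_posts, 1.0)
    return age_factor * post_factor


def adjusted_score(relevance, age_days, post_count, floor=0.1):
    """Scale a relevance score down for very new, very thin blogs.

    `floor` keeps a genuinely interesting new blog from vanishing entirely.
    """
    weight = max(new_blog_weight(age_days, post_count), floor)
    return relevance * weight
```

With these made-up numbers, a one-day-old blog with two posts keeps only a tenth of its relevance score, while a blog that is a year old with 200 posts keeps all of it. It would not stop mayfloggers who keep their blogs alive longer, which is part of why this is a tough case.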

Unmasking the mayfloggers won't solve the problem, but it would help. I'm rooting for Virgil Griffith, or anybody else out there with that kind of combination of know-how and initiative.