Managing Comment Spam

Posted: December 8, 2009

Comment spam seems inevitable. Every blog, podcast, or site I've run that allowed comments turned into a place for comment spam. Over the years I've tried numerous methods to kill the spam, and most of them either didn't work very well or took too much of my time for something that is merely annoying. After dealing with this problem for several years I now have a setup that manages most comment spam with little work on my part.

Mollom or reCaptcha

Every user who can comment without having gone through some kind of account registration with email verification is put through a filtering mechanism. My go-to solution is Mollom. It's a service that checks the content and reports whether the comment is spam (the bad stuff), ham (the good stuff), or that it's not sure. If it's spam the comment isn't allowed, if it's ham the comment is posted, and if it's unsure a captcha is displayed. This keeps user annoyance to a minimum while still providing some level of protection.
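
The three-way decision described above can be sketched in a few lines. This is only an illustration of the flow, not Mollom's actual API: the `Verdict` enum, the `handle_comment` function, and the action names are all made up for the example.

```python
from enum import Enum


class Verdict(Enum):
    """Hypothetical result of a content-filtering service."""
    SPAM = "spam"
    HAM = "ham"
    UNSURE = "unsure"


def handle_comment(verdict: Verdict, captcha_passed: bool = False) -> str:
    """Return the action to take for a comment given the filter's verdict."""
    if verdict is Verdict.SPAM:
        # Known bad content is never posted.
        return "reject"
    if verdict is Verdict.HAM:
        # Known good content is posted without bothering the user.
        return "publish"
    # The filter is unsure: only now does the user see a captcha.
    return "publish" if captcha_passed else "show_captcha"
```

The point of the design is that the captcha is the fallback rather than the default, so most legitimate commenters never see one.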

For cases where I want a captcha on every comment I often use reCaptcha instead. The captcha display is a little more slick and usable.

This kills most comment spam on newer posts where any conversation is actually happening. For Drupal the Mollom and reCaptcha modules provide integration with these services.

Closing Comments On Old Posts

Older posts are known to attract comment spam. They've been linked to and might rank well in Google. If you aren't following them anymore, they're a place where spammers can slip in comments that the site maintainers may not notice.

To deal with this I simply close comments on posts older than a month. Useful comments very rarely come in on older posts, and the conversation that happens when something is first posted is over by then.

The comment closer module in Drupal provides this.
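
The rule itself is simple enough to sketch. The snippet below is not how the Drupal module works internally; it's a minimal illustration that assumes "a month" means 30 days, and the names `COMMENT_WINDOW` and `comments_open` are invented for the example.

```python
from datetime import datetime, timedelta

# Assumption: "older than a month" is treated as older than 30 days.
COMMENT_WINDOW = timedelta(days=30)


def comments_open(published_at: datetime, now: datetime) -> bool:
    """Return True while a post is still young enough to accept comments."""
    return now - published_at <= COMMENT_WINDOW
```

A check like this would run either when rendering the comment form or on a scheduled job that flips a per-post setting.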

Nofollow Attribute

If a spam comment or two gets through my setup and I don't catch it, I don't want to help the spammers out. To deal with that, all links in a comment have the nofollow attribute attached. This tells search engines not to follow the link, so the spammer gets no ranking benefit from it.

In Drupal this can be done in the input filter configuration.
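
For anyone not on Drupal, here is a rough sketch of what such a filter does to comment markup. It is not Drupal's implementation; the `add_nofollow` function is hypothetical, and the regex approach assumes simple, well-formed anchor tags (a real HTML parser is safer for arbitrary input).

```python
import re

# Matches an opening <a ...> tag and captures its attributes.
_A_TAG = re.compile(r"<a\b([^>]*)>", re.IGNORECASE)
# Matches an existing rel attribute so it can be replaced.
_REL_ATTR = re.compile(r"""\srel\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)""", re.IGNORECASE)


def add_nofollow(html: str) -> str:
    """Force rel="nofollow" onto every link in a comment body."""
    def rewrite(match: re.Match) -> str:
        # Drop any rel value the commenter supplied, then add our own.
        attrs = _REL_ATTR.sub("", match.group(1))
        return f'<a{attrs} rel="nofollow">'

    return _A_TAG.sub(rewrite, html)
```

Calling `add_nofollow('<a href="http://example.com">hi</a>')` gives back `<a href="http://example.com" rel="nofollow">hi</a>`.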

This setup keeps my work managing spam at a minimum and still allows for conversations to happen. What tricks work to keep spam down on your site?

Reader Comments

Thanks for the info on Mollom. I hadn't heard of it before this article. The free version would fit right in with the small non-profit (Drupal) sites we set up.

How does reCaptcha compare to Egglue Semantic CAPTCHA? I like the Egglue Semantic CAPTCHA module; it seems to do the job for me, but my site gets very little traffic. Perhaps a comparison article would be good. Does the nofollow attribute hurt contribution? I have heard that some people are offended if you use the nofollow attribute. They feel that they are contributing and should get something in return.

Seems like Mollom has trouble detecting what is not spam. When I made this post and clicked preview it gave me a captcha test.

The Egglue Semantic CAPTCHA looks interesting but I would still prefer reCaptcha or Mollom.

Egglue suffers from a few problems.

First, it's based on the English language. What about people who don't know English but instead know a different language? Is it localized to the operating system language? I don't see anything documented about this, and being multilingual is important.

Second, this system is text based (or looks to be). Even with over 100,000 different patterns, a system can easily be set up to crack it. And once a bot knows the system, it would own all the captchas on all the sites using it.

This isn't the first time an idea like this has come up. There is a reason it's not really popular right now.

Heads-up!
Comments on this blog seem to be broken on this page: http://engineeredweb.com/blog/09/11/building-stack-overflow-clone-drupal... (on all pages of related posts actually).
Please check for us? - Thanks!

What is broken? You were able to post.