Again: Spam on Drupal Forms
A couple of weeks ago I was very optimistic about using Mollom as my primary anti-spam solution. However, I recently had a site where 14,000 comments were posted within 3 days, even though Mollom was enabled and should have prevented exactly that.
It turns out that the HTTP request Drupal sends to fetch a captcha returned a network error, so the comments were posted anyway. The captcha is Mollom’s fallback for comment forms and the only way it protects the login/password/registration forms. And yes, I configured Mollom to allow comments if the service fails. Why? Because the user experience is what counts, and blocking legit comments is the last thing you want to happen.
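The failure mode above is the classic "fail open" trade-off. Here is a minimal Python sketch of it, not Mollom's actual code: `check_remote` is a hypothetical callable standing in for the network call to the spam service, and the "ham" verdict string is an assumption made for illustration.

```python
def should_accept(check_remote, fail_open=True):
    """Decide whether a submission is accepted.

    check_remote is a hypothetical callable that asks an external
    service for a verdict and raises OSError on network trouble.
    """
    try:
        verdict = check_remote()
    except OSError:
        # Service unreachable: with fail_open=True everything gets through,
        # which is exactly how the spam flood slipped past.
        return fail_open
    return verdict == "ham"
```

With `fail_open=True` legit users are never blocked by an outage, but neither are the bots. With `fail_open=False` it is the other way around.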
The only solution was to disable comments entirely for the time being. I don’t know the reason for the network errors, but it’s THE big problem with off-site third-party services which provide what are essentially security services to your website via network calls. There are no uptime guarantees, neither for the service itself nor for your own network. Everything can go down, and the chances of that happening are actually not that small. Of course I acknowledged the possibilities before I set Mollom up, but .. oh well.
But what do I do now? I really hate captchas and I’m a fan of handling such things even before PHP uses any memory. Once you get hammered by spambots, you will notice quite a bump in memory and CPU usage, simply because those pesky bots just don’t know when to stop. Blocking them that early could be doable using http:BL and an Apache mod. However, http:BL has a couple of false positives floating around (for example, the whole mobile/GPRS network of Swisscom, the major Swiss ISP, ends up on the list again and again).
On a side note, the site which got hammered filled up my database, because Drupal kept filling a cache table, cache_form: each comment can have another comment form attached (reply), and each form is cached in the DB. That was ugly: it filled up the partition within 2 days. That’s a no-go. And that’s also why I came up with the following custom-tailored solution:
Honeypot is another Drupal anti-spam module. It’s user-friendly (read: no captchas) and it catches more than 99% of spam according to my recent testing. It works in two ways: First, it denies POSTs if the form is submitted less than X seconds after it was loaded. Second, it adds a hidden field to forms (the “honeypot”) which should stay empty, because a human user cannot see the field and therefore cannot fill anything into it. Honeypot allows every attempt to be logged to watchdog, which is where it gets interesting, because now you have an IP and proof of a failed spam attempt.
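To make the two checks concrete, here is a minimal Python sketch. It is not Honeypot's actual code (the module is written in PHP); the function name, the parameters and the 5-second default are assumptions made for illustration.

```python
MIN_SECONDS = 5  # hypothetical value for the "X seconds" threshold


def is_spam(form_loaded_at, submitted_at, honeypot_value, min_seconds=MIN_SECONDS):
    """Return True if a submission trips either honeypot-style check.

    Timestamps are seconds (e.g. from time.time()); honeypot_value is
    whatever arrived in the hidden field that humans never see.
    """
    # Check 1: the form came back faster than a human could fill it in.
    too_fast = (submitted_at - form_loaded_at) < min_seconds
    # Check 2: the hidden field is not empty, so a bot filled it in.
    trap_filled = bool(honeypot_value.strip())
    return too_fast or trap_filled
```

A submission that takes 30 seconds and leaves the hidden field empty passes. One that arrives after 1 second, or with anything in the hidden field, gets rejected.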
That’s where CSF comes into play: CSF is a firewall setup which integrates nicely into WHM/cPanel. It allows you to add and remove IPs to and from temporary or permanent blacklists. All you need now is a bit of glue: a rule which fires 1 line of PHP, writing the IP from the watchdog entry to a temporary file on the server, and a cron job which splits the temporary file into single csf.deny entries and calls CSF’s Perl script to add each IP to a blacklist, either temporary or permanent. The cool thing about that is that you don’t just get comment form spam attempts logged, but also bots trying to register an account or resend a password - not major security issues, but still something you really don’t need to happen in the first place.
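The cron side of that glue could look roughly like the following Python sketch. This is not the actual wrapper script; the function names and the "honeypot" comment are made up for illustration. It relies on CSF's `csf -d` (permanent deny) and `csf -td` (temporary deny with a TTL in seconds) commands, and only builds the argument lists, so you can inspect them before anything touches the firewall.

```python
import ipaddress


def parse_ips(lines):
    """Return the unique, valid IPs from the logfile lines, in order."""
    seen = []
    for line in lines:
        candidate = line.strip()
        if not candidate:
            continue
        try:
            # Reject anything that is not a real IPv4/IPv6 address, since
            # every http user can write to the logfile.
            ip = str(ipaddress.ip_address(candidate))
        except ValueError:
            continue
        if ip not in seen:
            seen.append(ip)
    return seen


def csf_command(ip, ttl_seconds=None, comment="honeypot"):
    """Build the csf call: permanent deny, or temporary if a TTL is given."""
    if ttl_seconds is None:
        return ["csf", "-d", ip, comment]
    return ["csf", "-td", ip, str(ttl_seconds), comment]
```

The cron job would then read the temporary file, pass each result of `csf_command(...)` to `subprocess.run` as root, and truncate the file afterwards. Validating the IPs matters here, because a file that is writable by every http user ends up as arguments to a root command.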
Once an IP ends up on a temporary or permanent blacklist, all further attempts from this IP don’t make it past the firewall and have no impact on the web stack. And all of this without any network calls, DNS calls to http:BL, and so on.
So in short: a rule with 1 line of PHP code which fires every time Honeypot adds a watchdog message, a temporary logfile every http user can write to, and a little wrapper script which pipes the IPs to CSF’s blacklist. Pretty simple compared to all the fuss of implementing external services. I packaged the rule into a feature, and the result is a module which I can turn on and off on all my hosted Drupal sites at once with one CLI command (I got my little toolkits for that..). Removing an IP can be done either via CSF’s WHM/cPanel plugin or on the CLI, too.
Another Drupal module I had my eyes on recently is an implementation of the hashcash algorithm. But I have yet to figure out if those methods can actually co-exist. That would add another layer of protection without having to resort to captchas. In general, it comes down to building a stack of multiple, user-unobtrusive spam prevention measures. That should give a fairly high level of protection without having to annoy anyone by making them read ugly, distorted letters.
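For anyone who hasn't met hashcash: the idea is a small proof of work which the client has to solve before the server accepts the POST. The following Python sketch is a simplified variant, not the Drupal module's code and not the real hashcash stamp format (which uses SHA-1 and a structured header). It uses SHA-256 and a bare challenge string, and the function names and the challenge value are assumptions for illustration.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest):
    """Count how many zero bits the digest starts with."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def verify(challenge, nonce, bits):
    """Server side: one hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= bits


def solve(challenge, bits):
    """Client side: brute-force a nonce, about 2**bits hashes on average."""
    for nonce in count():
        if verify(challenge, nonce, bits):
            return nonce
```

Usage would be something like `nonce = solve("form-123", 12)` in the client and `verify("form-123", nonce, 12)` on the server. A human never notices the delay for a single form, while a bot posting thousands of times has to pay for each attempt.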
