Dealing with form spam

Let’s start by looking at the timeline of a form submission, and then we can look at what can be done to combat spam at each stage of that timeline.

  1. A user (or a bot) visits a page on our site.
  2. The user (or bot) fills out and submits the form.
  3. Our system processes the form submission.
  4. A message is displayed to the user (or bot) who filled out the form.

Now let’s look at our options:

1. How to stop spam as soon as a user (or a bot) visits a page.

If the visitor is a human spammer, this is probably not a good time to try to block them, because at this point they will most likely be indistinguishable from a legitimate visitor. However, if the visitor is a bot, there are a number of services that could potentially block it before it even attempts to fill out the form. Cloudflare bot protection is one of the best-known services of this kind.

2. How to stop spam while the user (or bot) fills out and submits the form.

There are two common methods for stopping spam during the form submission itself:

  • Honeypots. Honeypots are form elements that a bot will fill out differently than a human. Usually this is done with an input field that is hidden from view, but that a bot will fill out because it’s looking at the site code, not the visible site. Honeypots might work against unsophisticated bots, but they typically don’t pose a problem for more advanced bots.
  • CAPTCHA. A CAPTCHA presents the user with some sort of test to try to determine whether they’re human. The most popular CAPTCHAs are reCAPTCHA by Google, Turnstile by Cloudflare, and hCaptcha by Intuition Machines. reCAPTCHA worked well for me years ago, but as time has passed more and more bots have learned to bypass it. Turnstile has worked pretty well for me in the limited number of places I’ve used it. And hCaptcha…I’ve never used it, so I can’t comment on its effectiveness.
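
To make the honeypot idea concrete, here is a minimal server-side sketch in Python. It assumes the form includes a visually hidden field, which I’ve arbitrarily called "website", and that the submitted fields arrive as a plain dictionary. Both the field name and the function name are made up for illustration, not taken from any particular form library.

```python
def looks_like_bot(form_data: dict, honeypot_field: str = "website") -> bool:
    """Return True when the hidden honeypot field came back non-empty.

    A human never sees the field, so it should be missing or blank.
    A bot reading the raw markup will often fill it in.
    """
    # Treat a missing field or None the same as an empty string.
    value = form_data.get(honeypot_field) or ""
    return bool(str(value).strip())


# Example: the second submission filled in the hidden field.
human = {"name": "Ann", "email": "ann@example.com", "website": ""}
bot = {"name": "Ann", "email": "ann@example.com", "website": "http://spam.example"}
print(looks_like_bot(human), looks_like_bot(bot))
```

A check like this only catches bots that fill in every field they find, which matches the caveat above about more advanced bots.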

3. How to stop spam while processing the form submission.

The more information we’re collecting in our form, the more data we have to work with when attempting to remove spam. Sadly, the more data we ask for, the fewer people will finish filling out the form.

So what can be done to recognize spam after the form has been submitted?

  • Out of place data. If there is a URL in a field meant for a first name, it’s probably spam. If there’s an email address in a field meant for a mailing address, it’s probably spam.
  • Fake email addresses. There are services that can check if an inbox exists for the provided email address.
  • Content. There are also services that can look at the content of a message and guess whether it is spam. However, for good results, the message needs to contain enough data to make an educated guess. If it’s just a name and an email address, there’s not much to work with.
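
The “out of place data” check is the easiest of these to sketch without an outside service. The Python below is a rough example, assuming the form has fields named "first_name" and "mailing_address" (hypothetical names) and using deliberately loose patterns that are only meant to spot something URL-like or email-like, not to validate anything.

```python
import re

# Loose patterns: good enough to notice a URL or email, not to validate one.
URL_PATTERN = re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def out_of_place_reasons(form_data: dict) -> list[str]:
    """Return a list of reasons the submission looks like spam (empty if none)."""
    reasons = []
    first_name = form_data.get("first_name") or ""
    mailing_address = form_data.get("mailing_address") or ""

    # A URL has no business being in a first-name field.
    if URL_PATTERN.search(first_name):
        reasons.append("url_in_first_name")

    # An email address has no business being in a mailing-address field.
    if EMAIL_PATTERN.search(mailing_address):
        reasons.append("email_in_mailing_address")

    return reasons
```

Returning the reasons rather than a simple yes/no makes it easier to log why a submission was flagged and to tune the rules later.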

4. How to stop spam when the confirmation message is displayed.

This doesn’t directly address the spam issue, but it should be noted that giving detailed feedback about why a form submission was rejected can help train bots to bypass your safeguards. “Your submission cannot be processed right now, please try again later” doesn’t tell the spammer/bot what went wrong, so it doesn’t know what to try next. “I’m sorry, that is not a valid email address” tells a spammer/bot exactly what they need to do differently to get past the anti-spam measures.
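
One way to put this into practice is to keep the detailed reason on the server and show every rejected submission the same message. Here is a small Python sketch of that idea. The logger name, the function name, and the reason strings are all placeholders I’ve invented for the example.

```python
import logging

logger = logging.getLogger("form_spam")

# The same vague message for every rejection, whatever the cause.
GENERIC_REJECTION = (
    "Your submission cannot be processed right now, please try again later."
)


def rejection_response(reasons: list[str]) -> str:
    """Log the real reasons for ourselves and return the generic message."""
    # The detail goes to the server log, where the spammer can't see it.
    logger.info("Form submission rejected: %s", ", ".join(reasons))
    return GENERIC_REJECTION
```

The trade-off is that a real person who mistyped their email address gets the same unhelpful message, so it may be worth giving specific feedback for checks that don’t reveal anything about your anti-spam measures.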

 

 
