Tutorial #45
reCAPTCHA Intermediate   2014-01-24

Introduction

CAPTCHA is an acronym for Completely Automated Public Turing test to tell Computers and Humans Apart and is an attempt to ensure that only humans can submit certain web forms. These are the images containing text, often distorted, that you are asked to enter into a form before submission. They serve as a way to deter automated software from creating fake user accounts, submitting spam as comments, etc.

You may want to include a CAPTCHA widget at points in your application that are vulnerable to fake, automated submissions. One of the best known of these services is reCAPTCHA, now owned by Google, which not only challenges your users but also helps digitize books as a side benefit.

The image that is shown to the user as a challenge has two words. One of these is known and the user response is expected to match this. The other is unknown and Google uses the response to this to help digitize documents. Capturing just a few letters at a time may seem a very inefficient way to do this, but when multiplied by all the calls made to reCAPTCHA every day the amount of text that can be digitized becomes very significant.

This tutorial shows you how to set up reCAPTCHA in your web forms and how to validate the user's response on your server.

Demo 1 screenshot for this tutorial


Step by Step

reCAPTCHA is free to use but you do need an API key from Google.

Go to Create a reCAPTCHA key and enter the domain that you want to use the key with. You can enter a specific site (e.g. apprentice.craic.com) or check the box to create a global key that works on all subdomains of a domain (e.g. craic.com).

Image 1 for this tutorial

The page that you get back contains a Public Key, which you include in your web page, and a Private Key, which you use in your server-side script. Keep the Private Key secret. In the wrong hands it could be used to overload the reCAPTCHA server, which could result in your account being blocked.

Image 2 for this tutorial

Once you have your keys then you can add them to your own code.


Understanding the Code

This example involves both Client-side and Server-side code. The Client code fetches the reCAPTCHA image from Google, displays it in a special widget and captures the user's input. This is passed along with the rest of your form input to your server. The Server-side code takes the reCAPTCHA values from the form, adds your Private Key and the user's IP address, and submits these to Google's reCAPTCHA server, which responds with success or failure and possibly an error message.

On the Client web page you typically embed the reCAPTCHA code directly in your form. The form in this example is minimal with a single text input field wrapped in a Form object that will post its data to the demo server.

The reCAPTCHA code has two parts. The first part, with the script tag, loads the reCAPTCHA JavaScript from the url http://www.google.com/recaptcha/api/challenge with the k argument set to your Public Key.

The second part, with the noscript tag, loads an iframe containing code that will work if the user does not have JavaScript enabled.

These days the vast majority of users have JavaScript enabled. Either way, reCAPTCHA adds two fields to your form with the names recaptcha_challenge_field and recaptcha_response_field. You will need to access the values of these parameters in your server script.
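
The two-part embed described above could be generated by a small Ruby helper such as the sketch below. This is not the tutorial's own markup: the helper name recaptcha_tags is hypothetical, and the noscript URL, iframe size and manual_challenge value are assumptions based on the usual form of the reCAPTCHA embed. Only the challenge URL and the two field names come from the text above.

```ruby
require 'cgi'

# Base URL of the reCAPTCHA endpoints named in this tutorial.
RECAPTCHA_API = 'http://www.google.com/recaptcha/api'

# Hypothetical helper: returns the HTML to place inside your <form>.
# The noscript URL and the iframe/textarea attributes are assumptions.
def recaptcha_tags(public_key)
  key = CGI.escape(public_key)
  <<~HTML
    <script type="text/javascript" src="#{RECAPTCHA_API}/challenge?k=#{key}"></script>
    <noscript>
      <iframe src="#{RECAPTCHA_API}/noscript?k=#{key}" height="300" width="500" frameborder="0"></iframe><br>
      <textarea name="recaptcha_challenge_field" rows="3" cols="40"></textarea>
      <input type="hidden" name="recaptcha_response_field" value="manual_challenge">
    </noscript>
  HTML
end
```

In an ERB template you might call it as <%= recaptcha_tags('YOUR_PUBLIC_KEY_GOES_HERE') %> just before the form's submit button.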

The Server-side code on the Demo server is a simple Sinatra action that assembles and submits a POST request to the reCAPTCHA server at http://www.google.com/recaptcha/api/verify. For convenience I am using the Ruby library HTTParty from John Nunemaker to handle the submission and response.

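There are four required parameters: privatekey (your Private Key), challenge (the value of recaptcha_challenge_field), response (the value of recaptcha_response_field) and remoteip (the user's IP address).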

You can get the Remote IP address from request.ip in Sinatra. reCAPTCHA probably wants this in order to identify machines that are possibly trying to circumvent the process.
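
As a rough sketch, the four parameters could be collected in one place before the request is made. The helper name recaptcha_verify_params is hypothetical; the parameter and form field names are the ones used in this tutorial's server code.

```ruby
# Hypothetical helper: builds the hash of fields sent to the verify URL.
# form_params is the submitted form data (params in a Sinatra action).
def recaptcha_verify_params(private_key, form_params, remote_ip)
  {
    privatekey: private_key,
    challenge:  form_params['recaptcha_challenge_field'],
    response:   form_params['recaptcha_response_field'],
    remoteip:   remote_ip
  }
end
```

Inside a Sinatra action this might be called as recaptcha_verify_params('YOUR_PRIVATE_KEY_GOES_HERE', params, request.ip), with the result passed to HTTParty.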

The HTTP response from reCAPTCHA contains two lines as its body. If the user entered the correct text then the response reads true followed by success. If not, the response is false followed by incorrect-captcha-sol if the text entered did not match what was displayed to the user, or captcha-timeout if the user waited too long before submitting the request. Your server code can then act upon that response. Here it just echoes it to the web page.

    
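    # Only call reCAPTCHA if the form was actually submitted with a challenge
    if params['recaptcha_challenge_field']
      # Send the four required parameters to the reCAPTCHA verify URL
      @recaptcha_response = HTTParty.post('http://www.google.com/recaptcha/api/verify',
                  :query => { :privatekey => 'YOUR_PRIVATE_KEY_GOES_HERE',
                              :challenge => params['recaptcha_challenge_field'],
                              :response  => params['recaptcha_response_field'],
                              :remoteip => request.ip
                            })
    else
      # First visit to the page: there is nothing to verify yet
      @recaptcha_response = nil
    end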

    erb :tutorial_45_demo_1
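
The demo only echoes the response, but a real application would act on it. A minimal sketch of reading the two-line body follows; the helper name parse_recaptcha_body is hypothetical, and it assumes the body has exactly the format described above.

```ruby
# Hypothetical helper: splits the verify response body into its two lines.
# Returns [passed, message]; passed is true only when the first line is "true".
def parse_recaptcha_body(body)
  status, message = body.to_s.lines.map(&:strip)
  [status == 'true', message]
end
```

With the code above you could call passed, message = parse_recaptcha_body(@recaptcha_response.body) and, for example, redisplay the form when passed is false.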
    

reCAPTCHA is easy to set up and provides excellent protection against automated software accessing your site. But please don't overuse the feature, as it will quickly become annoying for your users. Use it when they create an account, or perhaps when they submit a comment or other text that could be abused to send spam.


More Information

reCAPTCHA documentation

reCAPTCHA digitization

