Monday, 27 February 2012

Rolling your own A/B split testing framework for Google Analytics - Part 1

Why We Rolled Our Own A/B Testing Framework

Any online startup out there knows the importance of A/B split testing their website pages so they can quickly and effectively make decisions about what pages, content, layouts, forms etc. are working to improve conversions versus those that are not.

There are many online services that help startups make this process easier by letting them create and test landing page variations (for example) in minutes, and then tie those variations to a conversion event so that understanding variation effectiveness becomes trivial.

The problem with this approach comes when your site grows more complex and you'd like to create tests for many pages within your site structure or signup flow, and run those tests in parallel.

This is the problem we were faced with a little while ago, and as I'm sure many people out there will attest - sometimes it's just best to roll your own and take full control over the A/B testing process.

Our Requirements

1) Easily add page test variations across multiple areas of the site concurrently.
2) Easily track the source of the traffic.
3) Easily test page variations based on traffic source.
4) Easily use Google Analytics to associate variations with goal completions.
5) Accurately measure split-test results using Google Analytics reporting.

With that in mind we got to work and set about building a basic framework that would allow us to achieve this as quickly as possible (and it ended up taking us two days to do).

The Framework Design
Our framework for A/B testing pages on our websites starts with a front-end controller through which all page requests are routed. The routing is activated through an .htaccess config in our web root:

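<IfModule mod_rewrite.c>
  RewriteEngine on
  # Serve the controller itself, resource directories, and robots.txt directly
  RewriteCond $1 !^(controller\.php|pages|scripts|include|images|robots\.txt)
  # Serve static assets directly (dots escaped so they match literally)
  RewriteCond %{REQUEST_URI} !\.(css|js|gif|png|jpe?g)$
  # Everything else goes through the front controller
  RewriteRule ^(.*)$ controller.php/$1 [L]
</IfModule>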

The mod_rewrite conditions above ensure that the controller itself and any static resources (such as JavaScript, CSS, and images) bypass the controller and are served directly.

For example, a request such as http://www.brandfu.com/images/logo.png would not be sent through the controller for a rendering decision, but requests such as http://www.brandfu.com/index or http://www.brandfu.com/tour would be. The example URL http://www.brandfu.com/tour would then be mapped to a site source file in the pages directory called tour.php.
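To make the routing decision concrete, here is a minimal sketch of the same two conditions and the URL-to-file mapping. It is written in Python purely for illustration (our controller is PHP and is not shown in this part), and the function names `goes_through_controller` and `page_source_file` are hypothetical. It does not handle a request for the site root.

```python
import re

# Mirrors the first RewriteCond: paths that start with these names are served directly.
DIRECT_PREFIX = re.compile(r"^(controller\.php|pages|scripts|include|images|robots\.txt)")
# Mirrors the second RewriteCond: static asset extensions are served directly.
STATIC_SUFFIX = re.compile(r"\.(css|js|gif|png|jpg|jpeg)$")


def goes_through_controller(request_path: str) -> bool:
    """Return True when a request path would be handed to the controller."""
    relative = request_path.lstrip("/")
    if DIRECT_PREFIX.match(relative):
        return False
    if STATIC_SUFFIX.search(request_path):
        return False
    return True


def page_source_file(request_path: str) -> str:
    """Map a routed path such as '/tour' to its source file in the pages directory."""
    page = request_path.strip("/")
    return f"pages/{page}.php"
```

With these definitions, `/images/logo.png` is served directly, while `/index` and `/tour` are routed, and `/tour` maps to `pages/tour.php`.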

To take care of requirements 1, 2 and 3 above, we tell the controller to look for page variations whose source file name contains the channel as well as the page requested.

Here's an example:

Suppose we have a channel called 'b2b' which we use in an AdWords campaign that targets direct end customers, and we'd like to split-test the landing page, in this case 'index.php'. The destination URL we would set up on our ad would be http://www.brandfu.com/b2b-index

And on the file-system we would explicitly create the following files with varying content:

b2b-index-1.php
b2b-index-2.php

Once a matching request hits our controller, it looks at the request URL and breaks it down into its component parts in order to:
  1. Identify and store the source channel parameter in a session.
    • If no channel is specified we default to 'none'.
  2. Combine the source parameter and page variation to figure out what to look for in a variation pool.
    • In this example the controller would be looking for a file count in the pages directory with a matching file glob pattern of 'b2b-index-?.php'.
  3. Display a random choice of one of the variation options and store the page-variation number in a session so that the website visitor will see the same page as before when navigating around the site.
    • In this case we'd randomly select between 1 and 2 and render the page that contains the chosen variation number.
  4. Send the 'channel', 'page', and 'variation' to Google Analytics for tracking later on.
This process can then be repeated for all areas of our site so that we can test different pricing pages, tour pages, and signup pages at the same time across multiple channels, and always maintain a consistent browsing experience for visitors who are navigating the site.
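The steps above can be sketched roughly as follows. This is an illustration in Python rather than our PHP controller, and it makes a few assumptions that are not spelled out in the steps: the channel is everything before the first hyphen, variations are numbered contiguously from 1, and the session is a plain dictionary. All function names and session keys are hypothetical, and the fallback described under Degrading Gracefully is left out.

```python
import glob
import os
import random


def split_request(name):
    """Split 'b2b-index' into ('b2b', 'index'); default the channel to 'none'."""
    channel, sep, page = name.partition("-")
    if not sep:
        return "none", name
    return channel, page


def count_variations(pages_dir, channel, page):
    """Count files matching '<channel>-<page>-?.php' in the pages directory."""
    pattern = os.path.join(pages_dir, f"{channel}-{page}-?.php")
    return len(glob.glob(pattern))


def choose_variation(session, pages_dir, name, rng=random):
    """Pick a variation once per visitor and reuse it on later requests."""
    channel, page = split_request(name)
    session["channel"] = channel              # step 1: remember the channel
    key = f"variation:{page}"                 # hypothetical session key
    if key not in session:
        count = count_variations(pages_dir, channel, page)  # step 2
        if count == 0:
            return None                       # no exact match in the pool
        session[key] = rng.randint(1, count)  # step 3: random choice, stored
    variation = session[key]
    # Step 4 would send these three values to Google Analytics.
    return {
        "channel": channel,
        "page": page,
        "variation": variation,
        "file": f"{channel}-{page}-{variation}.php",
    }
```

Because the chosen number lives in the session, a second request for the same page returns the same variation instead of re-rolling.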

Degrading Gracefully

To avoid the potentially embarrassing consequences of mistakenly omitting or updating a component of the URL structure, such as a channel name, we had to build some graceful degradation capability into our controller.

Here's another example:

If the URL we published in an AdWords ad didn't use the 'b2b' channel in its construction but rather used a channel name of 'direct', so that the full URL read http://www.brandfu.com/direct-index, we still wanted the index page variation to display correctly and have the data sent to Google.

In order to do this, the controller looks at the channel-page parameter combination and sets the session channel parameter to 'direct'. If it finds no explicit file matches, it degrades its search and looks only at the page parameter by changing its file matching pattern to *-index-?.php, and then randomly chooses a variation number.
In this case the actual file rendered would still be either b2b-index-1.php or b2b-index-2.php, but we'll send Google Analytics a channel of 'direct', a page of 'index' and a variation of either 1 or 2.
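A rough sketch of that fallback, again in Python for illustration only, with hypothetical function names. One assumption to flag: when the fallback pool contains files from more than one channel, this sketch simply picks a file at random from the whole pool, which may differ from how the real controller chooses.

```python
import glob
import os
import random


def find_variations(pages_dir, channel, page):
    """Try '<channel>-<page>-?.php' first, then degrade to '*-<page>-?.php'."""
    exact = sorted(glob.glob(os.path.join(pages_dir, f"{channel}-{page}-?.php")))
    if exact:
        return exact
    return sorted(glob.glob(os.path.join(pages_dir, f"*-{page}-?.php")))


def resolve(pages_dir, channel, page, rng=random):
    """Choose a file to render while reporting the channel that was requested."""
    candidates = []
    for path in find_variations(pages_dir, channel, page):
        stem = os.path.splitext(os.path.basename(path))[0]
        number = stem.rsplit("-", 1)[1]
        if number.isdigit():  # '?' matches any character, so keep digits only
            candidates.append((int(number), os.path.basename(path)))
    if not candidates:
        return None
    variation, filename = rng.choice(candidates)
    return {
        "channel": channel,      # reported channel stays as requested, e.g. 'direct'
        "page": page,
        "variation": variation,
        "file": filename,        # may belong to another channel, e.g. b2b-index-1.php
    }
```

For a request to 'direct-index' with only the b2b files on disk, the result reports a channel of 'direct' while the file to render is one of the b2b variations.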

Here is a simple flow diagram of how our A/B Testing controller functions:
The A/B Test Controller Decision Flow



In Part Two we'll look more deeply into the controller and how we integrate all the tracking data with standard and custom variables for Google's Urchin Tracking Module (UTM) for later analysis in Google Analytics.
