The ABCs Of A/B Testing
Testing should be at the core of your email marketing program. Not only does it help you understand the impact you’re making, but it gives you a much fuller understanding about your customers’ behavior and preferences. It not only tells you where you’ve been, but where you should (and shouldn’t) go with your campaigns.
A/B testing is the simplest, most straightforward testing method available. Most of you probably understand what A/B testing entails, but for those who don’t: an A/B test is a process in which you send different versions of an email to groups of subscribers large enough to produce statistically significant results, and then measure their reactions to those versions in order to understand which is more effective at driving the behavior you prefer.
At minimum, you should be consistently performing A/B testing across your entire program. Each campaign can yield insights from an A/B test that can provide incremental lifts that build on each other to optimize your revenue from email.
What To A/B Test
Almost anything in an email can be A/B tested, but try to focus on the larger aspects to prevent getting caught up on minute details that won’t have a large impact. Try testing around the following aspects:
Ah, the age-old question, “When is the best time to send an email?” According to a recent Experian study, the answer is 8:00 pm to 12:00 am; however, according to a recent study at DEG (where I work), 8:00 pm was statistically not a great time to send emails. In short, as any marketer will tell you, “It depends.” Each brand’s subscriber list varies, so you should test and see how your subscribers respond.
For testing time of day, I recommend a full 50-50 split test. Send to 50% of your list at your usual time of day, and 50% at a different time. One current theory is that consumers are making more purchases in the evening, when they are engaged with their tablets or mobile devices while watching TV.
Many A/B tests are conducted with a three-way split. For example, 10% for Group A, 10% for Group B, and 80% for the Winning Group. In this scenario, you send to Group A and Group B, determine the winner after a specified period of time, and send the winning version to the remaining Winning Group. With time-of-day tests, however, this is more difficult: your test groups would receive the email on one day and the winning deployment would be sent on a different day, so the result is also influenced by day of week.
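To make the 10/10/80 split concrete, here is a minimal Python sketch of how the groups might be drawn at random. The function name `split_for_test`, the `test_fraction` and `seed` parameters, and the example addresses are all hypothetical; in practice your email service provider will likely handle the split for you.

```python
import random

def split_for_test(subscribers, test_fraction=0.10, seed=42):
    """Randomly split a list into Group A, Group B, and the remainder.

    test_fraction is the share of the list given to EACH test group.
    """
    shuffled = list(subscribers)
    # A seeded shuffle keeps the split repeatable; the seed value is arbitrary.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    group_a = shuffled[:n_test]
    group_b = shuffled[n_test:2 * n_test]
    remainder = shuffled[2 * n_test:]  # later receives the winning version
    return group_a, group_b, remainder

# Hypothetical list of 1,000 subscribers
subscribers = [f"user{i}@example.com" for i in range(1000)]
group_a, group_b, remainder = split_for_test(subscribers)
print(len(group_a), len(group_b), len(remainder))  # 100 100 800
```

Setting `test_fraction=0.5` gives the full 50-50 split recommended for time-of-day tests, with nothing left over for a follow-up send.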
Beyond timing for your promotional emails, A/B testing can be critical for determining when to send a triggered campaign — for example, determining the cadence of the emails in a win-back series, abandoned shopping cart series, or with post-purchase or other lifecycle triggers.
Creative tests involve more work, because they require the graphic design team to design more than one version and the development team to build more than one version. But the results of a creative A/B test can be dramatic.
For example, you can test lifestyle imagery in one version and specific product images in another version. Another popular creative test is to have one version of an email that is entirely image-based while another includes HTML text that can be viewed before downloading images.
If you are implementing an overall redesign, test the current layout or creative against the proposed new layout. This test, for example, could prove the case for moving to responsive design. But creative testing can also be as simple as including a button versus text or adjusting the font size of the copy.
If you are able to offer discounts, then perhaps the most important testing centers on the offer. For example, do your customers respond most favorably to a percentage off, a dollar amount off, free shipping, a tiered discount, a discount limited to specific products, a time sensitive discount, a flash sale, a mystery offer, or some other kind of deal?
When testing offers, a few additional metrics should be considered in your evaluation. In addition to gross revenue, you should also consider Average Order Value (AOV), Units Per Transaction (UPT), and margin.
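As a rough illustration of how these offer metrics relate to each other, the sketch below computes AOV, UPT, and margin for two versions of an offer. The `OfferResult` class, the offer names, and every number are invented for the example, and margin is assumed here to be simply gross revenue minus total cost; your finance team may define it differently.

```python
from dataclasses import dataclass

@dataclass
class OfferResult:
    """Totals for one version of an offer test (all figures are made up)."""
    name: str
    orders: int
    units: int
    gross_revenue: float
    cost: float  # assumption: total cost attributed to these orders

    @property
    def aov(self) -> float:
        """Average Order Value: gross revenue per order."""
        return self.gross_revenue / self.orders if self.orders else 0.0

    @property
    def upt(self) -> float:
        """Units Per Transaction: units sold per order."""
        return self.units / self.orders if self.orders else 0.0

    @property
    def margin(self) -> float:
        """Margin, assumed here to be gross revenue minus cost."""
        return self.gross_revenue - self.cost

percent_off = OfferResult("20% off", orders=200, units=500,
                          gross_revenue=10000.0, cost=7000.0)
free_shipping = OfferResult("Free shipping", orders=160, units=320,
                            gross_revenue=9600.0, cost=6000.0)

for offer in (percent_off, free_shipping):
    print(f"{offer.name}: revenue={offer.gross_revenue:.2f} "
          f"AOV={offer.aov:.2f} UPT={offer.upt:.2f} margin={offer.margin:.2f}")
```

In this invented example the percentage-off version wins on gross revenue and UPT, while the free-shipping version wins on AOV and margin, which is exactly why revenue alone may not settle an offer test.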
The simplest and most common item to A/B test is the subject line. If you hadn’t planned to test any of the aspects above in an email campaign, at minimum you can set up a subject line test.
For example, test including the offer in the subject line versus teasing the offer, or a longer subject line versus a shorter subject line. Subject lines are often given the least thought of any part of an email, but they play a key role. Even if you do not change anything else about your email program, you can improve your reach simply by focusing on better copywriting and testing for optimization.
Interpreting the results from A/B testing is sometimes the hardest part. If you’re lucky, a clear winner will emerge, and your decision will be easy. Often, however, the results for different key metrics can be conflicting.
First, determine the appropriate tagging to effectively track your campaigns outside of your email service provider. Next, determine your metrics for evaluation in advance. Some tests may only have one metric for evaluation — for example, the key measurement for a subject line test is open rate.
In determining the success of a creative A/B test, on the other hand, you’ll want to evaluate several metrics. My favorite key measurement of a creative test is click-to-open rate, but which metrics you look at will be based on the type of creative test you’re running.
For a creative test of image-only versus image and HTML text, your success metrics could be open rate (to determine if HTML text increased the number of subscribers who downloaded images), click-to-open rate, click-through rate and revenue.
For a creative test of a completely new layout (such as traditional desktop versus responsive design) your success metrics may include additional metrics such as read rate and click-through rate by device (desktop, iPad, iPhone, Android, etc.).
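The engagement metrics mentioned above are simple ratios, and a short sketch can show how they may disagree. The definitions below (rates based on delivered emails and unique opens and clicks) are common ones but are an assumption here, since email service providers differ in how they count. The function name and all figures are hypothetical.

```python
def engagement_metrics(delivered, unique_opens, unique_clicks):
    """Return open rate, click-through rate, and click-to-open rate."""
    if delivered <= 0:
        raise ValueError("delivered must be positive")
    return {
        # Share of delivered emails that were opened
        "open_rate": unique_opens / delivered,
        # Share of delivered emails that were clicked
        "click_through_rate": unique_clicks / delivered,
        # Share of openers who went on to click
        "click_to_open_rate": unique_clicks / unique_opens if unique_opens else 0.0,
    }

# Made-up results for an image-only version versus one with HTML text
image_only = engagement_metrics(delivered=10000, unique_opens=2000, unique_clicks=300)
with_html_text = engagement_metrics(delivered=10000, unique_opens=2400, unique_clicks=330)

for label, metrics in (("Image only", image_only), ("With HTML text", with_html_text)):
    summary = ", ".join(f"{name}={value:.2%}" for name, value in metrics.items())
    print(f"{label}: {summary}")
```

With these invented numbers, the HTML text version has the higher open rate and click-through rate but the lower click-to-open rate, so deciding in advance which metric matters for the test avoids an argument afterward.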
When it comes to A/B testing around offers, your measurements will focus on both email engagement metrics and revenue metrics. Some thought leaders will argue the only true measurement of success is revenue. While I agree that conversion is the ultimate goal of the email, different aspects of A/B testing have clear correlations with specific email engagement metrics. I prefer to consider the engagement and revenue metrics together to make an informed decision.
And frequently, the result of a test forms the basis for a new test.
Direct mail has long included control groups, which are excluded from receiving mailings to determine the effectiveness of sending a direct mail piece. Email control groups can be more difficult to identify. For example, on a subject line test, what would your control be?
Creating a control for testing an offer is also difficult because including an offer will generally outperform no offer. And at that point, you have essentially created two tests: offer versus no offer, and offer A versus offer B.
I once encountered a client who had excluded a portion of email subscribers from receiving any email at all, as part of a long-term study to determine the value of sending email. At the end, the test showed the tremendous value of email marketing as well as the loss in revenue from not sending to these subscribers. While it helped justify investments in the email program, it was determined that this test was more of a loss than a gain.
One final point in determining results of your A/B tests is statistical significance. Simply comparing raw counts can be misleading, because a small difference between two groups can easily arise by chance. If you do not have an analyst on your team to run the numbers, there are plenty of reliable calculators on the Web that can help you do this by evaluating the number of conversions generated from the control and test groups.
Aim for a confidence level of at least 90%. Roughly speaking, that means that if the two versions truly performed the same, a difference as large as the one you observed would show up by chance less than 10% of the time. You should only record a final decision for tests that are deemed to be statistically significant. If in doubt, run the test again to validate the results!
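For readers curious about what a significance calculator does with the conversion counts, here is a minimal sketch of one common approach, a two-sided two-proportion z-test. This is an assumption about the method, not a description of any particular calculator, and the function name and conversion counts are made up.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rate between A and B.

    Returns (z, p_value). A p_value below 0.10 corresponds to clearing
    a 90% confidence level.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that A and B perform the same
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_error == 0:
        raise ValueError("cannot test when no one (or everyone) converted")
    z = (rate_b - rate_a) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Made-up example: 200 of 5,000 convert on A, 250 of 5,000 convert on B
z, p_value = two_proportion_z_test(200, 5000, 250, 5000)
print(f"z={z:.2f}, p={p_value:.3f}, significant at 90%: {p_value < 0.10}")

# A much smaller gap (200 vs. 205 conversions) on the same group sizes
z_small, p_small = two_proportion_z_test(200, 5000, 205, 5000)
print(f"z={z_small:.2f}, p={p_small:.3f}, significant at 90%: {p_small < 0.10}")
```

The first comparison clears the 90% threshold and the second does not, even though B has more conversions than A in both cases.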
Acting On Results
A/B testing is worthless if you do not implement what you learn. Create a spreadsheet where you record the tests performed, results, and confirmed decisions. I also recommend categorizing your tests to make it easy to search or filter by type. Even if you only send one test per week, at the end of a year, you would have 52 tests. By categorizing, you can easily search later when someone asks “What offers have we tested?” or “How do different thresholds for free shipping affect our total revenue?”
A/B testing isn’t the end of the story, of course. Once you’ve mastered A/B testing or are anxious for more than one key finding per test, we can begin to talk about multivariate testing. But for now, A/B will be more than enough to get your testing program off the ground.