
Watch Out For False Positives — 3 Ways To Get Better At Testing

It's easier than ever for marketers to dive into A/B and multivariate testing, but columnist Benny Blum argues that they need to know how to design a proper test first.

Benny Blum on October 28, 2014 at 9:00 am


Everyone is testing — and you should be testing, too. If you’re not leveraging your website, CRM, and/or sales data to test and improve your business in some capacity, you’re leaving money on the table.

But, what are you testing? And do you (or should you) trust the results?

Testing software can enable A/B and multivariate testing with ease. Non-technical marketers can now quickly implement complex tests and systematically “prove” positive or negative results within a nicely designed UI.

However, one of the biggest issues for results-driven marketers who lack a statistics background is that they often don’t know how to design a proper test, which undermines both how they implement tests and how they interpret the results.

In this post, I’m going to detail three concepts which, if implemented, can help ensure any test you design is well-thought-out and more likely to deliver true results.

1. Design Of Experiments (DOE)

A Design of Experiments is a form of applied statistics used for planning, executing, and analyzing one or a series of controlled tests to understand the influence of one or more signals in a complex environment.

R.A. Fisher pioneered DOE back in the 1920s and 1930s and formally introduced, among many other concepts, the following:

  • Testing against a control (A/B testing)
  • Random assignment of participants between test(s) and control groups
  • Repeat testing to ensure accuracy and consistency of results

A well-designed and implemented experiment increases the likelihood of variance detection (good results) and reduces the likelihood of false positives or negatives. And one of the single most important components of a well-designed experiment is a large sample size.
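To make those principles concrete, here is a minimal sketch in Python (all names, traffic volumes, and conversion rates below are hypothetical, not from the article): visitors are randomly assigned to a control or a variant, outcomes are collected, and the variant is compared against the control.

```python
import random
from scipy.stats import fisher_exact

random.seed(42)

# Hypothetical visitors and conversion rates -- illustrative numbers only.
visitors = [f"visitor_{i}" for i in range(10_000)]
true_rates = {"control": 0.050, "variant": 0.055}

# Random assignment between the control and the variant, per Fisher's principle.
assignments = {v: random.choice(["control", "variant"]) for v in visitors}

# Simulate a conversion outcome for each visitor.
converted = {v: random.random() < true_rates[g] for v, g in assignments.items()}

# Build a 2x2 table of conversions vs. non-conversions per group.
table = []
for group in ("control", "variant"):
    members = [v for v, g in assignments.items() if g == group]
    wins = sum(converted[v] for v in members)
    table.append([wins, len(members) - wins])

# Test the variant against the control.
_, p_value = fisher_exact(table)
print(f"p-value: {p_value:.4f}")
```

Running the same comparison again on fresh traffic is the “repeat testing” step: a result that only appears once is much weaker evidence than one that holds up across repetitions.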

2. Statistical Power

A small sample increases the likelihood of a false positive.

Consider the hypothesis: dogs are bigger than cats. If I use a sample of one dog and one cat – for example, a Havanese and a lion – I would conclude that my hypothesis is incorrect and that cats are bigger than dogs.

But, if I used a larger sample size with a wide variety of cats and dogs, the distribution of sizes would normalize, and I’d conclude that, on average, dogs are bigger than cats. Not surprisingly, one of the most common flaws in a test is having a sample that is too small.

Fortunately, there’s a way to figure out whether your sample is big enough: statistical power is the probability that a test will detect a real difference from the control when one exists. The bigger the sample size, the greater the power.

There’s some serious math behind statistical power, but here’s a good rule of thumb: if you think your test is done, test a bit longer.
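To put a rough number on that rule of thumb, here is a sketch of the standard normal-approximation formula for a two-proportion test; the conversion rates, significance level, and power below are hypothetical examples, not figures from the article.

```python
from scipy.stats import norm

def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate visitors needed per variation to detect a shift from
    conversion rate p1 to p2 (two-sided two-proportion z-test,
    normal approximation)."""
    z_alpha = norm.ppf(1 - alpha / 2)  # critical value for the significance level
    z_beta = norm.ppf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2

# Hypothetical example: detecting a lift from a 5% to a 6% conversion rate
# with 80% power at a 5% significance level takes roughly 8,000+ visitors
# per variation -- far more than many tests are allowed to collect.
print(f"{sample_size_per_group(0.05, 0.06):,.0f} visitors per variation")
```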

Unfortunately, most testing software charges by the number of impressions monitored in a test. This naturally disincentivizes users from running longer tests, since the cost of executing the test rises as its duration extends.

If you are operating on a slim budget and need results quickly, try running an A/A test in parallel with an A/B test. If the A/A test generates the same or a similar “positive” result, you can assume there is a high likelihood of a false positive.
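Here is a hypothetical sketch of that sanity check: split traffic that sees the identical experience into two arms, run a naive significance test many times, and count how often it declares a “winner” anyway.

```python
import numpy as np
from scipy.stats import fisher_exact

rng = np.random.default_rng(0)

# Hypothetical setup: both arms get the same experience and the same 5% conversion rate.
n_per_arm, true_rate, runs = 1_000, 0.05, 500

false_positives = 0
for _ in range(runs):
    a = rng.binomial(n_per_arm, true_rate)  # conversions in arm A
    b = rng.binomial(n_per_arm, true_rate)  # conversions in the identical arm A'
    _, p = fisher_exact([[a, n_per_arm - a], [b, n_per_arm - b]])
    if p < 0.05:
        false_positives += 1

# With no real difference at all, roughly 5% of A/A runs still look "significant".
print(f"{false_positives / runs:.1%} of A/A tests produced a 'winner'")
```

If your A/B “win” looks like one of these A/A wins, treat it as a likely false positive.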

3. Regression To The Mean

Imagine an experiment where we ask ten people to flip a coin a hundred times and guess the result for each flip.

We would expect an evenly distributed set of results with an average score of 50 correct and 50 incorrect. We declare the participants with the top scores in the experiment to be the winners and ask them to perform the experiment again.

Chances are their results in the second experiment will, again, be evenly distributed with an average of 50 correct and 50 incorrect. Did the winners of the first round suddenly get worse at guessing?

No. They were outliers in the first round and when challenged again they naturally regressed toward the average score. This phenomenon is very apparent in online tests.
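A quick simulation of the coin-flip thought experiment makes the regression visible; the pool is enlarged to 1,000 hypothetical guessers so the effect is easy to see, but the mechanics are the same as in the example above.

```python
import numpy as np

rng = np.random.default_rng(7)

# 1,000 hypothetical guessers, 100 coin flips each; every guess is pure chance.
guessers, flips = 1_000, 100
round_one = rng.binomial(flips, 0.5, size=guessers)

# Declare the top 10% of round-one scorers the "winners".
cutoff = np.percentile(round_one, 90)
winners = round_one >= cutoff

# The same winners guess again in round two.
round_two = rng.binomial(flips, 0.5, size=int(winners.sum()))

print(f"winners' round-one average: {round_one[winners].mean():.1f} / 100")
print(f"winners' round-two average: {round_two.mean():.1f} / 100")  # back near 50
```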

More often than not, a test showcases a strong initial result due to a novelty effect rather than a better user experience. If you let the test extend a bit longer, chances are you’ll see the results regress to control.

Conclusion

User behavior is difficult to change, and amazing results in a short period of time are more often than not false positives.

This is not intended to undermine the novelty effect of making a change – constantly switching things up can make consumers pay more attention. That said, it takes a lot of data to make a test statistically significant, so chances are you’re working with an insignificant dataset.

If you embrace that reality, you can spend a little more time strategically designing your experiments to maximize the impact of your hypothesis validation and testing.


Opinions expressed in this article are those of the guest author and not necessarily MarTech.



About The Author

Benny Blum
Benny Blum is the Vice President of Performance Marketing & Analytics at sellpoints, the leading online sales orchestration platform, and is based in Emeryville, CA.
