This post is a part of Formstack’s Back to School Campaign. From August 20 to September 21, we’ll be sharing all things relating to higher ed, from how to survive your first semester of college to what cloud computing tools you should use as an educator. When it comes to all things campus, we’ve got you covered. Check out our campaign page for more information and enter to win a free semester of textbooks!
Looking back at college, I had a lot of memorable classes. Statistics wasn’t one of them. I took STAT 350 my freshman year at Purdue University because, frankly, I had to. In fact, Statistics was my least favorite mathematical subject. I don’t know if I had a bad teacher in high school or if I just couldn’t grasp the concept of probability (something about grabbing colored socks out of a drawer…). Either way, I was not looking forward to the course.
Luckily, I made it out of Statistics with a decent grade. And once I had my required math courses out of the way, I had more time for career-building courses, like “Mafia in the Movies” or “Wine Appreciation.” But as it turns out, Statistics wasn’t through with me. When I took on the role of Conversion Optimization Specialist at Formstack, I also took on a bit of Statistics. So, I decided to go back to the STAT 350 course description and see exactly what I learned there and what I’m using now. The similarities were surprising.
1. Data Analysis
Describing distributions & relationships (graphics, center and spread, regression and correlation, influence).
In class, I remember doing problem after problem where I analyzed charts and tried to infer what the chart was actually measuring. Sometimes the charts were clear, sometimes purposefully unclear. For conversion optimization, things are typically straightforward. Conversion Rate Optimization (CRO) charts are generated by plotting the CR (rate = conversions / visitors) of each variation in a linear fashion over time.
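To make the rate formula concrete, here is a minimal sketch of how the points on such a chart could be computed. It assumes the chart plots a running (cumulative) rate per variation, which is a guess on my part; the function name and the daily counts are made up for illustration.

```python
from itertools import accumulate


def cumulative_conversion_rates(daily_visitors, daily_conversions):
    """Return the running conversion rate (conversions / visitors) after each day."""
    total_visitors = accumulate(daily_visitors)
    total_conversions = accumulate(daily_conversions)
    # Guard against a day-one total of zero visitors.
    return [c / v if v else 0.0 for v, c in zip(total_visitors, total_conversions)]


# Hypothetical daily counts for two variations over three days.
red = cumulative_conversion_rates([200, 220, 210], [24, 25, 27])
blue = cumulative_conversion_rates([205, 215, 208], [18, 20, 19])
```

Each list holds one point per day, so plotting `red` and `blue` against the day number gives one line per variation.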
In the example above, you can tell the Red variation is winning because it has a higher CR. But not only is it winning, it has been leading the entire experiment. The longer a variation sustains its lead, the more likely it is that the “winning” variation truly converts better.
This graph shows the advantage of having multiple variations with shared values. If the rates continue on their paths, perhaps you can infer something about the Green and Red lines versus the Blue and Yellow lines. What do those pairs have in common? Maybe there’s an element in there that is pushing their CR higher or lower.
2. Data Production
Sampling design; Design of experiments.
When creating an A/B test, much consideration should be given to the make-up of the sample (the test group) and to the experiment design. From this section, I remember many word problems that described what a test group wanted to measure and what population or “samples” they had access to. In much the same way, each experiment you A/B test should be thought out to make sure you’re measuring the right goals and targeting the correct group.
Targeting (Sampling Design)
By default, most A/B tests target every visitor to your site, but sometimes this isn’t the best solution. For example, some websites that get large volumes of traffic have sizable international segments. Or maybe a big percentage comes from mobile devices. But from where do your conversions come? Do visitors from Turkey buy your product? Do mobile visitors sign up for trials at the same rate? If you are basing your tests on a metric that a large portion of your “sample” doesn’t utilize, maybe you shouldn’t be sampling them in the first place.
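One way to answer those questions is to break the conversion rate down by segment before deciding whom to target. The sketch below assumes you can export each visit as a (segment, converted) pair; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def conversion_rate_by_segment(visits):
    """visits: iterable of (segment, converted) pairs. Returns {segment: rate}."""
    visitors = defaultdict(int)
    conversions = defaultdict(int)
    for segment, converted in visits:
        visitors[segment] += 1
        conversions[segment] += int(converted)
    return {segment: conversions[segment] / visitors[segment] for segment in visitors}


# Hypothetical export: device type and whether the visit converted.
visits = [
    ("desktop", True), ("desktop", False), ("desktop", True),
    ("mobile", False), ("mobile", False), ("mobile", True),
]
rates = conversion_rate_by_segment(visits)
```

A segment whose rate is far below the others is a candidate for exclusion from the test, or for a test of its own.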
Designing the Experiment
A/B testing software typically charges rates based on visitor traffic. Also, tests take time to become statistically significant, and your time is valuable. Try to combine many variations into one test to maximize your learning. I try to keep tests within 4 to 8 variations, depending on traffic and how long it would take to reach statistical significance. Have a lot of elements to test? Try doing a multivariate test to learn which are the best.
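To get a feel for how the number of variations affects test length, here is a rough sketch. It uses the standard sample-size approximation for comparing two proportions, which is my choice and not necessarily what any A/B testing tool does. It assumes traffic is split evenly across variations and ignores any correction for comparing many variations at once. All names and numbers are illustrative.

```python
import math
from statistics import NormalDist


def visitors_per_variation(baseline_rate, expected_rate, alpha=0.05, power=0.8):
    """Approximate visitors needed per variation to detect the given lift."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate
    return math.ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


def estimated_days(n_variations, daily_visitors, baseline_rate, expected_rate):
    """Days needed if daily traffic is split evenly across all variations."""
    per_variation = visitors_per_variation(baseline_rate, expected_rate)
    return math.ceil(per_variation * n_variations / daily_visitors)


# Hypothetical site: 2,000 visitors a day, hoping to lift a 10% CR to 12%.
days_for_4 = estimated_days(4, 2000, 0.10, 0.12)
days_for_8 = estimated_days(8, 2000, 0.10, 0.12)
```

Doubling the number of variations roughly doubles the estimated run time, which is why the number of variations has to be weighed against traffic.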
3. Probability distributions and simulation
The idea of a sampling distribution; The idea of probability and probability distributions; Simulating discrete and continuous distributions; Sampling distribution of sample means (law of large numbers, central limit theorem).
Admittedly, I don’t remember most of the material in this section. If you’d like to learn more, Wikipedia has riveting explanations of sampling distributions and the central limit theorem. But one thing did jump out at me, and that was the Law of Large Numbers.
As the Wikipedia article states, the more experiments you run, the closer the average of your results should get to an “expected value” for future experiments (in this case, an experiment is one visitor seeing one test variation). So according to the Law, running an A/B test longer delivers more reliable results. But you can’t run a test indefinitely, so when do you call it? Well, it depends.
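A quick simulation shows the Law at work. This sketch assumes every visitor converts independently with the same fixed probability, which real traffic only approximates; the function name, the 10% rate, and the seed are arbitrary choices for illustration.

```python
import random


def running_conversion_rate(true_rate, n_visitors, seed=0):
    """Simulate visitors one at a time and record the observed rate after each."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    conversions = 0
    rates = []
    for visitor in range(1, n_visitors + 1):
        if rng.random() < true_rate:
            conversions += 1
        rates.append(conversions / visitor)
    return rates


# Hypothetical variation whose true conversion rate is 10%.
rates = running_conversion_rate(true_rate=0.10, n_visitors=20000)
```

Early values in `rates` jump around, while later values settle near the true rate, which is the behavior the Law describes.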
Some A/B testing software recommends that you wait until you have at least 100 conversions per variation. But on high-traffic sites, this can happen in a day or two. If the experiment ran on Monday and Tuesday, how do you know the same winner will perform well on Friday and Saturday? Typically, I like to run an experiment for a minimum of one week. Running for two weeks gives you a second set of data for each day of the week, so you can compare the same weekday across weeks.
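These two rules of thumb can be combined into a simple check. This is only a sketch of the guidelines above, not a statistical significance test; the function name and default thresholds are assumptions taken from the numbers in this post.

```python
def ready_to_call(conversions_per_variation, days_run, min_conversions=100, min_days=7):
    """True once every variation has enough conversions and the test ran long enough."""
    enough_conversions = all(
        count >= min_conversions for count in conversions_per_variation.values()
    )
    return enough_conversions and days_run >= min_days


# Hypothetical test: plenty of conversions, but only two days of data.
too_early = ready_to_call({"red": 150, "blue": 120}, days_run=2)
after_a_week = ready_to_call({"red": 150, "blue": 120}, days_run=7)
```

Here the conversion threshold is met on day two, but the check still holds the test open until a full week has passed.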
In the end, I may not use the exact terminology that I learned in school (I’ve yet to work “standard deviation” or “regression coefficients” into a CRO conversation, or any conversation for that matter), but the fact remains that statistics is a part of my everyday professional life at Formstack. And for a company that prides itself on data collection, I think I’m okay with that.
Image cred: Nieman Journalism Lab