Choosing Multivariate or A/B Testing
Optimizing your website is important. The best visitor acquisition campaign in the world doesn’t matter if consumers aren’t converting once they visit your website. Likewise, implementing every tweak that you think could optimize conversions doesn’t matter if you don’t know what’s working. That’s where testing comes in. This blog walks you through the fundamentals of multivariate and A/B testing.
A/B testing is a common way to optimize website performance. It generally consists of forming a hypothesis about what changes should be made, and then developing a Test version of your website with those changes implemented. You then randomly split your visitors evenly between the Test and Control versions of your website. The Control version is simply the unedited, current version of your website. Test versions are often called “recipes,” and you’re not actually limited to only two recipes when A/B testing. You can test any number of recipes, assuming you have enough visitors to reach a statistically significant result. As your number of recipes increases, so will the sample size needed to complete the test.
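To make the link between recipe count and sample size concrete, here is a rough sketch using the standard normal-approximation formula for comparing two conversion rates. The function names, the 5% and 6% conversion rates, and the default significance and power settings are assumptions chosen for illustration, and the sketch ignores any correction for comparing several recipes at once.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_recipe(baseline_rate, expected_rate, alpha=0.05, power=0.80):
    """Approximate visitors needed in each recipe to detect the given lift."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_power = NormalDist().inv_cdf(power)          # desired statistical power
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    lift = expected_rate - baseline_rate
    return ceil((z_alpha + z_power) ** 2 * variance / lift ** 2)


def total_visitors(recipe_count, baseline_rate, expected_rate):
    """Total traffic for the whole test; recipe_count includes the Control."""
    return recipe_count * visitors_per_recipe(baseline_rate, expected_rate)


if __name__ == "__main__":
    # Assumed rates: 5% baseline conversion, hoping to detect a lift to 6%.
    for recipes in (2, 3, 5):
        print(recipes, "recipes:", total_visitors(recipes, 0.05, 0.06), "visitors")
```

Because every recipe needs its own share of traffic, the total grows at least in proportion to the number of recipes, and smaller expected lifts push the requirement up sharply.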
Academically speaking, an A/B test makes only one change at a time, because A/B testing looks only at the effect on the dependent variable and does not take interactions between independent variables into account. In the real world, companies often don’t have time to test this way, which is why it can be difficult for organizations to decide between A/B testing and multivariate testing.
Sometimes, based on your hypothesis, the testing methods can be combined. For instance, if you’re considering a major site redesign, it would be prudent to run an A/B test to ensure the redesign, as a whole, performs at least as well as your current site, and then use multivariate testing to optimize the redesign.
Multivariate testing is similar to A/B testing, but is slightly more complex. It changes several independent variables at once and investigates how they, and the interactions between them, affect the dependent variable.
Multivariate testing can be very complicated from a statistical standpoint because in real-life testing, many of your “independent” variables are actually collinear, meaning the independent variables are likely correlated. Collinearity leads to skewed results when attempting to deduce the effect of individual independent variables.
A common real-life example is customer demographics. Let’s say you’re a financial services provider trying to test how various demographics respond to your new, premium security campaign. You throw age into the mix because you think young adults will be less risk averse, and then you throw income into the mix because people with higher incomes have more to protect. Your results are likely skewed because there’s a correlation between age and income. You may overestimate the effect of one while underestimating the effect of the other. A skilled statistician will be able to help you control for this.
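One quick way to spot this problem before running a test is to check how strongly two “independent” variables are correlated. The sketch below computes the Pearson correlation and the variance inflation factor (VIF) for the two-predictor case. The age and income values are made up purely for illustration, and the function names are my own.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def variance_inflation_factor(r):
    """VIF for a model with exactly two predictors correlated at r."""
    return 1 / (1 - r ** 2)


# Made-up values for illustration only (age in years, income in $1,000s).
ages = [22, 28, 35, 41, 50, 58]
incomes = [30, 42, 55, 61, 80, 88]

r = pearson_r(ages, incomes)
vif = variance_inflation_factor(r)
print(f"correlation = {r:.2f}, VIF = {vif:.1f}")
```

A VIF well above 5 or 10 is often treated as a warning sign that the individual effects of the two variables cannot be cleanly separated, although the exact cutoff is a rule of thumb rather than a hard rule.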
Multivariate Experimental Design
When designing a multivariate experiment, you will first have to choose whether to use a Full-Factorial or Fractional-Factorial design.
Factors refer to the elements you want to test – perhaps on your eCommerce site, you want to test the product photo, product detail copy and prominence of the reviews. If each of these factors had three potential choices, you would be looking at a total of 27 combinations (3x3x3 = 27).
If you chose a Full-Factorial design, you would test all 27 independently. The Fractional-Factorial design would only require that you test a fraction of the 27 total options, choosing combinations purposefully to ensure you get the most complete information.
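The 27 combinations in a Full-Factorial design can be listed mechanically. Here is a small sketch of that enumeration; the factor names follow the eCommerce example above, while the level labels (photo_a, copy_b and so on) are placeholders invented for illustration.

```python
from itertools import product

# Three factors with three hypothetical choices each.
factors = {
    "product_photo": ["photo_a", "photo_b", "photo_c"],
    "detail_copy": ["copy_a", "copy_b", "copy_c"],
    "review_prominence": ["low", "medium", "high"],
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of recipes."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


recipes = full_factorial(factors)
print(len(recipes))  # 27
print(recipes[0])
```

Adding a fourth factor with three choices would triple the list to 81 recipes, which is the growth that makes Fractional-Factorial designs attractive.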
There are trade-offs associated with each design. Full-Factorial offers the most reliable results because it tests every combination. However, dividing your traffic among such a large number of recipes means a longer test time to achieve statistically significant results, which is why so many organizations ultimately choose to deploy a Fractional-Factorial design for many of their experiments.
If you have done much research into the rabbit hole of Fractional-Factorial design in web analytics, you have probably come across the Taguchi method, also known as Robust Design. Genichi Taguchi was a Japanese statistician and engineer whose namesake statistical methods are widely credited with revolutionizing manufacturing. The Taguchi method is often described in Six Sigma teachings as a way to ensure rapid, efficient testing in the design phases. His methods focused on identifying what he described as “noise” factors, and determining which control factors could reduce the influence of the noise. This design technique, like some other Fractional-Factorial designs, uses orthogonal arrays. To keep the conversation in layman’s terms, the use of orthogonal arrays allows for each factor to be assessed independently.
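To show what an orthogonal array buys you, the sketch below builds a nine-run design for three factors with three levels each and checks that every pair of factors sees each combination of levels equally often. The construction is a generic Latin-square one chosen for illustration, not necessarily the array any particular testing tool or Taguchi table uses, and the function names are my own.

```python
from itertools import combinations, product


def nine_run_array():
    """Nine runs for three 3-level factors, built from a Latin square."""
    return [(a, b, (a + b) % 3) for a, b in product(range(3), repeat=2)]


def is_orthogonal(runs, levels=3):
    """True if every pair of columns contains each level pair equally often."""
    n_columns = len(runs[0])
    expected = len(runs) // (levels * levels)
    for i, j in combinations(range(n_columns), 2):
        counts = {}
        for run in runs:
            pair = (run[i], run[j])
            counts[pair] = counts.get(pair, 0) + 1
        if len(counts) != levels * levels or set(counts.values()) != {expected}:
            return False
    return True


runs = nine_run_array()
print(len(runs), is_orthogonal(runs))
for run in runs:
    print(run)
```

Nine recipes stand in for the full 27, which is where the time savings come from. The price is that a design this small estimates each factor's own effect but cannot separate those effects from interactions between factors.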
There are a variety of considerations that go into determining which multivariate design method is right for your business objectives; example considerations include: website traffic, the number of variables being tested and the likely interaction between the control variables. Don’t simply choose a method because a web analytics firm touts it as the best for everything. Each business has a specific set of circumstances to work with: budget lines, traffic counts, project timelines, risk factors, etc. A good web analyst can explain the benefits of each method to you and help you weigh the pros and cons for your specific goals, objectives and situation.
Multivariate Analysis Techniques
If you have decided on multivariate analysis, you actually have quite a few techniques to consider. We will cover some of the most popular techniques here, again, attempting to keep the conversation at a managerial level.
Multiple Regression Analysis
This examines the effect of multiple independent variables on one dependent variable by determining their relationship. In many digital analytics use cases, the dependent variable is the KPI we’re trying to optimize, something like revenue per visit or conversion rate. The independent variables are all the moving parts that ultimately affect that KPI. Determining the relationship generally helps to generate insights and optimize business efforts.
The relationship is shown through coefficients. In the example below, the coefficient associated with gender is .45, the coefficient associated with age is 2.1, and the coefficient associated with income is .9. These coefficients aren’t percentages, and they’re most useful when compared to one another, provided the variables are measured on comparable scales. Here, we can see that age is nearly five times as important as gender in determining revenue per visit.
Example: .45[gender] + 2.1[age] + .9[income] = revenue/visit
Using this data, the retailer could run additional statistical analysis to gather more information about the importance of age, answering questions such as, “Which age is most valuable?” or “At what age do these effects diminish?”
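For readers who like to see the arithmetic, here is a minimal sketch that plugs the example coefficients into the equation and compares them. It assumes the inputs are numerically encoded and standardized so the coefficients are comparable, and that the model has no intercept, as in the example equation. The visitor values and function names are hypothetical.

```python
# Coefficients taken from the example equation in the text.
coefficients = {"gender": 0.45, "age": 2.1, "income": 0.9}


def predicted_revenue_per_visit(visitor):
    """Weighted sum of the visitor's (standardized) attribute values."""
    return sum(coefficients[name] * visitor[name] for name in coefficients)


def relative_importance(name, baseline):
    """How many times larger one coefficient is than another."""
    return abs(coefficients[name]) / abs(coefficients[baseline])


# Hypothetical visitor, with values assumed to be standardized scores.
visitor = {"gender": 1.0, "age": 0.5, "income": -0.2}

print(round(predicted_revenue_per_visit(visitor), 2))
print(round(relative_importance("age", "gender"), 2))
```

The ratio of the age and gender coefficients is where the “nearly five times” comparison comes from. If the variables were left in their raw units (years, dollars), the ratio would mostly reflect those units rather than importance.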
Factor analysis does not utilize a dependent variable. Rather, it looks at the underlying connections among a multitude of variables in order to group them.
For example, a home décor retailer that looks at thousands of products within categories such as “curtains,” “rugs,” “furniture,” etc. may benefit from factor analysis that looks at underlying sales factors such as color, brand and creative.
Cluster analysis and factor analysis are often confused. A simple rule of thumb is that factor analysis looks at your product or service, while cluster analysis looks at your consumer. If you sell small business supplies, you may broadly know that your target is small business owners, but cluster analysis will allow you to see subcategories: for instance, office managers who are mainly concerned with ease of shopping and reordering, business owners who are mainly concerned with quality and accountants who are mainly concerned with price.
Finding these clusters may help you decide whom to target and how. Clusters should be large, reachable, measurable and unique.
Multidimensional scaling is similar to Factor analysis in that it tests the “distance” between brands and attributes. If you’re considering multidimensional scaling, you will likely want to analyze survey data or customer review data. For instance, a cosmetics retailer may be able to use its online reviews to discern how various beauty brands are viewed by its consumers. Multiple variables are condensed along simplified dimensions much like factor analysis.
The final result is often a perceptual map that charts brands along two dimensions. Although it’s statistically possible to utilize more dimensions, the visualization is tricky, and you’ll likely get more bang for your buck analyzing two dimensions at a time.
In the accompanying example, which was created for illustrative purposes only and uses absolutely no data, you can see that the two most important dimensions chosen to represent perceptions of lipstick brands were “creaminess” and “long-lasting wear.” Seeing consumers’ perceptions represented visually and benchmarked against competitors can be extremely telling, but you must be open to believing the results. Your scientists may be able to prove your lipstick is creamier, but if your customers don’t perceive that to be true, you have to either accept it and work around where you’re positioned, or rethink your marketing campaigns to close the gap between where you are in the consumer’s mind and where you want to be.
Consult with a statistician to ensure you don’t waste resources on an experimental design that won’t answer your business questions. Generally speaking, if you want to explore the deeper interactions within your test recipe, design a multivariate experiment. However, if you’re looking to make hard and fast decisions, choose A/B testing.