In the dynamic landscape of data-driven decision-making, experimentation stands as a cornerstone for validating hypotheses and optimizing product performance. While traditional A/B testing has long been the go-to method, sophisticated data leaders recognize the limitations it imposes, especially as products grow in complexity and user interactions become more intertwined. Enter Clustered Experiments—a powerful approach designed to enhance your experimentation strategy and yield more reliable insights.
At its core, a Clustered Experiment deviates from the standard A/B test by randomizing groups (or clusters) of analysis units rather than individual units. This methodology is particularly beneficial in scenarios where individual randomization leads to interference effects, compromising the integrity of the experiment and the consistency of user experiences.
Standard A/B Test: Randomizes individual users into different variants and analyzes metrics on a per-user basis. For instance, splitting users to test a new feature and measuring the conversion rate per user.
Clustered Experiment: Randomizes groups of users (clusters), such as companies, geographical regions, or user segments, and analyzes metrics at both the cluster level and the user level. Because everyone in a cluster sees the same variant, interference between users stays within a single variant and each cluster gets a consistent experience.
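A Clustered Experiment is more appropriate than a traditional A/B test when users in the same group can influence one another, as with colleagues at the same company, or when one user generates many analysis units that should all see the same variant, as with orders placed by a single user.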
One of the primary challenges with Clustered Experiments is the lack of independence among observations within the same cluster. Traditional statistical methods assume independence, leading to underestimated variances and overconfident results when applied naively to clustered data.
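To make the overconfidence concrete, here is a minimal simulation sketch. The data-generating process (a shared random effect per cluster plus per-user noise), the cluster counts, and the function names are all assumptions chosen for illustration. It compares a standard error that treats every user as independent with one computed from cluster means, which is valid in this toy setup because all clusters have the same size.

```python
import random
import statistics

def simulate_clustered_metric(num_clusters=50, cluster_size=20, seed=7):
    """Generate one metric value per user, grouped by cluster.

    Every user in a cluster shares the same random cluster effect, so
    observations within a cluster are positively correlated.
    """
    rng = random.Random(seed)
    clusters = []
    for _ in range(num_clusters):
        cluster_effect = rng.gauss(0.0, 1.0)  # shared by the whole cluster
        clusters.append(
            [cluster_effect + rng.gauss(0.0, 0.5) for _ in range(cluster_size)]
        )
    return clusters

def naive_standard_error(clusters):
    """Standard error of the mean that wrongly treats users as independent."""
    values = [value for cluster in clusters for value in cluster]
    return statistics.stdev(values) / len(values) ** 0.5

def cluster_level_standard_error(clusters):
    """Standard error of the mean based on one mean per cluster.

    With equal cluster sizes the overall mean equals the mean of the
    cluster means, so this is a valid standard error for it.
    """
    cluster_means = [statistics.fmean(cluster) for cluster in clusters]
    return statistics.stdev(cluster_means) / len(cluster_means) ** 0.5

clusters = simulate_clustered_metric()
naive_se = naive_standard_error(clusters)
cluster_se = cluster_level_standard_error(clusters)
print(f"naive SE:         {naive_se:.4f}")
print(f"cluster-level SE: {cluster_se:.4f}")
```

In this setup the naive figure comes out several times smaller than the cluster-level one, which is exactly the kind of gap that produces overconfident results.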
Applying the delta method to the ratio of cluster-aggregated metrics to cluster size offers a robust solution. For details, please see Deng et al. and Chapter 18 of the authoritative text Trustworthy Online Controlled Experiments. By expressing complex metrics as ratios of simple metrics normalized at the cluster level, we can inherently account for the clustered structure of the data. This gives results that are mathematically equivalent to the common Cluster Robust Standard Errors (CRSE) approach, but it is far more scalable computationally.
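As a rough sketch of what this looks like in practice, the function below computes a ratio metric and its delta-method standard error from per-cluster aggregates. The function name and inputs are hypothetical, and this is the textbook first-order delta-method approximation for a ratio of means, not necessarily the exact computation any particular platform runs.

```python
import math

def delta_method_ratio(cluster_totals, cluster_sizes):
    """Ratio of sums and its delta-method standard error.

    cluster_totals[i]: metric summed over cluster i (e.g. conversions)
    cluster_sizes[i]:  number of analysis units in cluster i (e.g. users)

    With S and N the per-cluster totals and sizes, R = mean(S) / mean(N):
        Var(R) ~= (Var(S) - 2*R*Cov(S, N) + R^2 * Var(N)) / (k * mean(N)^2)
    where k is the number of clusters.
    """
    k = len(cluster_totals)
    if k != len(cluster_sizes) or k < 2:
        raise ValueError("need at least two clusters with matching lengths")

    mean_total = sum(cluster_totals) / k
    mean_size = sum(cluster_sizes) / k
    ratio = mean_total / mean_size

    # Sample variances and covariance across clusters (k - 1 denominator).
    var_total = sum((s - mean_total) ** 2 for s in cluster_totals) / (k - 1)
    var_size = sum((n - mean_size) ** 2 for n in cluster_sizes) / (k - 1)
    cov = sum(
        (s - mean_total) * (n - mean_size)
        for s, n in zip(cluster_totals, cluster_sizes)
    ) / (k - 1)

    variance = (var_total - 2 * ratio * cov + ratio ** 2 * var_size) / (
        k * mean_size ** 2
    )
    # Guard against tiny negative values caused by floating-point rounding.
    return ratio, math.sqrt(max(variance, 0.0))

# Illustrative numbers only: four clusters with different sizes.
ratio, standard_error = delta_method_ratio([30.0, 12.0, 45.0, 8.0], [3, 1, 4, 2])
print(ratio, standard_error)
```

Only one row per cluster is needed as input, so the heavy lifting is a simple group-by aggregation rather than a regression over every observation.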
Key Advantages:
1. Measuring Average Order Value (AOV) with User-Level Randomization
Scenario: A business aims to assess the impact of a new pricing strategy on AOV, where AOV is calculated per order. Users can place multiple orders, meaning the randomization unit (user) differs from the analysis unit (order).
Challenge: Randomizing at the order level could expose a single user to multiple variants, leading to inconsistent experiences and behavioral biases that persist across orders.
Solution: Randomize at the user level while analyzing AOV at the order level. By treating each user as a cluster of orders, you ensure that all orders from a single user adhere to the same treatment, preserving consistency.
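A small sketch of the aggregation step may help here. The order records, user IDs, and function names below are made up for illustration. Each user's orders are collapsed into one (revenue, order count) pair, and AOV is then the ratio of the two sums, which is not the same as averaging each user's own average.

```python
from collections import defaultdict

# Hypothetical order rows: (user_id, order_value). Values are illustrative.
orders = [
    ("user_a", 10.0),
    ("user_a", 20.0),
    ("user_a", 30.0),
    ("user_b", 100.0),
]

def aggregate_orders_by_user(order_rows):
    """Collapse order rows into one (revenue_total, order_count) pair per user."""
    totals = defaultdict(lambda: [0.0, 0])
    for user_id, order_value in order_rows:
        totals[user_id][0] += order_value
        totals[user_id][1] += 1
    return {user_id: (revenue, count) for user_id, (revenue, count) in totals.items()}

def average_order_value(per_user):
    """Order-level AOV: total revenue divided by total number of orders."""
    total_revenue = sum(revenue for revenue, _ in per_user.values())
    total_orders = sum(count for _, count in per_user.values())
    return total_revenue / total_orders

per_user = aggregate_orders_by_user(orders)
aov = average_order_value(per_user)
mean_of_user_averages = sum(r / c for r, c in per_user.values()) / len(per_user)

print(aov)                    # 40.0 (160 revenue / 4 orders)
print(mean_of_user_averages)  # 60.0 (average of 20 and 100), a different metric
```

The per-user pairs are exactly the cluster-level numerators and denominators that a delta-method variance calculation takes as input.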
2. User-Level Conversion Rate in Company-Randomized Experiments
Scenario: A SaaS company offers a new feature and wants to measure its effect on user conversion rates. Each company consists of many users, and randomizing at the user level could lead to cross-user contamination.
Challenge: If individual users within the same company are randomized, treated users might influence control users, skewing the conversion metrics.
Solution: Randomize at the company level and analyze the conversion rate at the user level.
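To sketch how the analysis side of this might look, the example below compares user-level conversion rates between two arms using one row per company. The company counts are invented, the helper names are hypothetical, and the standard error uses the linearized-residual form of the delta method with a simple z-statistic, offered as an illustration and not as a description of any specific platform's statistics engine.

```python
import math

def ratio_and_se(numerators, denominators):
    """Ratio of sums with a delta-method SE from per-cluster residuals."""
    k = len(numerators)
    if k != len(denominators) or k < 2:
        raise ValueError("need at least two clusters with matching lengths")
    total_den = sum(denominators)
    ratio = sum(numerators) / total_den
    mean_den = total_den / k
    # Residual of each cluster around the pooled ratio; these sum to zero.
    residuals = [s - ratio * n for s, n in zip(numerators, denominators)]
    residual_var = sum(e * e for e in residuals) / (k - 1)
    return ratio, math.sqrt(residual_var / (k * mean_den ** 2))

def compare_arms(control, treatment):
    """Difference in conversion rate (treatment - control) with its SE and z."""
    c_rate, c_se = ratio_and_se(*zip(*control))
    t_rate, t_se = ratio_and_se(*zip(*treatment))
    diff = t_rate - c_rate
    # Companies are randomized independently, so the arm variances add.
    se_diff = math.sqrt(c_se ** 2 + t_se ** 2)
    return {"control": c_rate, "treatment": t_rate, "diff": diff,
            "se": se_diff, "z": diff / se_diff}

# Hypothetical per-company rows: (converted users, total users).
control_companies = [(2, 10), (5, 40), (1, 8), (3, 25), (4, 30)]
treatment_companies = [(4, 12), (9, 45), (2, 9), (6, 28), (5, 22)]

result = compare_arms(control_companies, treatment_companies)
print(result)
```

The number of clusters is the number of companies, not the number of users, so an experiment with few large companies will have wider intervals than the raw user count suggests.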
Eppo simplifies the planning, launching, and analysis of clustered experiments. Using Eppo’s SDK, you can pass in your cluster identifier as the primary ID and the analysis unit as an attribute. For example, in the B2B scenario mentioned above, the company ID is the primary ID and the user ID is passed as an attribute.
This ensures that all users within the same company experience the same variant, while Eppo tracks exactly which users were exposed to the new feature and when.
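To illustrate the idea independently of any SDK, here is a minimal sketch of cluster-keyed assignment. The `assign_variant` function, the experiment key, and the user records are hypothetical stand-ins, not Eppo's API or implementation. The point is only that hashing the company ID, and not the user ID, gives every user in a company the same variant while the user ID is still recorded for analysis.

```python
import hashlib

def assign_variant(experiment_key, cluster_id, variants=("control", "treatment")):
    """Deterministically map a cluster to a variant by hashing its ID.

    Hypothetical stand-in for an SDK assignment call: the same
    (experiment_key, cluster_id) pair always yields the same variant.
    """
    digest = hashlib.sha256(f"{experiment_key}:{cluster_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]

# Illustrative users; company_id is the cluster, user_id the analysis unit.
users = [
    {"user_id": "u1", "company_id": "acme"},
    {"user_id": "u2", "company_id": "acme"},
    {"user_id": "u3", "company_id": "globex"},
    {"user_id": "u4", "company_id": "globex"},
    {"user_id": "u5", "company_id": "initech"},
]

exposures = []
for user in users:
    # Randomize on the company, but log the user so user-level metrics can be joined later.
    variant = assign_variant("new-feature-test", user["company_id"])
    exposures.append(
        {"company_id": user["company_id"], "user_id": user["user_id"], "variant": variant}
    )

for row in exposures:
    print(row)
```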
When configuring the analysis, Eppo allows you to evaluate both company-level and user-level metrics seamlessly. The user experience remains unchanged, but behind the scenes, Eppo applies cluster-robust statistics to deliver reliable insights.
Clustered Experiments represent a sophisticated evolution in the realm of controlled experimentation, addressing the nuanced challenges of interference and multi-level data structures. By leveraging clustered experiments, experimentation leaders can conduct more accurate and reliable experiments, ensuring that the insights derived are both actionable and trustworthy. Add them as a powerful tool in your experimentation arsenal.