Choosing a Stats Engine
A vs B supports three statistical engines: Bayesian, Frequentist, and Sequential. All three answer the same underlying question (is variation B better than Control?) but use different math and offer different guarantees. This page helps you pick the right one.
TL;DR
- Default to Bayesian. It's the easiest to read, forgiving of peeks, and works well for most web experiments.
- Pick Frequentist if your team speaks in p-values and confidence intervals, or if you need to match a fixed external standard.
- Pick Sequential if you want to peek freely and stop early without inflating false-positive rates.
Bayesian
Reports the probability that a variation beats Control, plus a credible interval around the estimated lift. No p-values, no significance thresholds to wrestle with: just a number you can act on.
When to pick: you want intuitive, directly readable results. You want to be able to glance at the dashboard mid-experiment without invalidating the math. You don't have a hard regulatory requirement for p-values.
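As a rough illustration of the two numbers described above, the sketch below estimates the probability that B beats Control and a 95% credible interval for the relative lift. It assumes a Beta-Binomial model with a uniform Beta(1, 1) prior and Monte Carlo sampling. The page does not say which prior or model A vs B uses, so the function name, the prior, and the sampling approach are all illustrative assumptions.

```python
import random


def bayesian_summary(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(B > Control) and a 95% credible interval for relative lift.

    Assumes a Beta(1, 1) prior on each conversion rate (an assumption, not
    necessarily the prior A vs B uses).
    """
    rng = random.Random(seed)
    wins = 0
    lifts = []
    for _ in range(draws):
        # Posterior for a binomial rate under a Beta(1, 1) prior:
        # Beta(1 + conversions, 1 + non-conversions)
        rate_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
        lifts.append(rate_b / rate_a - 1)
    lifts.sort()
    lower = lifts[int(0.025 * draws)]
    upper = lifts[int(0.975 * draws) - 1]
    return wins / draws, (lower, upper)


# Hypothetical data: Control converts 100/1000, variation B converts 150/1000.
prob_b_wins, lift_interval = bayesian_summary(100, 1000, 150, 1000)
print(f"P(B beats Control) = {prob_b_wins:.3f}")
print(f"95% credible interval for relative lift = {lift_interval}")
```

The probability can be compared directly against a decision threshold such as 0.95, which is what makes this output easy to read at a glance.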
Frequentist
Reports a p-value and a confidence interval, with an explicit significance flag at the alpha level you set (default 0.05). The classical approach taught in statistics textbooks.
When to pick: your team or stakeholders expect p-values. You need to replicate a specific textbook analysis. You're willing to commit to a pre-declared sample size and wait until it's reached.
Frequentist tests are only valid if you commit to a sample size up front. Checking your p-value every day and stopping the moment it drops below 0.05 inflates your real false-positive rate from 5% to as much as 20%. Use the sample-size calculator below to plan up front, or pick the Sequential engine if you need to peek safely.
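The effect of peeking can be checked with a small simulation. The sketch below runs A/A tests (both arms share the same true conversion rate, so every "significant" result is a false positive), applies a pooled two-proportion z-test once per day, and compares "stop at the first p < 0.05" against "test once at the end". The traffic numbers, the number of peeks, and the choice of z-test are illustrative assumptions, not a description of how A vs B computes its p-values.

```python
import math
import random


def two_sided_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no conversions (or all conversions) in both arms: no evidence
    z = (conv_b / n_b - conv_a / n_a) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_aa_tests(n_sims=400, days=14, users_per_day=100,
                      rate=0.10, alpha=0.05, seed=1):
    """Return (false-positive rate with daily peeking, rate with one final test)."""
    rng = random.Random(seed)
    peeking_hits = 0
    final_hits = 0
    for _sim in range(n_sims):
        conv_a = conv_b = n = 0
        crossed = False
        p = 1.0
        for _day in range(days):
            n += users_per_day
            conv_a += sum(rng.random() < rate for _ in range(users_per_day))
            conv_b += sum(rng.random() < rate for _ in range(users_per_day))
            p = two_sided_p_value(conv_a, n, conv_b, n)
            if p < alpha:
                crossed = True  # a daily peeker would have stopped here
        peeking_hits += crossed
        final_hits += p < alpha  # only the last look counts
    return peeking_hits / n_sims, final_hits / n_sims


peek_rate, fixed_rate = simulate_aa_tests()
print(f"false-positive rate with daily peeking: {peek_rate:.1%}")
print(f"false-positive rate with a single final test: {fixed_rate:.1%}")
```

The single-test rate should land near the nominal alpha, while the peeking rate grows with the number of looks.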
Sequential
Reports an always-valid p-value and confidence sequence, safe to look at any time and safe to stop early when the evidence is in. Uses asymptotic always-valid confidence sequences (AsympCS), the method Netflix ships in production.
When to pick: you want the rigour of Frequentist p-values, but the flexibility to peek and stop early. You're willing to trade slightly wider intervals (and a modest sample-size inflation) for that flexibility.
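To give a feel for the "slightly wider intervals" trade-off, the sketch below compares the half-width of a fixed-horizon confidence interval with one published form of asymptotic confidence sequence (the Gaussian-mixture boundary from the asymptotic confidence sequence literature). Whether A vs B uses this exact boundary is an assumption, and the tuning parameter `rho` and its value here are hypothetical.

```python
import math
from statistics import NormalDist


def asymp_cs_half_width(t, sigma_hat, alpha=0.05, rho=0.05):
    """Half-width at sample size t of a Gaussian-mixture asymptotic confidence sequence.

    rho is a tuning parameter that controls where the sequence is tightest;
    the default here is an arbitrary illustrative value.
    """
    scale = t * rho**2 + 1
    return sigma_hat * math.sqrt(
        2 * scale / (t**2 * rho**2) * math.log(math.sqrt(scale) / alpha)
    )


def fixed_ci_half_width(t, sigma_hat, alpha=0.05):
    """Half-width of a classical fixed-horizon normal confidence interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z * sigma_hat / math.sqrt(t)


# Standard deviation of a Bernoulli(0.10) outcome, as a stand-in for sigma_hat.
sigma_hat = math.sqrt(0.10 * 0.90)
ratios = []
for t in (1_000, 10_000, 100_000):
    ratio = asymp_cs_half_width(t, sigma_hat) / fixed_ci_half_width(t, sigma_hat)
    ratios.append(ratio)
    print(f"t={t}: confidence sequence is {ratio:.2f}x as wide as the fixed CI")
```

The ratio stays above 1 at every sample size, which is the price paid for an interval that remains valid at every look rather than at one pre-declared sample size.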
Sample-Size Calculator
Every project has a sample-size calculator at /projects/{id}/sample-size-calculator (also reachable from Settings → Analysis). The calculator has three modes, each engine-aware:
Fixed-horizon
Given your baseline conversion rate, minimum detectable effect (MDE), alpha, power, and daily traffic, the calculator returns the required sample size per variation and the estimated duration in days. This is the classic planning tool.
Bayesian mode asks for a target probability threshold (e.g. 0.95) instead of alpha, and accepts an optional informative prior to reduce the required sample size. Sequential mode inflates the required sample size per the AsympCS framework, typically to ~1.9× the Frequentist size at α = 0.05.
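For reference, a fixed-horizon Frequentist sample size for a binary metric can be approximated with the standard two-proportion formula. The sketch below is a minimal version that assumes a two-sided test and an MDE expressed as a relative lift. The page does not say whether the calculator takes relative or absolute MDE, or which formula it uses, so the function name and parameters are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_per_variation(baseline, relative_mde, alpha=0.05, power=0.80):
    """Approximate users needed per variation for a two-sided two-proportion z-test.

    relative_mde is treated as a relative lift (0.10 means +10% over baseline);
    this interpretation is an assumption.
    """
    p1 = baseline
    p2 = baseline * (1 + relative_mde)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


# Hypothetical inputs: 10% baseline conversion, +10% relative lift to detect.
n = sample_size_per_variation(baseline=0.10, relative_mde=0.10)
print(f"required users per variation: {n}")

# Rough Sequential planning figure, using the ~1.9x inflation quoted in the text.
print(f"approximate Sequential size per variation: {math.ceil(n * 1.9)}")
```

Because the required size scales with the inverse square of the effect, halving the MDE roughly quadruples the sample.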
Power calculator
Given a sample size per variation, baseline rate, and MDE, the calculator returns the achieved statistical power: the probability of detecting the effect if it's real. Useful for interpreting historical experiments or understanding the limits of small experiments.
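Achieved power is the sample-size calculation run in reverse. The sketch below uses the normal approximation for a two-sided two-proportion z-test and ignores the negligible chance of rejecting in the wrong direction. The relative-MDE interpretation and the default alpha of 0.05 are assumptions, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def achieved_power(n_per_variation, baseline, relative_mde, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test at a given sample size."""
    p1 = baseline
    p2 = baseline * (1 + relative_mde)  # relative lift: an assumption
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # True difference, scaled by the square root of the per-arm sample size
    shift = abs(p2 - p1) * math.sqrt(n_per_variation)
    z = (shift - z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))) / math.sqrt(
        p1 * (1 - p1) + p2 * (1 - p2)
    )
    return NormalDist().cdf(z)


# Hypothetical small experiment: 2,000 users per variation, 10% baseline, +10% lift.
small = achieved_power(2_000, baseline=0.10, relative_mde=0.10)
print(f"power with 2,000 users per variation: {small:.0%}")
```

A low value here means a non-significant result says little about whether the effect exists, which is the main reason to check power before reading too much into a small historical test.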
Duration estimator
Given daily traffic and the same planning inputs as fixed-horizon, the calculator returns the estimated number of days to reach the required sample size. Useful when traffic is the binding constraint.
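The duration arithmetic itself is simple. The sketch below assumes all daily traffic enters the experiment and is split evenly across variations; both are assumptions, since the page does not describe allocation. The function name and parameters are hypothetical.

```python
import math


def estimated_days(n_per_variation, daily_traffic, num_variations=2):
    """Days needed to reach the required sample, rounded up to whole days.

    Assumes every visitor is enrolled and traffic is split evenly
    across variations (Control counts as one variation).
    """
    total_needed = n_per_variation * num_variations
    return math.ceil(total_needed / daily_traffic)


# Hypothetical inputs: 14,751 users per variation, 1,000 visitors per day.
print(estimated_days(14_751, daily_traffic=1_000))                    # Control + 1 variation
print(estimated_days(14_751, daily_traffic=1_000, num_variations=3))  # Control + 2 variations
```

Adding variations lengthens the experiment proportionally, because each arm still needs the full per-variation sample.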
When Sequential is selected, the calculator reports a planning estimate based on the asymptotic AsympCS bound. The actual experiment may stop earlier (or later) depending on when the data crosses the always-valid boundary. Treat the numbers as a conservative planning estimate of the required sample size, not a guaranteed stopping point.
Binary metrics only (V1)
The calculator currently supports binary (conversion-rate) metrics. Support for continuous metrics (revenue, session duration, etc.) is planned for a future release.