Comparing engines

A vs B locks the official stats engine at launch — that's what shows up in audit logs, exports, and the experiments-list summary. But on the results page itself, you can re-render the analysis under any engine, any time, without changing the official record.

What “Explore under” does

On every experiment's results page, the engine badge is followed by an Explore under dropdown listing all three engines (Bayesian, Frequentist, Sequential). The currently-official option is marked (official) and is selected by default.

Picking a different engine refetches the results from raw exposures and metric events, runs the analysis under the engine you picked, and re-renders the entire results surface. The engine badge updates to reflect what you're looking at, and a yellow Exploratory view banner appears below the header so it's never ambiguous which engine the numbers came from.

Click Reset to official on the banner — or pick the official engine in the dropdown — to go back to the locked-in view.

The official engine never moves

Switching the dropdown does not change anything in the database. The experiment's official engine, audit log, exports, alerts, and summary numbers all continue to reflect the engine that was set at launch. Explore-under is purely a viewing affordance.

When to use it

  • Cross-checking a result. A Bayesian “87% probability variant beats control” is reassuring; seeing the same data under Frequentist with p < 0.05 can confirm the result is robust to the choice of engine.
  • Talking to stakeholders. A statistically-curious stakeholder asks “but what would the p-value be?” The Frequentist view answers without re-running the experiment. A risk-averse stakeholder asks “is this safe to stop early?” The Sequential view answers that one.
  • Demonstrating engine differences. Pick an experiment where the methods might disagree (small effect, modest sample size) and walk through the three views side by side. Useful for onboarding and for justifying engine choices on future experiments.
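
As a rough illustration of why the views can disagree on a small effect with a modest sample, here is a minimal Python sketch that runs a frequentist test and a Bayesian analysis on the same counts. It is not A vs B's implementation: the pooled two-proportion z-test, the uniform Beta(1, 1) prior, the Monte Carlo estimate, the function names, and the example counts are all assumptions made for illustration, and the Sequential view is left out.

```python
import math
import random


def frequentist_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value from a pooled two-proportion z-test (assumed method)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), the two-sided tail area.
    return math.erfc(abs(z) / math.sqrt(2))


def chance_to_beat_control(conv_a, n_a, conv_b, n_b, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate_b > rate_a) under Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += b > a
    return wins / draws


# Hypothetical counts: control converts 200 of 2000, variant 228 of 2000.
p_value = frequentist_p_value(200, 2000, 228, 2000)
chance = chance_to_beat_control(200, 2000, 228, 2000)
print(f"Frequentist p-value:            {p_value:.3f}")
print(f"Bayesian chance to beat control: {chance:.3f}")
```

With these made-up counts the p-value comes out at roughly 0.15, while the chance to beat control is roughly 92%: the Bayesian view looks encouraging and the Frequentist view is not significant at alpha = 0.05.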

Caveats

Explore-under is exactly that: exploratory. The recomputed numbers are mathematically valid for the engine you picked, but a few subtleties are worth keeping in mind:

  • Engine-specific defaults apply. When you explore under a different engine, the engine-specific configuration from your project settings applies: alpha defaults to 0.05 unless your project sets otherwise. Per-experiment alpha and multiple-comparison correction (MCC) overrides also flow through if they're set.
  • Sequential explored retrospectively is not the same as Sequential by design. The always-valid guarantee is mathematical and requires the analyst to commit to Sequential before the experiment starts. Re-rendering a Bayesian-launched experiment under Sequential is a useful comparison, but the “safe to peek” property is a property of the original analysis plan, not a re-labelling.
  • The summary numbers in the experiments list don't change. The lift-percentage and conversion totals shown on the experiments list are mirrored from the official engine's analysis. Exploratory views do not overwrite them.
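
The precedence described in the first caveat (per-experiment override, then project setting, then the 0.05 default) can be sketched as follows. The class and field names are hypothetical and are not A vs B's actual schema. Because this page does not state a fallback for the multiple-comparison correction, the sketch leaves it as None when nothing is set.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

DEFAULT_ALPHA = 0.05  # the default stated on this page


@dataclass
class ProjectSettings:  # hypothetical shape
    alpha: Optional[float] = None
    correction: Optional[str] = None


@dataclass
class ExperimentOverrides:  # hypothetical shape
    alpha: Optional[float] = None
    correction: Optional[str] = None


def resolve_config(
    project: ProjectSettings, experiment: ExperimentOverrides
) -> Tuple[float, Optional[str]]:
    """Return (alpha, correction) with experiment > project > default."""
    if experiment.alpha is not None:
        alpha = experiment.alpha
    elif project.alpha is not None:
        alpha = project.alpha
    else:
        alpha = DEFAULT_ALPHA

    # No default is assumed for the correction; None means "not configured".
    if experiment.correction is not None:
        correction = experiment.correction
    else:
        correction = project.correction
    return alpha, correction


# Project sets alpha = 0.10; the experiment overrides only the correction.
print(resolve_config(ProjectSettings(alpha=0.10), ExperimentOverrides(correction="holm")))
```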

Compare engines, side by side

Explore-under shows one engine at a time. The Compare engines button — also in the results page header, right next to the Explore-under dropdown — opens a panel that renders Bayesian, Frequentist, and Sequential results in three columns at once, computed on the same underlying data.

Each column speaks its engine's native vocabulary:

  • Bayesian — chance to beat control, plus the credible interval on the lift.
  • Frequentist — classical p-value, 95% confidence interval, and a significance tag. When the experiment has more than two variations, the column header notes which multiple-comparison correction is in effect (Bonferroni, Holm, Benjamini-Hochberg, or none).
  • Sequential — always-valid p-value, 95% confidence sequence, and a “safe to stop” or “not yet conclusive” label.
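
The three corrections named in the Frequentist column are standard procedures. The minimal sketch below shows how each one adjusts a list of p-values, one per variation-versus-control comparison. The function names and example p-values are illustrative; this page does not say how A vs B computes or displays the adjusted values.

```python
def bonferroni(pvals):
    """Multiply every p-value by the number of comparisons, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Holm step-down: the k-th smallest p-value (k from 0) is scaled by m - k."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # Taking the running maximum keeps adjusted values monotone.
        running_max = max(running_max, (m - rank) * pvals[i])
        adjusted[i] = min(1.0, running_max)
    return adjusted


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg step-up: scale by m / rank, walking from the largest p-value."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for k, i in enumerate(order):
        rank = m - k  # 1-based rank of this p-value in ascending order
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for three variations compared against control.
raw = [0.01, 0.04, 0.03]
for name, fn in [("Bonferroni", bonferroni), ("Holm", holm),
                 ("Benjamini-Hochberg", benjamini_hochberg)]:
    print(name, [round(p, 4) for p in fn(raw)])
```

For these inputs, Bonferroni is the most conservative and Benjamini-Hochberg the least. At alpha = 0.05, only the first comparison survives Bonferroni and Holm, while all three survive Benjamini-Hochberg, which controls the false discovery rate rather than the family-wise error rate.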

The column whose engine matches the experiment's official engine is marked (official), so it's always clear which view is the one of record.

Agreement is reassuring; disagreement is informative

When all three engines roughly agree, the result is robust to the choice of method. When they disagree, that's information too — usually about sample size or effect size relative to noise. Compare engines is the fastest way to check both.

Compare engines runs the analysis pipeline once, on a single fetch of the underlying raw data, and dispatches all three engines on it — so opening the panel does not triple your ClickHouse load compared to a normal results-page load. Like Explore-under, it never changes the official engine of the experiment.
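
The fetch-once, dispatch-three-times shape described above can be sketched like this. Everything here is hypothetical: the fetcher class, the stub engines, and the row format stand in for A vs B's real pipeline, which this page does not show.

```python
from typing import Any, Callable, Dict, List

RawRows = List[Dict[str, Any]]
Engine = Callable[[RawRows], Dict[str, Any]]


class CountingFetcher:
    """Stand-in for the raw-data query; counts how often it is called."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self) -> RawRows:
        self.calls += 1
        return [
            {"variation": "control", "exposures": 2000, "conversions": 200},
            {"variation": "variant", "exposures": 2000, "conversions": 228},
        ]


def stub_engine(label: str) -> Engine:
    """Placeholder engine that only reports what it received."""
    def analyze(rows: RawRows) -> Dict[str, Any]:
        return {"engine": label, "variations": [r["variation"] for r in rows]}
    return analyze


def compare_engines(fetch: Callable[[], RawRows],
                    engines: Dict[str, Engine]) -> Dict[str, Dict[str, Any]]:
    rows = fetch()  # single fetch, shared by every engine
    return {name: analyze(rows) for name, analyze in engines.items()}


fetcher = CountingFetcher()
engines = {name: stub_engine(name)
           for name in ("bayesian", "frequentist", "sequential")}
results = compare_engines(fetcher, engines)
print(fetcher.calls, sorted(results))
```

The design point is that the expensive step (the data fetch) sits outside the per-engine loop, so adding engines adds computation but not queries.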

Where to find it

Open any experiment's results page. The Explore under dropdown sits next to the engine badge in the header, with the Compare engines button right after it. Both are available for any experiment status — draft, running, paused, completed, archived — as long as you have viewReports permission for the org.