Skip to main content
Free Calculator

A/B Test Significance Calculator

Check whether your A/B test results are statistically significant using a two-proportion z-test. Enter your control and variant data to find out.

Control (A)

Conversion rate: 3.50%

Variant (B)

Conversion rate: 4.20%

Results

Significant

p-value

0.0101

z-score

2.573

Confidence

98.99%

Observed Lift

+20.00%

Methodology

This calculator uses the two-proportion z-test, which is the standard frequentist approach for comparing two conversion rates. The test statistic is: z = (p2 - p1) / sqrt( p_pool * (1 - p_pool) * (1/n1 + 1/n2) ) Where: - p1 = control conversions / control visitors - p2 = variant conversions / variant visitors - p_pool = total conversions / total visitors (pooled proportion) - n1 = control visitors - n2 = variant visitors The p-value is calculated as 2 * (1 - CDF(|z|)) for a two-tailed test, where CDF is the cumulative distribution function of the standard normal distribution. Confidence level is simply 1 - p-value, expressed as a percentage. Observed lift is calculated as ((p2 - p1) / p1) * 100 and represents the relative percentage change in conversion rate. This test assumes independent observations, sufficiently large sample sizes (np > 5 in each group), and that traffic was randomly assigned between control and variant.

Frequently Asked Questions

What does statistical significance mean in A/B testing?
Statistical significance tells you whether the difference between your control and variant is likely real or just due to random chance. A result is typically considered significant when the p-value is below 0.05 (i.e., confidence is above 95%). This means there is less than a 5% probability that the observed difference happened by chance alone.
What is a p-value?
A p-value is the probability of observing a difference at least as large as what you measured, assuming there is actually no real difference between control and variant. A small p-value (< 0.05) suggests the result is unlikely due to chance. It does not tell you the magnitude of the effect or its business impact.
What is a z-score?
The z-score measures how many standard deviations the observed difference is from zero (no effect). A higher absolute z-score means the observed difference is further from what you would expect under the null hypothesis. Z-scores above 1.96 or below -1.96 correspond to p < 0.05 in a two-tailed test.
Should I use a one-tailed or two-tailed test?
This calculator uses a two-tailed test, which is the recommended default. A two-tailed test checks for differences in either direction (the variant could be better OR worse). Use a one-tailed test only when you have a strong prior reason to believe the effect can only go in one direction, and you are okay with missing effects in the other direction.
My test is significant. Should I ship the variant?
Statistical significance is necessary but not sufficient for a ship decision. Also consider: the practical significance (is the lift large enough to matter?), how long the test ran (ideally at least 2 full weeks to capture weekly patterns and business-cycle effects), whether there are segment-level concerns, and the potential downside risk. Combine statistical evidence with business judgment.

Related Calculators

Updated for 2026. Built by GrowthLayer.