Skip to main content
Free Calculator

A/B Test Sample Size Calculator

Calculate how many visitors you need per variation to detect a meaningful difference in your A/B test with statistical confidence.

Your current conversion rate

Relative % lift you want to detect

Confidence level (typically 95%)

Probability of detecting a real effect (typically 80%)

Enter to estimate test duration

Results

Per Variation

31,234

visitors

Total Sample

62,468

visitors (2 variations)

Est. Duration

Enter daily traffic

Methodology

This calculator uses the two-proportion z-test sample size formula, which is the standard approach for frequentist A/B testing. The core formula is: n = ( Z_alpha/2 * sqrt(2 * p_bar * (1 - p_bar)) + Z_beta * sqrt(p1*(1-p1) + p2*(1-p2)) )^2 / (p2 - p1)^2 Where: - n is the required sample size per variation - Z_alpha/2 is the critical value for the chosen significance level (two-tailed) - Z_beta is the critical value for the desired power - p1 is the baseline conversion rate - p2 is the expected conversion rate (baseline + MDE) - p_bar is the pooled proportion (p1 + p2) / 2 This formula assumes equal sample sizes in control and variant groups, a two-tailed test (detecting both positive and negative effects), and independent observations across variations. The result is rounded up to the nearest whole number since you cannot test fractional visitors.

Frequently Asked Questions

How is A/B test sample size calculated?
Sample size is derived from a two-proportion z-test formula. It takes into account your baseline conversion rate, the minimum detectable effect (smallest lift you want to detect), your chosen significance level (typically 95%), and statistical power (typically 80%). The formula ensures enough data is collected to reliably distinguish a real effect from random noise.
What is the minimum detectable effect (MDE)?
The minimum detectable effect is the smallest improvement in conversion rate you want your test to be able to detect. A smaller MDE requires more traffic. For example, detecting a 1% relative lift needs far more visitors than detecting a 10% lift. Choose an MDE that represents a meaningful business impact.
Why does a lower MDE require more samples?
Smaller effects are harder to distinguish from random variation. To confidently say a tiny improvement is real (and not noise), you need more observations. Think of it like trying to hear a whisper in a noisy room — you need more time (data) to be sure of what you heard.
What significance level should I use?
A 95% significance level (alpha = 0.05) is the industry standard. This means there is a 5% chance of declaring a winner when there is no real difference (a false positive). Some teams use 90% for exploratory tests or 99% for high-stakes changes.
How do I estimate test duration from sample size?
Divide the total required sample size (all variations combined) by your average daily traffic allocated to the test. For example, if you need 20,000 visitors per variation with 2 variations and get 5,000 visitors per day, the test would take 40,000 / 5,000 = 8 days. For trustworthy decisions, still run for at least 2 full weeks to capture weekly traffic patterns and business-cycle effects.

Related Calculators

Updated for 2026. Built by GrowthLayer.