Free Calculator

A/B Test Planner

Plan your experiment from start to finish — calculate sample size, estimate duration, and project revenue impact before you launch.

Your current conversion rate

The relative improvement you expect to detect

Average daily visitors to the test page

Percentage of traffic included in the test

Including control (e.g. 2 = control + 1 variant)

Confidence level (typically 95%)

Probability of detecting a real effect (typically 80%)

Enter an average order value (AOV) to see revenue projections

Test Plan Summary

Per Variation

31,234

visitors

Total Sample

62,468

visitors (2 variations)

Est. Duration

7 days

~1 week

Run at least 2 weeks

Daily Test Traffic

10,000

visitors/day in the test

Minimum Detectable Effect

10.0% relative

0.50% absolute

Recommendations

  • Consider running for at least 2 full weeks to capture weekly traffic patterns.
  • This setup can detect a relative lift of 10.0% or more (0.50 percentage points absolute).

Methodology

This planner combines several statistical and practical calculations to give you a complete test plan:

1. Sample Size: Uses the standard two-proportion z-test formula to calculate the number of visitors needed per variation. This accounts for your baseline conversion rate, expected lift, significance level, and statistical power.
2. Duration Estimation: Divides the total required sample by your daily test traffic (daily visitors × traffic allocation percentage). The result is rounded up to whole days, and we also show the equivalent in weeks.
3. Revenue Projection: If you provide an average order value (AOV), we estimate the monthly and annual incremental revenue if the test wins at the expected lift. This uses: incremental revenue = (new conversions - baseline conversions) × AOV, where new conversions reflect the expected lift applied to your baseline rate over 30 days of traffic.
4. Smart Recommendations: Based on your inputs, we generate contextual advice. For example, if the test would run less than 2 weeks, we recommend extending it to capture weekly patterns. If it would run more than 8 weeks, we suggest ways to shorten it.

All calculations assume a two-tailed test (detecting both positive and negative effects), an equal traffic split among variations, and independent observations.

The sample size formula is:

n = ( Z_alpha/2 × sqrt(2 × p_bar × (1 - p_bar)) + Z_beta × sqrt(p1×(1-p1) + p2×(1-p2)) )^2 / (p2 - p1)^2

Where p1 is the baseline rate, p2 is the expected rate after applying the lift (p2 = p1 × (1 + relative lift)), and p_bar = (p1 + p2) / 2 is the average of the two rates. Z_alpha/2 is the standard normal critical value for the two-tailed significance level (about 1.96 at 95% confidence), and Z_beta is the critical value for the chosen power (about 0.84 at 80% power).
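The calculations described above can be sketched in a few lines of Python. This is a minimal illustration, not the planner's actual source: the function names are hypothetical, p_bar is taken to be the average of p1 and p2, and the example inputs (5% baseline, 10,000 daily visitors at 100% allocation) are inferred from the summary shown above. The revenue function assumes the projection uses all daily visitors and uses a made-up AOV of 50.

```python
import math
from statistics import NormalDist


def sample_size_per_variation(baseline, relative_lift, confidence=0.95, power=0.80):
    """Visitors needed per variation for a two-tailed two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)  # expected rate after the lift
    p_bar = (p1 + p2) / 2                # assumption: average of the two rates
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-tailed
    z_beta = NormalDist().inv_cdf(power)
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)


def duration_days(n_per_variation, variations, daily_visitors, allocation):
    """Total sample divided by daily test traffic, rounded up to whole days."""
    total_sample = n_per_variation * variations
    daily_test_traffic = daily_visitors * allocation
    return math.ceil(total_sample / daily_test_traffic)


def monthly_incremental_revenue(daily_visitors, baseline, relative_lift, aov):
    """(new conversions - baseline conversions) x AOV over 30 days of traffic."""
    visitors_30d = daily_visitors * 30   # assumption: all daily visitors count
    baseline_conversions = visitors_30d * baseline
    new_conversions = visitors_30d * baseline * (1 + relative_lift)
    return (new_conversions - baseline_conversions) * aov


n = sample_size_per_variation(0.05, 0.10)
days = duration_days(n, variations=2, daily_visitors=10_000, allocation=1.0)
revenue = monthly_incremental_revenue(10_000, 0.05, 0.10, aov=50)  # hypothetical AOV
print(n, n * 2, days)  # 31234 62468 7
print(round(revenue))
```

With these inputs the sketch reproduces the 31,234 visitors per variation, 62,468 total, and 7 days shown in the summary.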

Frequently Asked Questions

How long should an A/B test run?
An A/B test should run long enough to collect the statistically required sample size, and for at least 1-2 full business cycles (typically 1-2 weeks) to account for day-of-week and other cyclical traffic patterns. A test that runs for less than 14 days risks missing important weekday, weekend, and business-cycle differences. If the calculator shows a duration of less than 14 days, consider running for the full 14 days anyway.
How much traffic do I need for an A/B test?
The traffic you need depends on your baseline conversion rate, the minimum effect you want to detect, your confidence level, and your statistical power. Lower conversion rates and smaller detectable effects both require more traffic. As a rule of thumb: detecting a 5% relative lift on a 3% baseline rate at 95% confidence and 80% power requires roughly 208,000 visitors per variation under the sample size formula in the Methodology section. This calculator computes the exact number for your specific scenario.
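To get a feel for how quickly the requirement grows, here is a rough sketch using a common shortcut, n ≈ 16 × p × (1 - p) / delta^2 per variation, which approximates 95% confidence and 80% power. This is not the calculator's formula, and the function name is hypothetical; it is only meant to show the direction and scale of the effect.

```python
import math


def approx_sample_size(baseline, relative_lift):
    """Rough per-variation sample size at 95% confidence and 80% power."""
    delta = baseline * relative_lift  # absolute difference to detect
    return math.ceil(16 * baseline * (1 - baseline) / delta ** 2)


n_lift_10 = approx_sample_size(0.05, 0.10)  # 5% baseline, 10% relative lift
n_lift_05 = approx_sample_size(0.05, 0.05)  # same baseline, half the lift
n_low_base = approx_sample_size(0.02, 0.10)  # lower baseline, same lift

print(n_lift_10, n_lift_05, n_low_base)
print(round(n_lift_05 / n_lift_10, 2))  # halving the lift roughly quadruples n
```

Because the detectable difference is squared in the denominator, halving the lift you want to detect needs about four times the traffic, and a lower baseline rate raises the requirement as well.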
Should I test 50/50 or use a holdback?
A 50/50 split gives you the most statistical power and the shortest test duration. However, if you are testing a risky change, you may want to allocate less traffic to the variant (e.g., 90/10 or 80/20) to limit exposure. Keep in mind that an uneven split lengthens the test, because the smaller group takes longer to reach its required sample, and that this calculator's estimates assume an equal split. A 50/50 split is recommended for most tests unless there is a specific reason to limit risk.
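As a rough illustration of what an uneven split costs, the sketch below computes how much larger the total sample must be, relative to 50/50, to keep the same power in a two-group test. It relies on the simplifying assumption that both groups have about the same variance, and the function name is hypothetical. The calculator itself assumes an equal split and does not perform this adjustment.

```python
def split_inflation(variant_share):
    """Total-sample multiplier versus a 50/50 split (equal-variance approximation)."""
    if not 0 < variant_share < 1:
        raise ValueError("variant_share must be between 0 and 1")
    # Variance of the difference scales with 1/n_control + 1/n_variant,
    # which is smallest when the two groups are the same size.
    return 1 / (4 * variant_share * (1 - variant_share))


for share in (0.5, 0.2, 0.1):
    print(f"{share:.0%} to variant: {split_inflation(share):.2f}x total sample")
```

Under this approximation an 80/20 split needs about 1.56 times the total sample of a 50/50 split, and a 90/10 split about 2.78 times, so at a fixed daily traffic level the test runs longer by the same factor.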
How many variations is too many?
Each additional variation increases the total sample size linearly, because every variation needs the full per-variation sample. With 2 variations (control + 1 variant), you need 2x the per-variation sample. With 4 variations, you need 4x. More variations also increase the risk of false positives unless you apply multiple comparison corrections. For most teams, 2-3 total variations is the sweet spot. Beyond 4, consider splitting the ideas into several smaller tests run one after another.
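One common multiple comparison correction is Bonferroni, which divides the significance level by the number of variant-versus-control comparisons. The sketch below shows how that would change the plan. It reuses the two-proportion formula from the Methodology section (with p_bar taken as the average of p1 and p2), the function names are hypothetical, and nothing here implies that the calculator applies this correction itself.

```python
import math
from statistics import NormalDist


def sample_size(baseline, relative_lift, alpha=0.05, power=0.80):
    """Per-variation sample size for a two-tailed two-proportion z-test."""
    p1, p2 = baseline, baseline * (1 + relative_lift)
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    root_sum = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2)))
    return math.ceil(root_sum ** 2 / (p2 - p1) ** 2)


def plan_with_bonferroni(baseline, relative_lift, variations, alpha=0.05, power=0.80):
    """Return (per-variation n, total n) with a Bonferroni-adjusted alpha."""
    comparisons = variations - 1          # each variant is compared to control
    adjusted_alpha = alpha / comparisons  # Bonferroni: split alpha across comparisons
    n = sample_size(baseline, relative_lift, adjusted_alpha, power)
    return n, n * variations


for k in (2, 3, 4):
    per_variation, total = plan_with_bonferroni(0.05, 0.10, k)
    print(k, per_variation, total)
```

With 2 variations there is only one comparison, so the plan is unchanged. With more variations the total grows faster than linearly, because the per-variation sample rises along with the number of groups.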
What if my test takes too long?
If your estimated duration is longer than 8 weeks, consider: (1) Increasing the minimum detectable effect: can you accept detecting only a 15% lift instead of 5%? (2) Increasing traffic allocation to the test. (3) Testing on a higher-traffic page or segment. (4) Reducing the number of variations. (5) Accepting a lower confidence level (90% instead of 95%) for exploratory tests. Very long tests risk external factors (seasonality, product changes) contaminating results.


Updated for 2026. Built by GrowthLayer.