Question 1

How long should an A/B test run?

Accepted Answer

An A/B test should run long enough to collect the statistically required sample size, and for at least one to two full business cycles (typically 1-2 weeks) so that day-of-week and other cyclical traffic patterns are covered. Stopping before 14 days risks missing important weekday, weekend, and business-cycle differences. If your calculator shows a duration of fewer than 14 days, consider running for the full 14 days anyway, and plan the duration in whole weeks so every weekday is represented equally.
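As a rough illustration of that rule, the sketch below turns a required sample size into a planned duration, rounds it up to whole weeks, and applies a 14-day floor. The function name `planned_duration_days` and its parameters are hypothetical, and the sketch assumes daily traffic is roughly constant and split evenly across variations.

```python
import math


def planned_duration_days(required_per_variation, variations, daily_visitors, min_days=14):
    """Return a planned duration in days, in whole weeks, never below min_days.

    Assumes every daily visitor enters the test and traffic is split evenly.
    """
    total_needed = required_per_variation * variations
    raw_days = math.ceil(total_needed / daily_visitors)
    # Round up to full weeks so each weekday appears the same number of times.
    whole_weeks = math.ceil(raw_days / 7) * 7
    return max(whole_weeks, min_days)


print(planned_duration_days(20_000, 2, 5_000))  # 8 raw days -> 14
print(planned_duration_days(50_000, 2, 4_000))  # 25 raw days -> 28
```

The first call shows the floor in action: the sample size alone would be reached in 8 days, but the plan still covers two full weeks.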

Question 2

How much traffic do I need for an A/B test?

Accepted Answer

The traffic you need depends on your baseline conversion rate, the minimum effect you want to detect, and your chosen confidence level and statistical power. Lower conversion rates and smaller detectable effects both require more traffic. As a rule of thumb, using the standard two-sided normal approximation for comparing two proportions: detecting a 5% relative lift on a 3% baseline rate at 95% confidence and 80% power requires roughly 200,000 visitors per variation, while a 10% relative lift on the same baseline requires roughly 50,000. This calculator computes the exact number for your specific scenario.
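For readers who want to see how those inputs combine, here is a minimal sketch of the textbook normal-approximation formula for comparing two proportions with a two-sided test. It is not necessarily the formula this calculator uses; the function name `sample_size_per_variation` is hypothetical, and other calculators may differ slightly (for example by pooling variances or using a one-sided test).

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variation(baseline, relative_lift, alpha=0.05, power=0.80):
    """Visitors needed per variation (two-sided test, unpooled variance)."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# Example: 3% baseline, comparing a 5% and a 10% relative lift.
for lift in (0.05, 0.10):
    print(f"{lift:.0%} lift: {sample_size_per_variation(0.03, lift):,} per variation")
```

Because the required sample scales with the inverse square of the absolute difference, halving the detectable lift roughly quadruples the traffic needed.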

Question 3

Should I test 50/50 or use a holdback?

Accepted Answer

A 50/50 split gives you the most statistical power and the shortest test duration. However, if you are testing a risky change, you may want to allocate less traffic to the variant (e.g., 80/20 or 90/10) to limit exposure. Keep in mind that an uneven split lengthens the test: for the same power, an 80/20 split needs roughly 1.6x the total traffic of a 50/50 split, and a 90/10 split roughly 2.8x. A 50/50 split is recommended for most tests unless there is a specific reason to limit risk.
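The cost of an uneven split can be estimated with a small sketch. It assumes both arms have about the same outcome variance, in which case the total traffic needed for a given power scales with 1 / (4 * r * (1 - r)), where r is the share of traffic sent to the variant. The function name `allocation_inflation` is hypothetical.

```python
def allocation_inflation(variant_share):
    """Total-traffic multiplier relative to a 50/50 split.

    Equal-variance approximation: the variance of the difference is
    proportional to 1/n_control + 1/n_variant.
    """
    if not 0 < variant_share < 1:
        raise ValueError("variant_share must be strictly between 0 and 1")
    return 1 / (4 * variant_share * (1 - variant_share))


for share in (0.5, 0.2, 0.1):
    print(f"{share:.0%} to variant: {allocation_inflation(share):.2f}x total traffic")
# 50% to variant: 1.00x total traffic
# 20% to variant: 1.56x total traffic
# 10% to variant: 2.78x total traffic
```

With a fixed daily traffic volume, the same multiplier applies to test duration.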

Question 4

How many variations is too many?

Accepted Answer

Total sample size grows linearly with the number of variations. With 2 variations (control + 1 variant), you need 2x the per-variation sample. With 4 variations, you need 4x. More variations also increase the risk of false positives unless you apply a multiple-comparison correction (such as Bonferroni), and that correction raises the per-variation sample further. For most teams, 2-3 total variations is the sweet spot. Beyond 4, consider splitting the candidates across several smaller tests run one after another instead.
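To make the combined effect concrete, the sketch below estimates total traffic in units of the uncorrected per-variation sample. It assumes each variant is compared with the control and that a Bonferroni correction is used, which is only one of several possible corrections; the function name `total_sample_multiplier` is hypothetical.

```python
from statistics import NormalDist


def total_sample_multiplier(variations, alpha=0.05, power=0.80):
    """Total traffic, in units of the uncorrected per-variation sample.

    Bonferroni: each of the (variations - 1) comparisons with the control
    is tested at alpha / (variations - 1).
    """
    if variations < 2:
        raise ValueError("need a control and at least one variant")
    z = NormalDist().inv_cdf
    comparisons = variations - 1
    uncorrected = z(1 - alpha / 2) + z(power)
    corrected = z(1 - alpha / comparisons / 2) + z(power)
    per_variation_inflation = (corrected / uncorrected) ** 2
    return variations * per_variation_inflation


for k in (2, 3, 4):
    print(f"{k} variations: {total_sample_multiplier(k):.2f}x the per-variation sample")
```

With two variations the correction has no effect and the multiplier is exactly 2; with more, the total grows faster than the simple linear count suggests.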

Question 5

What if my test takes too long?

Accepted Answer

If your estimated duration is longer than 8 weeks, consider: (1) Increasing the minimum detectable effect: can you accept detecting only a 15% lift instead of 5%? (2) Allocating more traffic to the test. (3) Testing on a higher-traffic page or segment. (4) Reducing the number of variations. (5) Accepting a lower confidence level (90% instead of 95%) for exploratory tests. Very long tests risk contamination from external factors such as seasonality and product changes.
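Options (1) and (5) can be compared with a quick back-of-the-envelope sketch. It relies on the approximation that required sample size is proportional to (z_alpha + z_beta)^2 / MDE^2 and ignores the small change in variance when the target rate moves, so the figures are indicative only. The function names `mde_scaling` and `confidence_scaling` are hypothetical.

```python
from statistics import NormalDist


def mde_scaling(old_mde, new_mde):
    """Approximate sample-size ratio after changing the minimum detectable effect."""
    return (old_mde / new_mde) ** 2


def confidence_scaling(old_alpha, new_alpha, power=0.80):
    """Approximate sample-size ratio after changing the two-sided significance level."""
    z = NormalDist().inv_cdf
    old = z(1 - old_alpha / 2) + z(power)
    new = z(1 - new_alpha / 2) + z(power)
    return (new / old) ** 2


print(f"MDE 5% -> 15%: {mde_scaling(0.05, 0.15):.2f}x the sample")                # 0.11x
print(f"Confidence 95% -> 90%: {confidence_scaling(0.05, 0.10):.2f}x the sample")  # 0.79x
```

Under these assumptions, relaxing the minimum detectable effect shortens a test far more than relaxing the confidence level does.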

A/B Test Planner
