Original Research

A/B Test Duration Guide — How Long to Run Tests by Traffic Volume

Pre-calculated test durations for 25 common traffic and conversion rate scenarios. All calculations use 80% statistical power and 95% confidence level with two-tailed tests.

By Michael Lip · Updated April 2026

Methodology

Sample sizes were computed using the standard two-proportion Z-test power formula: n = (Z_alpha/2 + Z_beta)^2 * (p1*(1-p1) + p2*(1-p2)) / delta^2, where Z_alpha/2 = 1.96 (95% confidence), Z_beta = 0.84 (80% power), p1 = baseline conversion rate, p2 = p1 * (1 + MDE), and delta = p2 - p1. Duration = ceil(2 * n / daily_visitors). All results assume equal traffic split between control and variant, two-tailed test, and no multiple comparison adjustment.

Sample Size Formula (per variant):

n = (Z_alpha/2 + Z_beta)^2 * [p1(1-p1) + p2(1-p2)] / (p2 - p1)^2

Where:
Z_alpha/2 = 1.96 (for alpha = 0.05, two-tailed)
Z_beta = 0.84 (for power = 0.80)
p1 = baseline conversion rate
p2 = p1 * (1 + MDE)

Duration: days = ceil(2 * n / daily_visitors)
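
The formula and duration rule above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build the tables: the function names are hypothetical, and `z_scores` shows one way to recover the 1.96 and 0.84 constants from the standard normal distribution. Evaluating the formula exactly as written gives roughly 31,195 per variant for a 5% baseline and 10% MDE, which is lower than the 67,778 tabulated below, so the tables presumably include assumptions beyond those listed in the methodology.

```python
import math
from statistics import NormalDist


def z_scores(alpha=0.05, power=0.80):
    """Return (Z_alpha/2, Z_beta) for a two-tailed test."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = std_normal.inv_cdf(power)           # about 0.84 for power = 0.80
    return z_alpha, z_beta


def sample_size_per_variant(p1, mde, z_alpha=1.96, z_beta=0.84):
    """Evaluate n = (Z_alpha/2 + Z_beta)^2 * [p1(1-p1) + p2(1-p2)] / (p2 - p1)^2."""
    p2 = p1 * (1 + mde)  # expected rate after a relative lift of `mde`
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2


def duration_days(n_per_variant, daily_visitors):
    """days = ceil(2 * n / daily_visitors), assuming an equal two-way split."""
    return math.ceil(2 * n_per_variant / daily_visitors)


if __name__ == "__main__":
    n = sample_size_per_variant(p1=0.05, mde=0.10)
    print(math.ceil(n), duration_days(n, daily_visitors=1000))
```

Passing a tabulated sample size to `duration_days` reproduces the tabulated duration, for example `duration_days(67778, 1000)` returns 136.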

Duration Table — 10% Minimum Detectable Effect

Daily Visitors | Baseline CR | MDE | Sample/Variant | Total Sample | Duration (Days) | Minimum Duration
500 | 2% | 10% | 175,816 | 351,632 | 704 | 704 days
500 | 3% | 10% | 115,554 | 231,108 | 463 | 463 days
500 | 5% | 10% | 67,778 | 135,556 | 272 | 272 days
500 | 10% | 10% | 32,204 | 64,408 | 129 | 129 days
500 | 15% | 10% | 20,244 | 40,488 | 81 | 81 days
1,000 | 2% | 10% | 175,816 | 351,632 | 352 | 352 days
1,000 | 3% | 10% | 115,554 | 231,108 | 232 | 232 days
1,000 | 5% | 10% | 67,778 | 135,556 | 136 | 136 days
1,000 | 10% | 10% | 32,204 | 64,408 | 65 | 65 days
1,000 | 15% | 10% | 20,244 | 40,488 | 41 | 41 days
5,000 | 2% | 10% | 175,816 | 351,632 | 71 | 71 days
5,000 | 3% | 10% | 115,554 | 231,108 | 47 | 47 days
5,000 | 5% | 10% | 67,778 | 135,556 | 28 | 28 days
5,000 | 10% | 10% | 32,204 | 64,408 | 13 | 14 days*
5,000 | 15% | 10% | 20,244 | 40,488 | 9 | 14 days*
10,000 | 2% | 10% | 175,816 | 351,632 | 36 | 36 days
10,000 | 3% | 10% | 115,554 | 231,108 | 24 | 24 days
10,000 | 5% | 10% | 67,778 | 135,556 | 14 | 14 days
10,000 | 10% | 10% | 32,204 | 64,408 | 7 | 7 days
10,000 | 15% | 10% | 20,244 | 40,488 | 5 | 7 days*
50,000 | 2% | 10% | 175,816 | 351,632 | 8 | 8 days
50,000 | 3% | 10% | 115,554 | 231,108 | 5 | 7 days*
50,000 | 5% | 10% | 67,778 | 135,556 | 3 | 7 days*
50,000 | 10% | 10% | 32,204 | 64,408 | 2 | 7 days*
50,000 | 15% | 10% | 20,244 | 40,488 | 1 | 7 days*

* Computed duration rounded up to the next full week (minimum 7 days) to capture day-of-week effects regardless of sample size.

Duration Table — 20% Minimum Detectable Effect

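Daily Visitors | Baseline CR | MDE | Sample/Variant | Total Sample | Duration (Days)
500 | 3% | 20% | 28,388 | 56,776 | 114
500 | 5% | 20% | 16,574 | 33,148 | 67
500 | 10% | 20% | 7,748 | 15,496 | 31
1,000 | 3% | 20% | 28,388 | 56,776 | 57
1,000 | 5% | 20% | 16,574 | 33,148 | 34
1,000 | 10% | 20% | 7,748 | 15,496 | 16
5,000 | 3% | 20% | 28,388 | 56,776 | 12
5,000 | 5% | 20% | 16,574 | 33,148 | 7
5,000 | 10% | 20% | 7,748 | 15,496 | 4
10,000 | 5% | 20% | 16,574 | 33,148 | 4
10,000 | 10% | 20% | 7,748 | 15,496 | 2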

Key Insights

Low-traffic sites face a painful reality. A site with 500 daily visitors and a 2% conversion rate needs over 700 days to detect a 10% relative lift. At this traffic level, only large effects (20%+ MDE) are practically testable, which means testing major redesigns rather than copy tweaks.

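Conversion rate matters as much as traffic. Higher baseline conversion rates require fewer samples for the same relative MDE. The binomial variance p(1-p) actually rises with the baseline (up to 50%), but the absolute difference to detect, p2 - p1 = p1 * MDE, rises in proportion to the baseline, and its square in the denominator grows faster than the variance in the numerator. A 15% CR site needs about 5.7x fewer visitors per variant than a 3% CR site (20,244 versus 115,554 at a 10% MDE) to detect the same relative effect.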
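
One way to see how the baseline rate enters is to substitute the relative MDE into the sample size formula. The symbol m for the relative MDE is shorthand introduced here, not notation from the methodology section.

```latex
\begin{aligned}
n &= \frac{(Z_{\alpha/2} + Z_{\beta})^2 \,\bigl[\,p_1(1-p_1) + p_2(1-p_2)\,\bigr]}{(p_2 - p_1)^2},
\qquad p_2 = p_1(1+m), \quad p_2 - p_1 = m\,p_1 \\[4pt]
  &= \frac{(Z_{\alpha/2} + Z_{\beta})^2}{m^2\,p_1}\,
     \bigl[(1-p_1) + (1+m)\bigl(1-(1+m)\,p_1\bigr)\bigr]
\end{aligned}
```

For a fixed m, the bracketed term shrinks slowly as p_1 grows while the 1/p_1 factor shrinks quickly, so the required sample falls roughly in inverse proportion to the baseline rate when p_1 is small.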

The 7-day minimum is non-negotiable. Even high-traffic sites that reach statistical sample size in 1-2 days must run for at least 7 days. User behavior varies systematically by day of week — Monday shoppers behave differently from Saturday shoppers. Running less than a full week introduces cyclical bias.
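
A duration floor like this is simple to apply in code. The sketch below is illustrative: the function name and the `whole_weeks` flag are assumptions, not part of the guide. The flag covers the reading suggested by the 10% MDE table, where the 5,000-visitor rows list 14 days for computed durations of 13 and 9.

```python
import math


def enforce_minimum_duration(computed_days, min_days=7, whole_weeks=False):
    """Apply a minimum run length to a computed test duration.

    whole_weeks=True additionally rounds up to a multiple of 7 days
    (an assumed option, so every weekday is sampled equally often).
    """
    days = max(computed_days, min_days)
    if whole_weeks:
        days = math.ceil(days / 7) * 7
    return days


if __name__ == "__main__":
    print(enforce_minimum_duration(2))                     # floor applied
    print(enforce_minimum_duration(13, whole_weeks=True))  # rounded to whole weeks
```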

Halving the MDE quadruples the sample. Moving from detecting a 20% lift to a 10% lift requires approximately 4x the sample size. This follows from the squared difference in the denominator of the sample size formula, (p2-p1)^2: halving the MDE halves p2 - p1 and so divides the denominator by four.
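
The scaling can be checked numerically by evaluating the sample size formula at a 10% and a 20% MDE for the same baseline. This is a small sketch with hypothetical function names. When the formula is evaluated as written, the ratio comes out slightly under 4, because p2 also appears in the variance term of the numerator.

```python
def sample_size_per_variant(p1, mde, z_alpha=1.96, z_beta=0.84):
    """n = (Z_alpha/2 + Z_beta)^2 * [p1(1-p1) + p2(1-p2)] / (p2 - p1)^2."""
    p2 = p1 * (1 + mde)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2


def halving_ratio(baseline, mde=0.20):
    """Sample size at half the MDE divided by sample size at the full MDE."""
    return sample_size_per_variant(baseline, mde / 2) / sample_size_per_variant(baseline, mde)


if __name__ == "__main__":
    for baseline in (0.03, 0.05, 0.10):
        print(f"{baseline:.0%} baseline: {halving_ratio(baseline):.2f}x")
```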

Frequently Asked Questions

How long should I run an A/B test?

The duration depends on your daily traffic, baseline conversion rate, and the minimum effect size you want to detect. At 80% power and 95% confidence, a site with 1,000 daily visitors and a 5% conversion rate needs approximately 136 days to detect a 10% relative improvement. Always run for at least 7 days to cover a full business cycle regardless of how quickly you reach sample size.

What is the formula for A/B test sample size?

The standard formula is: n = (Z_alpha/2 + Z_beta)^2 * (p1*(1-p1) + p2*(1-p2)) / (p2 - p1)^2. Here, Z_alpha/2 = 1.96 for 95% confidence, Z_beta = 0.84 for 80% power, p1 is your baseline conversion rate, and p2 is the expected conversion rate after the change. This gives the sample size per variant — multiply by 2 for total.

Why should I not stop an A/B test early?

Checking results repeatedly and stopping at the first "significant" reading can inflate your false positive rate from the intended 5% to as high as 20-30%. This is the peeking problem: every extra look is another chance for random noise to cross the significance threshold. The p-value is only valid at the pre-determined sample size. If you need to monitor continuously, use sequential testing methods such as SPRT or alpha-spending functions, which adjust for multiple looks.
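
The inflation can be illustrated with a small A/A simulation, in which both variants share the same true conversion rate so that every "significant" result is a false positive. This is a rough sketch under assumed parameters (10% rate, 10 evenly spaced looks, 200 visitors per variant per look, 400 simulated tests). The function names are hypothetical and the exact rates depend on the seed.

```python
import math
import random


def z_stat(conv_a, conv_b, n):
    """Pooled two-proportion z statistic for two groups of equal size n."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:
        return 0.0  # no conversions at all (or all converted): no evidence either way
    return (conv_b / n - conv_a / n) / se


def simulate_false_positive_rates(rate=0.10, looks=10, per_look=200, runs=400, seed=1):
    """Return (rate with peeking at every look, rate at the final look only)."""
    rng = random.Random(seed)
    peeking_hits = 0
    fixed_hits = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        ever_significant = False
        significant = False
        for look in range(1, looks + 1):
            # Both arms draw from the same true rate, so there is no real effect.
            conv_a += sum(rng.random() < rate for _ in range(per_look))
            conv_b += sum(rng.random() < rate for _ in range(per_look))
            significant = abs(z_stat(conv_a, conv_b, look * per_look)) > 1.96
            ever_significant = ever_significant or significant
        peeking_hits += ever_significant  # would have stopped at some look
        fixed_hits += significant         # significant at the planned end only
    return peeking_hits / runs, fixed_hits / runs


if __name__ == "__main__":
    peeking, fixed = simulate_false_positive_rates()
    print(f"peeking: {peeking:.1%}, fixed horizon: {fixed:.1%}")
```

Because the final look is one of the looks, the peeking rate can never be lower than the fixed-horizon rate; in runs like this it is typically several times higher.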

What is minimum detectable effect (MDE) in A/B testing?

MDE is the smallest relative change in conversion rate your test is designed to detect. A 10% MDE on a 5% baseline means detecting a change from 5.0% to 5.5%. Smaller MDEs require quadratically more traffic: halving the MDE roughly quadruples the required sample size because the difference p2 - p1 is squared in the denominator.

How does traffic volume affect A/B test duration?

Test duration is inversely proportional to traffic volume. A site with 10,000 daily visitors can collect the same sample 10x faster than one with 1,000 daily visitors, subject to the 7-day minimum. The required total sample size remains constant; higher traffic simply fills it faster. Use the tables above to find your specific scenario.