
A/B Test Trial Simulation

Learn how many trials to run before closing an experiment, without worrying about false positives or false negatives.


A/B Test Statistical Significance Calculator

The calculator takes the following inputs:

  • Control Group (A): true conversion rate and standard deviation as a % of the base effect (multiplicative)
  • Treatment Group (B): true conversion rate and standard deviation as a % of the target effect (multiplicative)
  • Doubt Adjustment: conservative thresholds, p < 0.01 for the first 10 trials, gradually relaxing to p < 0.05 (see the sketch below)
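
A minimal Python sketch of such a doubt-adjusted threshold is below. The page only says the threshold relaxes "gradually", so the linear ramp between trial 10 and trial 50, along with the function and parameter names, are illustrative assumptions:

```python
def doubt_adjusted_alpha(trial: int,
                         strict_alpha: float = 0.01,
                         standard_alpha: float = 0.05,
                         strict_until: int = 10,
                         relaxed_after: int = 50) -> float:
    """Return the p-value threshold to apply at a given trial number."""
    if trial <= strict_until:
        return strict_alpha        # conservative for the first trials
    if trial >= relaxed_after:
        return standard_alpha      # standard threshold once enough data has accrued
    # linear relaxation between the strict and standard thresholds (assumed schedule)
    frac = (trial - strict_until) / (relaxed_after - strict_until)
    return strict_alpha + frac * (standard_alpha - strict_alpha)

# Thresholds at a few trial counts
for t in (5, 10, 30, 50, 200):
    print(t, round(doubt_adjusted_alpha(t), 4))
```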

Monte Carlo Distribution: Trials to Significance

Key Statistics

  • Control (A): 20% ± 4.0%
  • Treatment (B): 30% ± 6.0%
  • True Improvement: 50.0%
  • Mean Trials: 171
  • Median Trials: 128
  • Range: 16 - 500
  • Simulations: 100


What Is This About?

This simulator shows how real A/B tests progress over time, with each new participant adding noise and uncertainty. Unlike static calculations, this demonstrates the actual trial experience where p-values start at 1 (no evidence) and gradually decrease as evidence accumulates.
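
The trajectory described above can be reproduced with a short, self-contained simulation. This is a hedged sketch rather than the calculator's actual code: the true rates, the random seed, the checkpoint trial counts, and the rule of reporting p = 1 until every cell of the 2x2 table has at least five observations are all assumptions made for illustration.

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(42)
p_control, p_treatment = 0.20, 0.30   # true conversion rates

conv = np.zeros(2)   # conversions for [control, treatment]
n = np.zeros(2)      # participants for [control, treatment]

for trial in range(1, 501):
    # one new participant per group per trial
    conv += [rng.random() < p_control, rng.random() < p_treatment]
    n += 1
    table = np.array([[conv[0], n[0] - conv[0]],
                      [conv[1], n[1] - conv[1]]])
    if table.min() < 5:
        p_value = 1.0   # too little data: treat as "no evidence yet"
    else:
        _, p_value, _, _ = chi2_contingency(table)
    if trial in (10, 50, 100, 250, 500):
        print(f"trial {trial:3d}: p = {p_value:.3f}")
```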

The Challenge: Sequential Testing

Traditional A/B testing has problems:

  • Over-recruitment: Recruiting too many participants, wasting resources
  • Early Stopping: Stopping too early leads to false positives from lucky observations
  • Variance Impact: High variance makes it harder to detect true effects
  • Unknown True Effects: You only observe noisy measurements, not the true underlying difference

This simulator helps you understand how many trials you actually need, accounting for variance and sequential testing.

Who Is This For?

This simulator is designed for:

  • Product Managers: Planning A/B tests and understanding trial requirements
  • Data Scientists: Understanding sequential testing and statistical significance
  • Growth Marketers: Optimizing experiment budgets and avoiding false positives
  • Researchers: Learning about sequential analysis and p-value evolution
  • Anyone Running Experiments: Wanting to understand how A/B tests actually work in practice

Guide


Chart Interpretation

  • X-axis: Trials - number of observations accumulated so far
  • Left Y-axis: -Log₁₀(p-value) - higher means more significant
  • Right Y-axis: Odds Ratio - values > 1 favor treatment
  • Blue line: P-value evolution (starts at p=1, becomes significant when crossing threshold)
  • Red line: Observed odds ratio (fluctuates due to variance)
  • Gray dashed line: Standard significance threshold (p=0.05)
  • Orange dotted line: Doubt-adjusted threshold (stricter early, standard later)

How to Read This Chart

  • The p-value starts at 1 (no evidence)
  • It decreases gradually as evidence accumulates
  • High variance makes detection harder
  • The odds ratio fluctuates around its true value
  • The doubt adjustment discounts early significance that may be due to luck
  • Sustained evidence is required for confirmation (e.g., 5 consecutive significant trials)
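
For reference, here is a minimal sketch of how the two plotted quantities can be computed from an accumulated 2x2 table. The counts below are made up, and the use of SciPy's chi2_contingency is an assumption about the test, not the simulator's own code.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Accumulated results: [conversions, non-conversions] per group (made-up counts)
control = np.array([24, 96])     # 120 participants, 20% converted
treatment = np.array([36, 84])   # 120 participants, 30% converted

# Left Y-axis: -log10(p-value) from a chi-square test on the 2x2 table
_, p_value, _, _ = chi2_contingency(np.vstack([control, treatment]))
neg_log_p = -np.log10(p_value)

# Right Y-axis: odds ratio (treatment odds over control odds)
odds_ratio = (treatment[0] / treatment[1]) / (control[0] / control[1])

print(f"p = {p_value:.4f}, -log10(p) = {neg_log_p:.2f}, odds ratio = {odds_ratio:.2f}")
```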


Real-World Examples

Example 1:

  • Control: 20% conversion ± 20% variance (16-24% range)
  • Treatment: 30% conversion ± 20% variance (24-36% range)
  • Result: May take 200-500 trials to detect significance

Example 2:

  • Control: 20% open rate ± 10% variance (18-22% range)
  • Treatment: 25% open rate ± 10% variance (22.5-27.5% range)
  • Result: May take 100-300 trials to detect significance

Example 3:

  • Control: 10% conversion ± 30% variance (7-13% range)
  • Treatment: 20% conversion ± 30% variance (14-26% range)
  • Result: May take 500-1000 trials to detect significance

How to Do This Properly on Your Own

Understanding Sequential A/B Testing

Key concepts for implementing sequential A/B tests:

  1. Statistical Tests: Chi-square test for comparing proportions
  2. P-value Calculation: Updated after each observation
  3. Doubt Adjustment: Stricter thresholds early to avoid false positives
  4. Monte Carlo Simulation: Run many trials to understand distribution

Implementation Steps

  1. Set Parameters: Control effect, treatment effect, variance for both
  2. Simulate Observations: Generate data with variance for each trial
  3. Accumulate Data: Keep running totals for both groups
  4. Calculate Statistics: P-value, odds ratio, rate ratio
  5. Check Significance: Compare to adjusted threshold
  6. Repeat: Run many simulations to get distribution
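
Putting the steps together, here is a hedged Python sketch of the whole loop. The noise model (multiplicative Gaussian noise on the true rate, clipped to [0, 1]), the threshold-relaxation schedule, the 5-consecutive-trials stopping rule, and the function names are illustrative assumptions rather than the simulator's actual implementation.

```python
import numpy as np
from scipy.stats import chi2_contingency

rng = np.random.default_rng(0)

def doubt_adjusted_alpha(trial, strict=0.01, standard=0.05, until=10, after=50):
    # Conservative early, relaxing linearly to the standard threshold (assumed schedule)
    if trial <= until:
        return strict
    if trial >= after:
        return standard
    return strict + (standard - strict) * (trial - until) / (after - until)

def simulate_once(p_a=0.20, p_b=0.30, sd_pct=0.20, max_trials=500):
    conv = np.zeros(2)   # conversions for [control, treatment]
    n = np.zeros(2)      # participants for [control, treatment]
    consecutive = 0
    for trial in range(1, max_trials + 1):
        # Step 2: noisy per-trial rates (assumed multiplicative Gaussian noise)
        rate_a = np.clip(p_a * (1 + rng.normal(0, sd_pct)), 0, 1)
        rate_b = np.clip(p_b * (1 + rng.normal(0, sd_pct)), 0, 1)
        # Step 3: accumulate one observation per group
        conv += [rng.random() < rate_a, rng.random() < rate_b]
        n += 1
        # Step 4: chi-square p-value on the accumulated 2x2 table
        table = np.array([[conv[0], n[0] - conv[0]],
                          [conv[1], n[1] - conv[1]]])
        p = 1.0 if table.min() < 5 else chi2_contingency(table)[1]
        # Step 5: compare to the doubt-adjusted threshold;
        # require sustained evidence (assumed: 5 consecutive significant trials)
        consecutive = consecutive + 1 if p < doubt_adjusted_alpha(trial) else 0
        if consecutive >= 5:
            return trial
    return max_trials   # never reached significance within the budget

# Step 6: Monte Carlo over many simulated experiments
results = [simulate_once() for _ in range(100)]
print("mean trials:", np.mean(results), "median trials:", np.median(results))
```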

Key Considerations

  • Variance Matters: Higher variance requires more trials
  • Early Stopping: Use doubt-adjusted thresholds to avoid false positives
  • Budget Planning: Use median/percentile of trials-to-significance for planning
  • Multiple Testing: Adjust for multiple comparisons if running many tests
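
Two of these considerations are easy to sketch. The trials-to-significance values below are made-up placeholders; in practice they would come from the Monte Carlo distribution above.

```python
import numpy as np

# Placeholder trials-to-significance values (use the Monte Carlo results in practice)
results = np.array([16, 45, 80, 128, 171, 220, 310, 500])

# Budget planning: plan for the median, budget for a conservative percentile
print("median trials:", np.median(results))
print("90th percentile:", np.percentile(results, 90))

# Multiple testing: simple Bonferroni correction when running m tests
m, alpha = 4, 0.05
print("per-test threshold with Bonferroni:", alpha / m)
```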

Has This Helped You?

If you found this A/B test simulator useful:

  • Share it with your team or on social media
  • Bookmark this page for future reference
  • Link back to this page when referencing sequential A/B testing

This simulator provides insights into:

  • How A/B tests actually progress in practice
  • The impact of variance on trial requirements
  • Sequential testing strategies
  • Budget planning for experiments

Starting Point

  1. Statistical Testing: Chi-square test for comparing two proportions
  2. Sequential Updates: Recalculate p-value after each observation
  3. Variance Modeling: Simulate observations with specified variance
  4. Doubt Adjustment: Implement stricter thresholds for early trials
  5. Monte Carlo: Run many simulations to understand distribution of outcomes
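
As a small illustration of the variance-modeling step (item 3), here is one way a noisy per-trial rate could be drawn. Multiplicative Gaussian noise is an assumption about how "standard deviation as % of effect" is applied, and the function name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

def noisy_rate(true_rate: float, sd_pct: float) -> float:
    """True rate perturbed by multiplicative Gaussian noise, clipped to [0, 1]."""
    return float(np.clip(true_rate * (1 + rng.normal(0, sd_pct)), 0.0, 1.0))

# e.g. 20% conversion with 20% variance lands roughly in the 16-24% band
print([round(noisy_rate(0.20, 0.20), 3) for _ in range(5)])
```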

You can reference the calculator component to understand:

  • How to simulate observations with variance
  • How to calculate p-values sequentially
  • How to implement doubt-adjusted thresholds
  • How to visualize trial progression and distributions

This approach provides a realistic view of how A/B tests actually work in practice, helping you understand the uncertainty and variance inherent in real-world experimentation.