Switchback Test — Glossary

Also known as: Switchback Testing, Switchback Experiment, Time-Split Test, Temporal Holdout

A switchback test measures incrementality by alternating a treatment on and off across consecutive time windows for the same population, then reading the on-vs-off difference in the outcome metric as the incremental effect.

A brand might run its branded-search campaign for 48 hours, pause it for 48 hours, turn it back on for 48 hours, and repeat the cycle for six weeks. Aggregate revenue during on-windows minus aggregate revenue during off-windows, normalized for window count, is the measured lift. The same population is its own control across time.
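The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes "normalized for window count" means comparing mean revenue per window, and the function name `switchback_lift` and the revenue figures are hypothetical.

```python
from statistics import mean


def switchback_lift(windows):
    """Compute on-vs-off lift from a list of (is_on, revenue) tuples.

    Each tuple is one time window. Averaging per window is the assumed
    reading of "normalized for window count".
    """
    on = [revenue for is_on, revenue in windows if is_on]
    off = [revenue for is_on, revenue in windows if not is_on]
    if not on or not off:
        raise ValueError("need at least one on-window and one off-window")

    on_mean = mean(on)
    off_mean = mean(off)
    return {
        "on_mean": on_mean,
        "off_mean": off_mean,
        # Absolute lift in revenue per window.
        "lift_per_window": on_mean - off_mean,
        # Lift relative to the off-window baseline.
        "relative_lift": (on_mean - off_mean) / off_mean,
    }


# Hypothetical revenue per 48-hour window, alternating on and off.
example = [(True, 1200.0), (False, 1000.0), (True, 1300.0), (False, 1100.0)]
result = switchback_lift(example)
print(result["lift_per_window"])
```

A real read would also need an uncertainty estimate across cycles; the sketch only shows the point estimate.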

Among incrementality designs, this is the temporal-holdout flavor. A holdout test splits by user inside the channel — the audience version of the same idea. Geo-lift splits by region. Switchback splits by time, which is what’s available when the other two aren’t.

Two conditions push operators toward it. The first is no usable geo split: national-only channels (broad-reach upper-funnel, retention email blasts, app push) don’t carry a geo dimension, and brands too small to support clean geo holdouts can’t reach a meaningful minimum detectable effect (MDE) on a region-pair comparison anyway. The second is no platform-side user holdout: the channel doesn’t expose a conversion lift product, or it does but excludes the inventory the brand actually buys. Branded paid search, retention email, transactional SMS, and app push are the channels where switchback most often shows up as the only available design.

The central design risk is carryover. If the treatment effect persists into the next off-window, the off-window outcome is contaminated upward, the measured difference shrinks, and the test biases toward zero. Retargeting is the canonical failure case: users exposed during an on-window keep converting through the next off-window from the same purchase intent, so the off-window looks closer to the on-window than it should. Treatments whose effect half-life is short relative to the window (a one-shot email blast, a 24-hour push campaign, a branded-search budget that only buys what the user typed today) hold up better, because the lift dissipates inside the window it was generated in.
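A small deterministic simulation makes the bias toward zero concrete. The model is an assumption for illustration only: the effect is at full strength for every on-hour and decays geometrically with a fixed half-life once the treatment is switched off. The function name and all numbers are hypothetical.

```python
def simulate_switchback(n_windows, window_hours, baseline, lift, half_life_hours):
    """Return the measured per-hour lift under a simple carryover model.

    Assumed model: hourly revenue = baseline + effect. While the treatment
    is on, effect equals `lift`; once it is off, effect halves every
    `half_life_hours`. A half-life of 0 means no carryover at all.
    """
    decay = 0.5 ** (1.0 / half_life_hours) if half_life_hours > 0 else 0.0
    effect = 0.0
    on_total = 0.0
    off_total = 0.0
    on_hours = 0
    off_hours = 0

    for w in range(n_windows):
        is_on = (w % 2 == 0)  # windows alternate on, off, on, off, ...
        for _ in range(window_hours):
            if is_on:
                effect = lift      # full effect while the treatment runs
            else:
                effect *= decay    # residual effect leaks into the off-window
            revenue = baseline + effect
            if is_on:
                on_total += revenue
                on_hours += 1
            else:
                off_total += revenue
                off_hours += 1

    return on_total / on_hours - off_total / off_hours


# True lift is 10 per hour in both runs; only the carryover differs.
no_carry = simulate_switchback(20, 48, baseline=100.0, lift=10.0, half_life_hours=0)
with_carry = simulate_switchback(20, 48, baseline=100.0, lift=10.0, half_life_hours=12)
print(no_carry, with_carry)
```

With no carryover the measured lift equals the true lift. With a 12-hour half-life the off-windows absorb residual effect, so the measured lift comes out smaller than the true value.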

Window length should exceed the plausible treatment half-life but stay short enough to repeat enough cycles for power. Common operator choices are windows of 24–72 hours repeated for 4–8 weeks, but those are conventions, not laws: the right cadence is the one that clears the carryover half-life and still yields enough cycles to detect the lift the brand cares about. Switchback is the small-brand-friendly cousin of geo-lift: same incremental logic, weaker design (one population alternated rather than two compared), available when the alternative is no incrementality read at all.
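The trade-off between window length and cycle count can be tabulated with a short helper. This is a rough planning sketch, not a power calculation: it counts the windows a schedule yields and, assuming exponential decay, the share of the effect still present one full window after switch-off. The helper name `plan_cadence` and the 12-hour half-life are hypothetical.

```python
def plan_cadence(window_hours, total_weeks, half_life_hours):
    """Summarize a switchback schedule.

    Returns the number of windows, the number of complete on/off pairs,
    and the fraction of the treatment effect remaining one full window
    after switch-off (assuming exponential decay with the given half-life).
    """
    total_hours = total_weeks * 7 * 24
    n_windows = total_hours // window_hours
    n_pairs = n_windows // 2
    residual = 0.5 ** (window_hours / half_life_hours)
    return {
        "n_windows": n_windows,
        "n_pairs": n_pairs,
        "residual_after_one_window": residual,
    }


# 48-hour windows over six weeks, with an assumed 12-hour effect half-life.
plan = plan_cadence(window_hours=48, total_weeks=6, half_life_hours=12)
print(plan)
```

Shortening the window raises the pair count but also raises the residual, which is the tension the cadence choice has to resolve.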
