A holdout test is a randomized experiment that withholds a marketing treatment (an ad, email, SMS, push) from a defined share of the audience, the holdout group, while a comparable test group receives it. Assignment must be randomized at the user level before the send; post-hoc “lookalike” holdouts reintroduce the selection bias randomization is meant to eliminate. Lift is the effect relative to the untreated baseline:
Lift = (test-group conversion rate − holdout conversion rate) ÷ holdout conversion rate
Some decks divide by the treated group's rate instead; always state which denominator a reported number uses.
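The effect of the denominator choice can be sketched in a few lines of Python. This is an illustration only: the function name `lift_both_ways` and the counts in the example are hypothetical, not figures from any real campaign.

```python
def lift_both_ways(test_conversions, test_users, holdout_conversions, holdout_users):
    """Return lift computed on both denominators for the same experiment."""
    test_rate = test_conversions / test_users
    holdout_rate = holdout_conversions / holdout_users
    if holdout_rate == 0 or test_rate == 0:
        raise ValueError("lift is undefined when a conversion rate is zero")
    incremental_rate = test_rate - holdout_rate
    return {
        # Effect relative to the untreated baseline (the definition used here).
        "vs_holdout": incremental_rate / holdout_rate,
        # The alternative some decks report: share of treated conversions
        # that are incremental.
        "vs_treated": incremental_rate / test_rate,
    }


# Hypothetical counts: 3.0% conversion in test, 2.5% in holdout.
result = lift_both_ways(
    test_conversions=1_200, test_users=40_000,
    holdout_conversions=250, holdout_users=10_000,
)
print(f"lift vs holdout: {result['vs_holdout']:.1%}")
print(f"lift vs treated: {result['vs_treated']:.1%}")
```

With these made-up counts the same experiment reads as roughly 20% on the holdout denominator and roughly 17% on the treated one, which is why the denominator needs to be stated.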
Holdouts are the credible read on channels where last-click attribution overstates impact: branded search, retargeting, email, SMS, push. These touchpoints disproportionately reach users who were already going to convert, so platform-reported revenue measures proximity to the conversion, not cause. The same design run inside an ad platform is called a conversion lift study.
The honest read is often uncomfortable. A high-frequency email flow reporting strong platform-attributed revenue-per-recipient may show meaningfully lower incremental revenue on a holdout — the gap depends on the flow, the audience, and existing organic demand. The only honest number is the one a brand’s own holdout produces.
The cost of a holdout has two meanings. Revenue held out is the revenue the channel would have been credited with on the sends the brand chose to withhold. Revenue actually lost is only the incremental portion of that figure; on low-incrementality channels, it is a small fraction of the held-out amount.
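The gap between the two figures is simple arithmetic, sketched below. The function name `holdout_cost`, the attributed revenue per user, and the 15% incrementality are all assumptions chosen for illustration; a brand's own holdout is what supplies the real incrementality figure.

```python
def holdout_cost(held_out_users, attributed_revenue_per_user, incrementality):
    """Split the cost of a holdout into its two meanings.

    incrementality is the assumed share of attributed revenue that is
    truly caused by the send (0.0 to 1.0).
    """
    if not 0.0 <= incrementality <= 1.0:
        raise ValueError("incrementality must be between 0 and 1")
    # What the channel would have been credited with on the withheld sends.
    held_out_revenue = held_out_users * attributed_revenue_per_user
    # What the brand actually gave up: only the incremental portion.
    lost_revenue = held_out_revenue * incrementality
    return held_out_revenue, lost_revenue


# Hypothetical inputs: 10,000 held-out users, $2.00 attributed per user,
# 15% of attributed revenue assumed to be incremental.
held_out, lost = holdout_cost(10_000, 2.00, 0.15)
print(f"revenue held out:      ${held_out:,.0f}")
print(f"revenue actually lost: ${lost:,.0f}")
```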
Holdout size should be chosen from the expected effect size and the sample needed to detect it with adequate power, not from a universal percentage. Owned-channel programs commonly run small recurring holdouts in the single-digit to low-double-digit percentage range, but that is common practice, not a target. For paid media, where user-level randomization often isn't practical, brands pair holdouts with geo-lift tests or marketing-mix modeling against MER (marketing efficiency ratio).
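One way to turn an expected effect size into a holdout share is a standard two-proportion power calculation. The sketch below uses the normal approximation for a two-sided z-test with unequal group sizes; the function names, the 5% significance level, the 80% power, and the example rates are assumptions for illustration, and a real program may prefer a dedicated statistics package or a different test.

```python
from math import ceil
from statistics import NormalDist


def required_audience(baseline_rate, relative_lift, holdout_share,
                      alpha=0.05, power=0.80):
    """Total users needed to detect the lift at the given holdout share."""
    p0 = baseline_rate                        # holdout (untreated) rate
    p1 = baseline_rate * (1 + relative_lift)  # expected test-group rate
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    # Variance of the rate difference, per user of total audience.
    variance_term = (p1 * (1 - p1) / (1 - holdout_share)
                     + p0 * (1 - p0) / holdout_share)
    return ceil(z ** 2 * variance_term / (p1 - p0) ** 2)


def smallest_holdout_share(audience, baseline_rate, relative_lift,
                           alpha=0.05, power=0.80):
    """Scan 1%..50% and return the smallest share that reaches the power target.

    Returns None if even a 50/50 split is underpowered for this audience.
    """
    for pct in range(1, 51):
        share = pct / 100
        needed = required_audience(baseline_rate, relative_lift, share,
                                   alpha, power)
        if needed <= audience:
            return share
    return None


# Hypothetical program: 100,000 users, 2.5% baseline conversion,
# hoping to detect a 20% relative lift.
share = smallest_holdout_share(100_000, 0.025, 0.20)
print("smallest adequate holdout share:", share)
```

The smaller the expected lift or the audience, the larger the share the scan returns, which is the reason a fixed percentage is a poor default.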