Understanding Incrementality Testing: What It Is and Why It Matters for B2B SaaS Marketing

June 5, 2026 · 16 minute read

Your ad platform dashboard tells a compelling story. Conversions are up, return on ad spend looks strong, and every channel appears to be pulling its weight. But here is the question that should keep every B2B SaaS marketer up at night: how much of that revenue would have happened anyway, even if you had never run a single ad?

This is not a hypothetical concern. Ad platforms are built to report impressive numbers, and their attribution models are designed to claim credit for as many conversions as possible. The result is a systematic overstatement of advertising impact that leads growth teams to scale spend on channels that look productive in reports but are actually just intercepting demand that already existed.

Incrementality testing is the antidote to this problem. It is the discipline of measuring true causal lift: the actual increase in conversions that happened because of your advertising, above and beyond what would have occurred without it. Where standard attribution tells you which touchpoints were present before a conversion, incrementality testing tells you which touchpoints actually caused it. That distinction is everything when you are managing a paid budget and trying to prove real ROI.

This guide is written for growth leaders and marketing managers at B2B SaaS companies who are comfortable working with data but want a clear, practical framework for applying incrementality thinking to their measurement strategy. We will cover how the methodology works, where traditional attribution falls short, how to design tests that produce reliable results, and how to translate findings into smarter budget decisions.

The Difference Between Correlation and Causal Lift

Standard attribution models have a fundamental limitation that is easy to overlook when dashboards are full of green arrows. They observe which touchpoints were present in the customer journey before a conversion occurred, then assign credit to those touchpoints. What they cannot do is determine whether those touchpoints actually influenced the decision to convert.

Think about what this means in practice. A prospect at a B2B SaaS company has been researching your product category for weeks. They have read comparison articles, watched demo videos, and bookmarked your pricing page. They are, in other words, already deep in the buying process and moving toward a decision under their own momentum. Then they see a retargeting ad for your product on LinkedIn. Two days later, they sign up for a trial.

Your attribution model records this as a conversion influenced by that LinkedIn ad. The platform reports it as a win. But here is the honest question: would that prospect have converted anyway, even without seeing the retargeting ad? In many cases, the answer is yes. The ad did not create the intent. It simply appeared at the moment when intent was already high.

This is the core problem that incrementality testing is designed to solve. Incrementality in marketing is defined as the measurable lift in conversions that was directly caused by an ad exposure, above and beyond what would have happened organically without it. It is not about whether someone saw your ad before converting. It is about whether the ad changed the outcome.

The distinction matters enormously at the budget level. If a retargeting campaign is claiming credit for conversions that would have happened through organic search or direct traffic, scaling that campaign does not generate more revenue. It generates more spend against an audience that was already converting. The appearance of performance is real. The incremental value is not.

This is why the concept of a counterfactual is so central to incrementality thinking. A counterfactual asks: what would have happened in a world where this ad never ran? That question cannot be answered by looking at attribution data. It requires an experiment, specifically one where a comparable group of people does not see the ad, so you can observe what happens to them as a baseline.

Correlation tells you that conversions happened after ad exposure. Causal lift tells you that conversions happened because of ad exposure. Building your budget strategy on correlation alone is one of the most common and costly mistakes in paid marketing.

How Incrementality Testing Actually Works

The methodology behind incrementality testing is borrowed from experimental science, specifically the randomized controlled trial framework. The logic is straightforward: if you want to know whether an ad causes conversions, you need to compare what happens to people who see the ad against what happens to comparable people who do not.

The standard approach is called a test-and-holdout design. Your target audience is split into two groups. The test group receives normal ad exposure. The holdout group, typically somewhere between 10 and 20 percent of the total audience, is withheld from seeing the ads entirely. Both groups are tracked over the same time period, and their conversion rates are compared at the end of the test.

The incremental lift is calculated using a simple formula: take the conversion rate of the test group, subtract the conversion rate of the holdout group, then divide that difference by the conversion rate of the test group. The result is expressed as a percentage and represents the share of test group conversions that were genuinely caused by the advertising. If the test group converts at four percent and the holdout group converts at three percent, the incremental lift is (4 - 3) / 4, or 25 percent. That means one in four conversions in the test group was driven by the ad. The other three would have happened anyway. Some teams divide by the holdout rate instead, which expresses lift relative to the baseline (about 33 percent in this example), so confirm which definition a report uses before comparing numbers.
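
The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation; the function names are made up for the example, and the second function shows the alternative convention of dividing by the holdout rate.

```python
def incremental_share(test_rate: float, holdout_rate: float) -> float:
    """Share of test-group conversions caused by ads: (test - holdout) / test."""
    if test_rate <= 0:
        raise ValueError("test_rate must be positive")
    return (test_rate - holdout_rate) / test_rate


def relative_lift(test_rate: float, holdout_rate: float) -> float:
    """Lift relative to the holdout baseline: (test - holdout) / holdout."""
    if holdout_rate <= 0:
        raise ValueError("holdout_rate must be positive")
    return (test_rate - holdout_rate) / holdout_rate


# The example from the text: 4 percent test rate, 3 percent holdout rate.
share = incremental_share(0.04, 0.03)
lift = relative_lift(0.04, 0.03)
print(f"Incremental share of test conversions: {share:.0%}")  # 25%
print(f"Lift over the holdout baseline: {lift:.0%}")          # 33%
```

Both numbers describe the same test. The first answers "what fraction of the conversions we saw were caused by ads," while the second answers "how much did ads raise conversions above the baseline."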

There are two primary ways to structure the holdout, and the right choice depends on your campaign type and the channels you are testing.

Geo-based holdout tests work by turning off advertising in specific geographic regions while keeping it running in comparable regions. You then compare conversion rates between the two geographies over the test period. This approach is well-suited for brand awareness campaigns and for channels where user-level holdouts are technically difficult to implement. The trade-off is that geographic regions are never perfectly comparable, so there is always some noise in the results.
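
One common way to reduce that noise is to compare each group of regions against its own pre-test baseline rather than against each other directly. The sketch below uses a difference-in-differences calculation, which is a technique swapped in here for illustration and not something the comparison described above requires. The function name and all figures are hypothetical.

```python
def diff_in_diff(ads_on_pre: float, ads_on_test: float,
                 holdout_pre: float, holdout_test: float) -> float:
    """Estimate ad-driven lift in conversion rate, in absolute points.

    "pre" is the period before the test, when ads ran in every region.
    "test" is the period when ads were switched off in the holdout regions.
    """
    ads_on_change = ads_on_test - ads_on_pre
    holdout_change = holdout_test - holdout_pre
    # Any shared trend (seasonality, market shifts) cancels out in the subtraction.
    return ads_on_change - holdout_change


# Illustrative numbers only, not real results.
lift_points = diff_in_diff(
    ads_on_pre=0.040, ads_on_test=0.042,
    holdout_pre=0.038, holdout_test=0.031,
)
print(f"Estimated lift: {lift_points * 100:.2f} percentage points")
```

This only helps if the two sets of regions were trending similarly before the test, so it is worth checking the pre-period data before relying on the estimate.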

User-level holdout tests randomly assign individual users to either the test group or the holdout (control) group within an ad platform. Both Meta and Google support this natively through their own experimentation tools. This approach is more precise because randomization at the user level creates cleaner comparison groups. For B2B SaaS companies running targeted demand generation campaigns on these platforms, user-level holdouts are generally the more reliable choice when volume allows.
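
Ad platforms handle this assignment internally, but the same idea applies when you control the audience list yourself, for example a CRM segment uploaded as a custom audience. The sketch below is one possible approach under that assumption: it hashes a user identifier so that assignment is effectively random across users yet stable for any single user. The function name, the test name, and the 15 percent default are all hypothetical.

```python
import hashlib


def assign_group(user_id: str, test_name: str, holdout_pct: float = 0.15) -> str:
    """Deterministically assign a user to 'holdout' or 'test'."""
    if not 0 < holdout_pct < 1:
        raise ValueError("holdout_pct must be between 0 and 1")
    # Salting with the test name keeps assignments independent across tests.
    digest = hashlib.sha256(f"{test_name}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # uniform value in [0, 1)
    return "holdout" if bucket < holdout_pct else "test"


# Hypothetical user IDs; in practice these would come from your CRM.
groups = [assign_group(f"user-{i}", "q3-linkedin-holdout") for i in range(10_000)]
holdout_fraction = groups.count("holdout") / len(groups)
print(f"Holdout share: {holdout_fraction:.1%}")
```

Because the assignment depends only on the identifier and the test name, the same user always lands in the same group, which prevents holdout users from drifting into the exposed audience midway through the test.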

One important nuance for B2B SaaS specifically: the conversion event you measure matters. If your test only tracks form submissions or trial sign-ups, you are measuring early funnel lift, not revenue impact. Given that B2B sales cycles often run 30 to 90 days or longer, a holdout test needs to run long enough to capture downstream pipeline and closed-won outcomes. Measuring only immediate conversions will give you an incomplete and often misleading picture of true incremental value.

Where Traditional Attribution Models Fall Short

Attribution models are observational tools. They look at the sequence of touchpoints in a customer journey and distribute credit across those touchpoints according to a set of rules. The problem is that no attribution model, regardless of how sophisticated it is, can answer the causal question at the heart of incrementality.

Last-click attribution is the simplest example of this limitation. It assigns all conversion credit to the final touchpoint before a conversion, which is often a branded search or a direct visit. This systematically over-credits channels that capture existing intent while ignoring the earlier touchpoints that may have actually shaped that intent. A prospect who saw a LinkedIn ad three weeks ago, engaged with a retargeting campaign, and then searched your brand name directly will show up in last-click reports as a direct or branded search conversion. The upstream channels that built awareness and consideration get nothing.

First-touch attribution has the opposite bias. It credits the first interaction, which can over-value top-of-funnel channels while ignoring the nurturing work that happened later in the journey.

Multi-touch attribution is a genuine improvement. By distributing credit across multiple touchpoints in the customer journey, it provides a more complete picture of which channels are participating in conversions. But even multi-touch attribution cannot solve the fundamental problem: it still cannot distinguish between a conversion that was influenced by an ad and a conversion that was going to happen regardless.

Consider a prospect who was already evaluating your product, had already read your documentation, and had already been in contact with your sales team. They also happened to see a display ad during this period. Multi-touch attribution will assign some credit to that display ad. But did the display ad actually change the outcome? Incrementality testing is the only method that can answer that question with any rigor.

This gap between attributed performance and true incremental value has direct consequences for budget allocation. When teams scale spend based on attribution reports alone, they often end up investing more in channels that are capturing organic demand rather than creating new demand. Branded search and retargeting are the two most common examples. Both tend to show strong attributed performance precisely because they reach high-intent audiences who are already close to converting. But the incremental contribution of those channels is frequently much lower than the attribution numbers suggest.

The result is a budget that looks efficient on paper but is actually misallocated in ways that are difficult to detect without experimental measurement. Incrementality testing helps break that cycle by introducing a causal lens that attribution models simply cannot provide on their own.

Designing an Incrementality Test That Produces Reliable Results

Running an incrementality test is straightforward in concept but requires careful design to produce results you can actually act on. A poorly designed test can generate misleading lift estimates that are worse than having no data at all.

The most important design requirement is statistical significance. Both groups, and the holdout in particular because it is the smaller of the two, need to be large enough that the difference in conversion rates between the test and holdout groups is unlikely to be the result of random variation. For B2B SaaS companies, this is often the hardest requirement to meet because conversion volumes are inherently lower than in ecommerce. If your campaign reaches a relatively small audience, you may need to run the test longer or use a larger holdout percentage to accumulate enough conversion events for the results to be meaningful.
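
One standard way to check whether an observed difference is likely to be real is a two-proportion z-test. The sketch below uses only the Python standard library and assumes a one-sided test (ads can only help, not hurt). The function name and the counts are illustrative, not taken from any real campaign.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_test: int, n_test: int,
                          conv_holdout: int, n_holdout: int) -> tuple[float, float]:
    """Return (z, one-sided p-value) for test rate > holdout rate."""
    p_test = conv_test / n_test
    p_holdout = conv_holdout / n_holdout
    # Pooled rate under the null hypothesis that ads made no difference.
    pooled = (conv_test + conv_holdout) / (n_test + n_holdout)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_test + 1 / n_holdout))
    if std_err == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_test - p_holdout) / std_err
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# Illustrative counts: 4% of 8,500 test users vs. 3% of 1,500 holdout users.
z, p_value = two_proportion_z_test(340, 8_500, 45, 1_500)
print(f"z = {z:.2f}, one-sided p = {p_value:.3f}")
```

A small p-value suggests the gap between the groups is unlikely to be noise. With the low conversion counts typical of B2B campaigns, the same one-point gap can easily fail this check, which is why group size matters so much.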

Test duration is the second critical variable, and it is where B2B SaaS teams most often go wrong. If your average sales cycle is 60 days, a two-week holdout test will only capture early funnel conversions like trial sign-ups or demo requests. It will miss the pipeline and revenue outcomes that represent actual business impact. Your test duration needs to be long enough to observe conversions all the way through the stages of the funnel that matter to your business, which often means running tests for 60 to 90 days or longer.

Clean audience segmentation is equally important. The test and holdout groups need to be genuinely comparable. If there are systematic differences between the two groups, such as one group skewing toward a particular industry or company size, the lift estimate will be distorted. Randomization is the standard way to achieve this, and it is one reason why user-level holdout tests within ad platforms tend to be more reliable than manually constructed audience splits.

There are several design mistakes that consistently undermine test validity. Running a test during a product launch, a major promotional period, or a seasonally anomalous time will contaminate the results with factors that have nothing to do with ad exposure. If your conversion rates are elevated for external reasons during the test period, the lift calculation will be misleading.

Holdout groups that are too small to reach statistical significance are another common failure mode. A holdout group of a few hundred users in a low-volume B2B campaign will rarely generate enough conversion events to distinguish real lift from noise. It is better to run fewer tests with properly sized holdouts than to run many underpowered tests that produce inconclusive results.
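
To see why a few hundred users is rarely enough, it helps to estimate the required group sizes before launching. The sketch below uses the usual normal-approximation formula for comparing two proportions with unequal group sizes. The function name, the default 15 percent holdout share, the one-sided 5 percent significance level, and the 80 percent power target are all assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def required_group_sizes(baseline_rate: float, expected_test_rate: float,
                         holdout_share: float = 0.15,
                         alpha: float = 0.05, power: float = 0.80) -> tuple[int, int]:
    """Return approximate (holdout_size, test_size) needed to detect the gap."""
    if not 0 < holdout_share < 1:
        raise ValueError("holdout_share must be between 0 and 1")
    if expected_test_rate <= baseline_rate:
        raise ValueError("expected_test_rate must exceed baseline_rate")
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-sided significance threshold
    z_beta = NormalDist().inv_cdf(power)
    ratio = (1 - holdout_share) / holdout_share  # test users per holdout user
    variance_term = (expected_test_rate * (1 - expected_test_rate) / ratio
                     + baseline_rate * (1 - baseline_rate))
    delta = expected_test_rate - baseline_rate
    n_holdout = ceil((z_alpha + z_beta) ** 2 * variance_term / delta ** 2)
    n_test = ceil(n_holdout * ratio)
    return n_holdout, n_test


# Detecting a lift from a 3% baseline to a 4% test rate.
n_holdout, n_test = required_group_sizes(0.03, 0.04)
print(f"Holdout: {n_holdout:,} users, test: {n_test:,} users")
```

Under these assumptions, detecting a one-point lift from a three percent baseline calls for a holdout on the order of two thousand users. Smaller expected lifts push the requirement up quickly, because the required size scales with the inverse square of the gap.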

Finally, isolating a single variable is essential. If you change your ad creative, your targeting, or your bidding strategy during the test period, you lose the ability to attribute any observed lift to the original variable you were testing. Incrementality tests require discipline: keep everything constant except for the presence or absence of ad exposure in the holdout group.

Applying Incrementality Insights to Budget and Channel Decisions

The real value of incrementality testing is not the lift number itself. It is what that number tells you about where your budget is actually working and where it is not.

When an incrementality test reveals that a channel has high attributed conversions but low incremental lift, it is telling you something specific: that channel is reaching people who were already going to convert. It is capturing existing demand rather than creating new demand. This is common with retargeting campaigns and branded search, where the audience is inherently high-intent. These channels may still have a role in your strategy, but scaling spend on them will not proportionally increase revenue. It will primarily increase cost against an audience that was already moving toward conversion on its own.
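
The gap between attributed and incremental performance is easiest to see as a cost per acquisition. The sketch below compares the two for a single channel. The function name, spend figure, and conversion counts are hypothetical and chosen only to illustrate the calculation.

```python
from typing import Optional


def incremental_cpa(spend: float, conv_test: int, n_test: int,
                    conv_holdout: int, n_holdout: int) -> Optional[float]:
    """Cost per conversion that the ads actually caused, or None if no lift."""
    # Conversions the test group would have produced with no ads at all.
    expected_baseline = (conv_holdout / n_holdout) * n_test
    incremental_conversions = conv_test - expected_baseline
    if incremental_conversions <= 0:
        return None
    return spend / incremental_conversions


# Illustrative retargeting campaign, not real data.
spend = 50_000.0
attributed_cpa = spend / 340
true_cpa = incremental_cpa(spend, conv_test=340, n_test=8_500,
                           conv_holdout=45, n_holdout=1_500)
print(f"Attributed CPA: ${attributed_cpa:,.0f}")
print(f"Incremental CPA: ${true_cpa:,.0f}")
```

In this example the channel looks roughly four times cheaper in attribution reports than it is once baseline conversions are removed, which is the pattern to watch for on high-intent audiences.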

Channels with high incremental lift but lower attributed conversion numbers tell the opposite story. These are often top-of-funnel or brand awareness campaigns that touch prospects early in the buying journey. Because they operate far from the conversion event, standard attribution models consistently under-credit them. Incrementality testing can surface the genuine demand creation work these channels are doing, which justifies continued or increased investment even when the attribution numbers look modest.

This is particularly relevant for B2B SaaS companies investing in brand awareness campaigns. Content syndication, podcast sponsorships, and upper-funnel display campaigns rarely show up as direct conversion drivers in attribution reports. But incrementality tests that use geo-based holdouts can reveal whether markets exposed to brand campaigns convert at meaningfully higher rates over time, providing the evidence needed to defend and scale those investments.

The most powerful application of incrementality data is combining it with pipeline and revenue attribution. Knowing that a channel drives incremental trial sign-ups is useful. Knowing that those incremental sign-ups convert to paid customers at a strong rate and generate significant pipeline value is what actually informs confident budget decisions. When you can connect incremental lift at the top of the funnel to closed-won revenue at the bottom, you have a measurement framework that goes far beyond what any single attribution model can provide.

This is where the quality of your underlying data becomes critical. Incrementality tests are only as reliable as the conversion data feeding them. If touchpoints are missing, conversion events are duplicated, or attribution is being credited to the wrong sessions, the lift calculation will be distorted and the budget decisions you make based on it will be wrong.

Building Incrementality Into Your Measurement Framework

Incrementality testing is not a replacement for attribution. It is a complement to it. Attribution models give you a continuous, operational view of which channels and campaigns are participating in conversions. Incrementality tests give you periodic, experimental validation of which channels are actually causing those conversions. Together, they form a measurement framework that is far more reliable than either approach alone.

The practical implication is that you do not need to run incrementality tests on every campaign all the time. A reasonable approach is to run tests on your highest-spend channels periodically, use the results to calibrate your attribution model's credit assignments, and then use attribution data for day-to-day optimization decisions. When attribution and incrementality results align, you can have high confidence in your channel mix. When they diverge significantly, that is a signal to investigate and reallocate.
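
One simple way to carry test results into day-to-day reporting is a per-channel calibration factor: the ratio of incremental conversions measured in the test to the conversions attribution credited over the same period. The sketch below is one possible approach, not an established standard, and every channel name and figure in it is hypothetical.

```python
def calibration_factor(incremental_conversions: float,
                       attributed_conversions: float) -> float:
    """Ratio used to scale attributed conversions toward incremental ones."""
    if attributed_conversions <= 0:
        raise ValueError("attributed_conversions must be positive")
    # A negative measured lift is treated as zero incremental value.
    return max(incremental_conversions, 0.0) / attributed_conversions


# Hypothetical results from the most recent tests: (incremental, attributed).
test_results = {
    "branded_search": (60, 300),
    "retargeting": (85, 340),
    "linkedin_prospecting": (120, 100),
}
factors = {channel: calibration_factor(inc, attr)
           for channel, (inc, attr) in test_results.items()}

# Apply the factors to this week's attributed conversions.
weekly_attributed = {"branded_search": 40, "retargeting": 50, "linkedin_prospecting": 12}
calibrated = {channel: weekly_attributed[channel] * factors[channel]
              for channel in weekly_attributed}

for channel, value in calibrated.items():
    print(f"{channel}: {weekly_attributed[channel]} attributed -> {value:.1f} calibrated")
```

A factor above one, as with the prospecting channel here, indicates that attribution is under-crediting the channel. Factors drift as audiences and spend levels change, so they should be refreshed whenever a new test completes.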

One prerequisite that cannot be overstated is data quality. Incrementality tests depend on accurate, complete conversion tracking. If your tracking has gaps, such as missing touchpoints because of browser privacy changes, ad blockers, or cross-device journeys that are not stitched together properly, conversions in both groups will be undercounted. When those gaps hit the holdout group harder, for example because its conversions have no ad click to anchor them, the holdout conversion rate will be understated and your lift calculation will be inflated. Server-side tracking and first-party data enrichment are not optional components of a serious measurement strategy. They are foundational requirements.

This is where a platform like Cometly directly supports incrementality measurement. Cometly captures every touchpoint across the customer journey, from the first ad click through CRM events and pipeline stages, and connects that data to real revenue outcomes. By tracking conversions server-side and enriching events with first-party data, it closes the tracking gaps that distort holdout test results. When you run an incrementality test, you need confidence that your conversion data is complete and accurate. That confidence starts with the infrastructure you use to collect and organize that data in the first place.

Combining Cometly's pipeline and revenue attribution with incrementality test results gives growth teams the complete picture: which channels are being credited in the attribution model, which channels are genuinely driving new demand, and how both map to actual closed-won revenue. That combination is what makes it possible to make budget decisions with real confidence rather than relying on numbers that ad platforms have every incentive to make look as favorable as possible.

The Bottom Line on Incrementality

Incrementality testing is not an academic exercise reserved for data science teams at large enterprises. It is a practical, necessary discipline for any B2B SaaS marketing team that is serious about protecting budget and scaling what actually works.

Ad platforms will always report the most favorable version of their performance. Attribution models will always credit touchpoints that were present, regardless of whether they caused anything. Incrementality testing is how you cut through that noise and establish what your advertising is actually doing for your business.

The best place to start is simple: pick your highest-spend channel, design a clean holdout test with an appropriately sized control group, run it long enough to capture your full sales cycle, and measure lift against pipeline outcomes, not just early funnel conversions. The results will almost certainly surprise you, and they will give you a much clearer picture of where your budget is earning its keep.

From there, you can build toward a more comprehensive measurement framework that combines ongoing attribution with periodic incrementality validation. The teams that do this consistently are the ones that make better budget decisions, defend their spend more confidently, and scale their marketing programs on a foundation of evidence rather than optimism.

Ready to build the data foundation that makes incrementality testing reliable? Get your free demo and discover how Cometly connects every ad touchpoint to real pipeline and revenue, giving your team the complete, accurate conversion data you need to run meaningful tests and make confident decisions.
