I Don't See the Variant After 100% Rollout: A Debugging Checklist
When a stakeholder says they don't see the variant after a 100% rollout, the cause is almost never the deploy. This is the nine-item debugging checklist most experimentation teams reinvent painfully, anchored on dataLayer and audience-condition checks, along with the recon-team operating model that produces this bug class.
By Atticus Li – Senior experimentation strategist with 200+ A/B tests across enterprise CRO programs in energy, SaaS, and e-commerce. Creator of the PRISM Method. Learn more at atticusli.com.
When a stakeholder says they don’t see the variant after a one-hundred-percent rollout, the cause is almost never the deploy. In our experience it is one of nine things, and the first two things to check are the reporter’s browser session and the dataLayer.
This post is the diagnostic protocol our team uses when a teammate, a QA engineer, or an executive says some version of “I’m on the production page and I’m still seeing the old experience.” If you run experiments in a feature-flag or split-testing platform that lives outside your core codebase – which most experimentation programs do, deliberately – you will get this question every quarter. The first time you run the diagnostic correctly, you save thirty minutes. The hundredth time, you save your team’s credibility.
Before any deploy is questioned, run the audience-condition check. Almost every “I don’t see the variant” report resolves at this layer.
Why This Class of Bug Exists in the First Place
Most experimentation teams operate as a recon unit relative to the core dev team. They ship in days what would be deprioritized for months or years inside the core codebase. They use a dedicated experimentation platform that lives in the page as a script tag, evaluates audience conditions in the browser, and swaps DOM or content based on assignment. This is a deliberate operating model. It allows a small team to move at experiment-velocity while the larger engineering organization handles foundation work.
The trade-off is that winning experiments often live in the experimentation platform indefinitely, waiting for the core dev team’s bandwidth to integrate the variant into the codebase. When that integration happens, the variant moves from “served by experimentation tool” to “served by application code.” Until then, the variant is conditional. It is gated by audience definition, browser state, cache, ad-blocker presence, and the user’s session history.
Each of those conditions is a place the variant can fail to show up for an individual user, even when the rollout is at one hundred percent.
Knowing where to look starts with knowing why the bug class exists. The platform is doing its job. The audience condition is doing its job. The bug, almost every time, is that the user reporting the issue does not match the audience condition for a reason that is not visible from the surface.
The Step-by-Step Debugging Protocol
Step 1: Validate the rollout in incognito
Before any code is investigated, before any deploy is checked, ask the reporter to load the page in incognito or private browsing.
Incognito strips cookies, local storage, session storage, and most browser extensions. It is the gold standard for confirming whether the variant is actually being served. If the variant appears in incognito, the rollout is fine and the issue is in the reporter’s session. If the variant does not appear in incognito, you have a real problem, and the next steps narrow it.
Do this before you do anything else. It will resolve more than half of all reports.
Step 2: Check the dataLayer for the audience key
Open the browser console. Inspect the page-state object that your audience condition is evaluated against – typically a dataLayer-style object exposed on the window. Read the value of the key your test is targeting (a pageName, route, pageType, userType, or whatever your platform uses).
Compare what the dataLayer reports to the audience condition the test was scoped to. If the values do not match, the audience condition is correctly excluding the user.
This is the highest-yield console check. The reporter will often have a session in a state that doesn’t match the test audience, and the test is correctly evaluating “no, this is not a prospect-only homepage view, this is a logged-in homepage view.” The platform is working. The user is in the wrong audience.
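Here is a minimal console sketch of that comparison, assuming a GTM-style window.dataLayer array of pushed objects and a test scoped to something like pageType = "homepage" and userType = "prospect"; the key names and expected values are placeholders for whatever your own audience condition targets.

```ts
// Hypothetical audience condition; substitute the keys and values your test targets.
const expected: Record<string, string> = { pageType: "homepage", userType: "prospect" };

// GTM-style dataLayer: an array of pushed objects. The cast is only there for
// TypeScript; drop it if you paste this straight into the console.
const pushes: Record<string, unknown>[] = (window as any).dataLayer ?? [];

// Collapse the pushes so the most recent value for each key wins.
const state: Record<string, unknown> = Object.assign({}, ...pushes);

for (const [key, want] of Object.entries(expected)) {
  const got = state[key];
  console.log(
    got === want ? "MATCH" : "MISMATCH",
    `${key}: audience expects "${want}", dataLayer reports "${got}"`
  );
}
```

If anything prints MISMATCH, the platform is excluding the reporter correctly, and the next question is why the dataLayer is in that state, which is Step 3.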
Step 3: Check for prior session state
If the dataLayer reports an unexpected value, ask why. The most common cause we see is prior session state that has carried over from another environment. A user who logged into a staging environment weeks earlier may still have cookies or local storage that persist through to production, and those cookies may set an authenticated user state that shifts the page-state object out of the prospect bucket.
Cookies and local storage are persistent. They travel across sessions, across days, across deploys. They are also nearly invisible to the reporter. The user who pings to say they don’t see the variant typically has no idea their browser is still acting on a session from three weeks ago.
The fix is usually the same: incognito reproduces the prospect state, the variant appears, and you have your answer. The deeper fix, if this happens repeatedly, is to scope cookies more tightly across environments or to add a hygiene step to your QA protocol that clears storage before every check.
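Before sending the reporter to incognito, you can also look directly at what the browser is carrying. The sketch below lists cookie and storage keys that look like auth or environment state; the name patterns are guesses, so match them to whatever your auth layer and platform actually set.

```ts
// Rough heuristic: flag anything whose name hints at auth, session, or environment state.
// The pattern is illustrative only; adjust it to your own cookie and storage naming.
const suspicious = /(auth|session|token|login|user|staging|env)/i;

const cookieNames = document.cookie.split("; ").map((c) => c.split("=")[0]);

console.log("cookies:", cookieNames.filter((n) => suspicious.test(n)));
console.log("localStorage:", Object.keys(localStorage).filter((k) => suspicious.test(k)));
console.log("sessionStorage:", Object.keys(sessionStorage).filter((k) => suspicious.test(k)));
```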
Step 4: Check for forced bucketing or query-parameter overrides
Most experimentation platforms support a query-parameter override that lets QA force a specific variant for testing. Common patterns include ?expVariant=control, ?force=v1, or platform-specific debug query parameters. Sticky bucketing can also pin a returning visitor to the variant they saw the first time, even if the audience or rollout has changed since then.
If the reporter has any of these in their URL or in their browser’s bucketing cookie, they will see whatever was forced or whatever was sticky – not whatever the current audience evaluation says they should see.
Tell the reporter to remove all query parameters and clear the platform’s specific cookie or local-storage bucketing key. Or, simpler, send them back to incognito.
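A quick sketch of what to look for is below. The parameter names and the bucketing-key pattern are made up for illustration; your platform's QA documentation will list the real ones.

```ts
// Hypothetical override parameter names; substitute whatever your platform supports.
const overrideParams = ["expVariant", "force", "variant", "forceVariant"];
const params = new URL(location.href).searchParams;
const forced = overrideParams.filter((p) => params.has(p));
console.log("override params present:", forced.length ? forced : "none");

// Bucketing state usually lives in a cookie or local-storage key owned by the platform.
// This pattern is a guess; check your platform's docs for the exact key name.
const bucketing = /(bucket|variation|assignment|experiment)/i;
const cookieNames = document.cookie.split("; ").map((c) => c.split("=")[0]);
console.log("bucketing-ish cookies:", cookieNames.filter((n) => bucketing.test(n)));
console.log("bucketing-ish storage keys:", Object.keys(localStorage).filter((k) => bucketing.test(k)));
```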
Step 5: Check for cache and CDN propagation
CDN cache, browser cache, and service worker cache can all serve the old HTML even after a one-hundred-percent rollout has been initiated. This is more common when the variant change is implemented as a server-rendered change rather than a client-side script swap, but it shows up either way.
Ask the reporter to do a hard reload (cache-busting reload, not a normal reload). If they have a service worker, that may need to be unregistered manually. If your CDN has region-level cache propagation, the rollout may still be in flight even after the platform reports completion. Check the propagation status before assuming it’s the user’s problem.
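For the service-worker part, the sketch below lists and unregisters any workers controlling the page using standard browser APIs; after it runs, do a hard reload so the network, rather than the worker cache, serves the HTML.

```ts
// List and unregister any service workers registered for this origin.
if ("serviceWorker" in navigator) {
  navigator.serviceWorker.getRegistrations().then(async (regs) => {
    console.log(`service workers registered: ${regs.length}`);
    for (const reg of regs) {
      console.log("unregistering:", reg.scope);
      await reg.unregister();
    }
  });
} else {
  console.log("this browser has no service worker support");
}
```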
Step 6: Check for geographic or audience exclusions
Most production tests carry exclusions for specific regions, DMAs, or user segments – whether for compliance reasons, ongoing local promotions, or other reasons that have nothing to do with this particular test. If the reporter is inside an excluded region or segment, they are correctly excluded.
Reread the audience definition. Confirm whether the reporter is inside any exclusion. If yes, that explains everything and no further debugging is needed.
Step 7: Check for ad blockers and privacy extensions
Browser extensions that block tracking scripts will often block the experimentation platform’s script as well. The platform never loads, the audience condition is never evaluated, and the user sees the control or the unmodified base experience.
Ask the reporter to disable ad blockers and privacy extensions on the test domain, or to test in incognito with extensions disabled (incognito disables most extensions by default, but verify).
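One fast way to confirm a blocked script is to check for both the script tag and the global the platform normally defines. The host and global names below are placeholders; substitute your platform's actual values.

```ts
// Placeholder values: replace with your platform's real script host and window global.
const scriptHost = "cdn.example-experiments.com";
const platformGlobal = "experimentSDK";

const tag = Array.from(document.scripts).find((s) => s.src.includes(scriptHost));
const loaded = (window as any)[platformGlobal] !== undefined; // drop the cast in the console

console.log("script tag present in DOM:", Boolean(tag));
console.log(`window.${platformGlobal} defined:`, loaded);
// Tag present but global undefined usually means the request was blocked or failed;
// the Network tab will show the request as blocked or cancelled.
```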
Step 8: Check for SDK race conditions
The experimentation platform’s script has to load, evaluate the audience condition, and swap content – often before the user sees the page. If the script loads slowly, or if the variant relies on a DOM swap that fires after first paint, the reporter may see the control briefly and then see the variant, or may see flicker between the two.
This is more common on slow connections, on the first page load of a session, and on pages with large render-blocking assets. If the report is “I see the old version for a second and then it changes,” the script is working but the timing is off. That’s a different fix and probably not urgent for this rollout.
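If you want evidence rather than a guess, compare when the page first painted with when the experimentation script finished loading. The script host below is a placeholder; use your platform's real CDN hostname.

```ts
// Placeholder host; substitute the hostname your experimentation script loads from.
const scriptHost = "cdn.example-experiments.com";

const fcp = performance
  .getEntriesByType("paint")
  .find((e) => e.name === "first-contentful-paint");
const script = performance
  .getEntriesByType("resource")
  .find((r) => r.name.includes(scriptHost));
const scriptDone = script ? script.startTime + script.duration : undefined;

console.log("first contentful paint:", fcp ? `${fcp.startTime.toFixed(0)} ms` : "n/a");
console.log("experiment script finished:", scriptDone ? `${scriptDone.toFixed(0)} ms` : "not found");
// If the script finished after first paint, the control was visible first and the
// swap produced exactly the flash the reporter is describing.
```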
Step 9: Check whether the variant has been edited post-test
This is the easiest one to miss. If the test ran a month ago and shipped to one hundred percent, the variant is now living in the experimentation platform indefinitely. During that time, marketing or product may have edited the variant – added new banners, updated copy, refreshed imagery – without touching the test infrastructure.
What the reporter sees at one hundred percent is no longer a one-to-one match with the variant in the test snapshot or the original Figma. That doesn’t mean it’s broken. It means the live experience has evolved past the experimental snapshot.
Document this anytime an experiment lives at one hundred percent for more than a few weeks. When stakeholders ask “is this what we tested,” the honest answer is “the underlying logic is, but the surface content has been updated since.”
The Recon Operating Model and Why It Surfaces These Bugs
A small experimentation team operating outside the core dev cycle is a competitive advantage when it works. It lets you ship a variant in a sprint that would have spent two quarters in roadmap discussions. It lets you keep winning experiments live indefinitely while the core dev team does foundation work that can’t be templated.
It also produces a class of bug that doesn’t exist in monolithic dev workflows: the live variant is in a different system than the rest of the page, and that system has its own evaluation logic. When a stakeholder says “I don’t see the variant,” there is no single deploy log to check. There is the audience condition, the user’s browser state, the cache, the platform’s bucketing, and the post-test edits.
That is not a bug. That is the cost of moving fast. The fix is to write down the diagnostic so the team can run it without rebuilding the protocol every time.
The Nine-Item Debugging Checklist (Copy This)
When someone reports they don’t see the variant at one hundred percent rollout, walk through the list in order. Stop as soon as something resolves the report.
- Validate in incognito or private browsing. If the variant appears, the rollout is fine. The user’s session is the issue.
- Check the dataLayer for the audience key. Compare against the audience condition. If the value doesn’t match, the platform is correctly excluding the user.
- Check for prior session state. Cookies and local storage from other environments (especially staging) carry across to production and shift the user’s audience.
- Check for forced bucketing or query-parameter overrides. Sticky bucketing or QA query parameters can override the live audience evaluation.
- Check for cache and CDN propagation. Hard reload, unregister service worker, confirm CDN propagation has completed.
- Check for geographic or audience exclusions. Reread the audience definition. The reporter may be inside a deliberate exclusion.
- Check for ad blockers and privacy extensions. Extensions can block the experimentation script entirely.
- Check for SDK race conditions. Script load timing can produce flicker or delayed variant rendering.
- Check whether the variant has been edited post-test. Long-lived variants drift from the test snapshot. The live experience may not match the original Figma even though the underlying logic is intact.
When in doubt, return to step one. Incognito is the highest-yield check in the protocol.
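If you want the client-side portion of this list in a single paste, a rough console sketch like the one below prints the main signals at once. Every specific name in it (dataLayer keys, parameter patterns, script host) is a placeholder, as in the step-by-step snippets above.

```ts
// One-shot dump of the client-side signals from the checklist. Placeholders throughout.
const report = {
  dataLayerState: Object.assign({}, ...(((window as any).dataLayer ?? []) as object[])),
  overrideParams: [...new URL(location.href).searchParams.keys()]
    .filter((p) => /variant|force|exp/i.test(p)),
  bucketingCookies: document.cookie.split("; ")
    .map((c) => c.split("=")[0])
    .filter((n) => /bucket|variation|experiment/i.test(n)),
  experimentScriptInDom: Array.from(document.scripts)
    .some((s) => s.src.includes("cdn.example-experiments.com")),
};
console.log(JSON.stringify(report, null, 2));
navigator.serviceWorker?.getRegistrations()
  .then((regs) => console.log("service workers registered:", regs.length));
```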
What to Add to Your Operations Documentation
If your team is running experiments at any meaningful pace, this debugging tree will be relevant once a quarter at minimum. Two practices reduce the operational tax.
First, write the protocol once and link to it in the same channel where rollout announcements happen. When a teammate pings to say they don’t see the variant, link the protocol back. Half of these reports resolve before they reach the experimentation team.
Second, document any post-test edits to the variant in the same place the original test was reported. When stakeholders later ask “is this what we tested,” the answer needs to be discoverable. Without that documentation, every long-lived variant becomes a slow-moving credibility issue.
FAQ
Why is the audience condition usually the cause?
Because audience definitions are the most consequential piece of test setup and the least visible to anyone outside the experimentation team. The condition is doing exactly what it was set up to do. The user reporting the issue is in a state that doesn’t match it, often for reasons (logged-in cookies, staging carryover, geographic location) that are invisible from the surface.
Should we always validate in incognito first?
Yes. Incognito strips the most common sources of false negatives – cookies, local storage, sticky bucketing, and most extensions. It is the highest-yield single check. Most teams skip it because it feels like asking the reporter to do extra work, but it saves more time than any other step in the protocol.
Why does prior staging state still affect production?
Cookies scoped to a parent domain can persist across subdomains (local storage is scoped to the exact origin, so it carries over only when two environments share one). If the staging environment shares a parent domain with production, or if the user’s browser stores authenticated state in a way that production reads, a session weeks old can still influence production’s audience evaluation. Tighter cookie scoping fixes this at the platform level, but the operational fix is to validate in incognito.
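A small sketch of the difference, with example.com standing in for the real domain and the cookie names invented for illustration:

```ts
// Host-only cookie: set without a Domain attribute on staging.example.com,
// it is sent back only to staging.example.com.
document.cookie = "qa_session=abc123; Path=/; Secure";

// Parent-domain cookie: the Domain attribute makes it readable on every subdomain,
// including www.example.com. This is how stale staging auth state leaks into
// production's audience evaluation.
document.cookie = "auth_state=logged_in; Domain=.example.com; Path=/; Secure";
```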
When should we move a 100% rollout into the core codebase?
When the variant is stable, the team has bandwidth to integrate it, and the cost of keeping it in the experimentation platform exceeds the cost of integration. There is no fixed timeline. Some variants live in the platform for years and that is fine. Others should be integrated within the quarter. The decision depends on the size of the change, the rate of post-test edits, and how often the variant is causing diagnostic load on the experimentation team.
Sources & Further Reading
- “Iterative A/B Testing: How a Confounded First Test Becomes a 3-Test Causal Chain” – companion piece on the test arc that produced the rollout in this story.
- Kohavi, Tang, Xu. Trustworthy Online Controlled Experiments – on validation and audience integrity in production experimentation.
- “AA Testing: The One Hygiene Check Every Experimentation Program Skips” – on validating that the platform itself is trustworthy before reading any test result.
Applied Experimentation Lead at NRG Energy (Fortune 150) · Creator of the PRISM Method
Atticus Li leads applied experimentation at NRG Energy (Fortune 150), where he and his team run more than 100 controlled experiments per year on customer-facing surfaces. He is the creator of the PRISM Method, a framework for high-velocity experimentation programs at large enterprises. He writes regularly about the statistical and operational details of A/B testing — the parts most CRO content skips.