TL;DR

I analyzed 30+ CTA tests from my own CRO program. Copy variations dominated the portfolio (62% of tests). But placement tests won 2.4× as often, and their winners produced 2.4× the lift. The math says I was running the wrong tests. The reason I kept running them was velocity, not strategy.

What the Portfolio Looked Like

I was running my testing program wrong.

Of 30+ CTA tests I'd shipped over three years, 62% were copy variations ("Get Started" vs "Sign Up" vs "Enroll Now"). The remaining 38% were placement, layout, or design changes.

The win rates told the real story:

Test Type · Win Rate · Avg Lift (winners) · Avg Lift (all)
Copy variations · 27% · +5% · +1.2%
Placement / layout · 64% · +12% · +6.8%
Combined (copy + placement) · 75% · +18% · +9.4%

Placement won 2.4× as often, with 2.4× the lift when it won. Combined tests won 2.8× as often.
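The ratios can be checked directly from the reported figures. This is a minimal sketch: the dictionary keys and the helper name `ratio_vs_copy` are my own labels for illustration, and the numbers are the win rates and winner lifts quoted in this post.

```python
# Win rates and average winner lifts as reported in this post.
results = {
    "copy": {"win_rate": 0.27, "winner_lift": 0.05},
    "placement": {"win_rate": 0.64, "winner_lift": 0.12},
    "combined": {"win_rate": 0.75, "winner_lift": 0.18},
}


def ratio_vs_copy(test_type: str, metric: str) -> float:
    """Return how many times larger `metric` is for `test_type` than for copy tests."""
    return results[test_type][metric] / results["copy"][metric]


for name in ("placement", "combined"):
    win_ratio = round(ratio_vs_copy(name, "win_rate"), 1)
    lift_ratio = round(ratio_vs_copy(name, "winner_lift"), 1)
    print(f"{name}: {win_ratio}x the win rate, {lift_ratio}x the winner lift")
```

Placement comes out at 2.4× on both measures, and combined tests at 2.8× the win rate of copy tests.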

Yet I kept building copy tests. Why?

The Velocity Trap

Copy tests are cheap to build. One to two days of design. An hour of Optimizely setup. Two weeks of measurement. Total: one sprint.

Placement tests are expensive. One to two weeks of design (responsive considerations). Engineering involvement (DOM changes). Four to six weeks of measurement (lower per-test variance). Total: two to three sprints.

So in any given month, I could ship 3 copy tests or 1 placement test. Three feels like more progress. One feels slow.

But the math is brutal: 3 copy tests at 27% win rate = expected 0.81 wins. 1 placement test at 64% win rate = expected 0.64 wins.

Per month, copy tests produce slightly more expected wins. On impact, placement tests come out ahead by roughly 2× because their winners have larger lifts. Over 12 months: 36 copy tests yield about 10 winners at +5% average, for a cumulative impact of roughly 50 percentage points of lift (treating lifts as additive). 12 placement tests yield about 8 winners at +12%, or roughly 96 points.

Placement tests deliver nearly 2× the value with one-third as many tests.
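The velocity comparison is simple enough to sketch in a few lines. The function names here are hypothetical, and the calculation assumes lifts simply add up across winners, which is a simplification. Without rounding the winner counts to 10 and 8, the annual totals come out slightly lower (about 48.6 and 92.2 points), but the ratio stays close to 1.9×.

```python
def expected_wins(tests: int, win_rate: float) -> float:
    """Expected number of winning tests out of `tests` run."""
    return tests * win_rate


def cumulative_lift(tests: int, win_rate: float, winner_lift_pct: float) -> float:
    """Expected total lift in percentage points, assuming lifts are additive."""
    return expected_wins(tests, win_rate) * winner_lift_pct


# Monthly view: 3 copy tests vs 1 placement test.
copy_month = expected_wins(3, 0.27)
placement_month = expected_wins(1, 0.64)

# Annual view: 36 copy tests vs 12 placement tests.
copy_year = cumulative_lift(36, 0.27, 5)
placement_year = cumulative_lift(12, 0.64, 12)

print(f"Expected wins per month: copy {copy_month:.2f}, placement {placement_month:.2f}")
print(f"Expected lift per year: copy {copy_year:.1f}, placement {placement_year:.1f}")
print(f"Placement / copy value ratio: {placement_year / copy_year:.2f}")
```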

Why Copy Tests Plateau

In the data, copy tests showed a specific failure mode: small lifts (2-4%) that didn't reach significance. The reason isn't surprising: a copy variation tests a small shift in mental model ("urgency" vs "trust") within the same UI affordance.

Placement tests change the user's path. Adding a sticky CTA on mobile doesn't just shift the mental model; it adds a click target that wasn't there before. Adding a phone CTA above the fold doesn't just nudge; it surfaces a conversion path the user might not have considered.

Copy tests optimize within constraints. Placement tests change the constraints. Constraint changes have larger effect sizes.

The Brand-Specific Twist

Within copy tests, the win rate varied wildly by brand:

Brand · Copy Win Rate · Best Performing Copy Pattern
DE · 33% · Action / urgency ("Enroll Now")
GME · 38% · Trust / confidence ("Pay Later", "Guaranteed")
Reliant · 14% · None (placement always wins)
DP · 25% · Tone variation (formal vs casual)

Reliant copy almost never wins because the brand's UX is the bottleneck, not the words. DE copy wins on urgency because their funnel is short and decision-quick. GME wins on trust because the financial commitment is bigger and the user needs reassurance.

The lesson: if you're going to run copy tests, stratify by brand. Don't assume "Enroll Now" works for GME because it works for DE. The mental model at the moment of conversion is different.

What I'd Stop Doing

Running copy tests on Reliant. Placement and layout instead. The portfolio data is unambiguous: copy doesn't move the needle for this brand.
Running copy tests on long-funnel products (DP). The downstream variance overwhelms the upstream copy effect. By the time the user converts, the words on the CTA are 5 steps in the past.
Running "minor word swap" tests ("Get Started" vs "Sign Up" with the same CTA design). The effect size is below the minimum detectable effect (MDE) for any reasonable sample. You can run these forever and never learn anything.
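To make the MDE point concrete, here is a rough sample-size sketch using the standard normal approximation for a two-sided, two-proportion test. The 5% baseline conversion rate is purely an assumption for illustration, not a figure from my data, and `visitors_per_arm` is a hypothetical helper name. The 3% relative lift sits inside the 2-4% range typical of copy tests; the 12% lift matches the average placement winner.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_arm(baseline_rate: float, relative_lift: float,
                     alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate visitors needed per variant to detect `relative_lift`.

    Uses the normal approximation for comparing two proportions
    with a two-sided test at significance `alpha`.
    """
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


# ASSUMPTION: 5% baseline conversion rate, chosen only for illustration.
BASELINE = 0.05

copy_sized = visitors_per_arm(BASELINE, 0.03)       # typical copy-test lift
placement_sized = visitors_per_arm(BASELINE, 0.12)  # typical placement-winner lift

print(f"Visitors per arm for a 3% lift:  {copy_sized:,}")
print(f"Visitors per arm for a 12% lift: {placement_sized:,}")
```

Under these assumptions the word-swap test needs a few hundred thousand visitors per arm, while a placement-sized effect needs on the order of twenty thousand. Plug in your own baseline before drawing conclusions.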

What I'd Start Doing

Running placement tests as the default, copy tests as the exception.
Combining placement + copy in single tests where mechanically possible (e.g., "sticky button with new copy" vs "no sticky"). The win rate jumps to 75%.
Stratifying copy tests by brand mental model (urgency for DE, trust for GME, tone for DP). The same hypothesis applied to the wrong brand is wasted budget.
Requiring a "why this copy" hypothesis in pre-registration. If the only justification is "we want to try X," it's a no-go. Behavioral mechanism required, not curiosity.

The Honest Trade-off

Placement tests are slower. They require engineering. They feel less "productive" in sprint planning. But the lift per winning test is about 2.4× higher, and the program-level ROI compounds.

If you're a CRO program manager, the question isn't "are we running enough tests?" The question is "are we running tests that move the line?"

In our data, the answer was: not really. We were running fast but not far.

The copy test treadmill feels productive because the test count goes up. But test count is a vanity metric. Win-rate × average-lift is the real one. Optimize for that, and the portfolio rebalances itself.
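As a rough sketch of that metric, the snippet below scores each test type by win rate × average winner lift and ranks them. The name `portfolio_score` is hypothetical, and the inputs are the portfolio figures reported earlier in this post. The score ignores how long each test type takes to ship, so treat it as a per-test view only.

```python
def portfolio_score(win_rate: float, avg_winner_lift_pct: float) -> float:
    """Expected lift per test, in percentage points: win rate x average winner lift."""
    return win_rate * avg_winner_lift_pct


# (win rate, average winner lift in %) as reported in this post.
portfolio = {
    "copy": (0.27, 5),
    "placement": (0.64, 12),
    "combined": (0.75, 18),
}

scores = {name: portfolio_score(*figures) for name, figures in portfolio.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: {scores[name]:.2f} expected points of lift per test")
```

Ranked this way, combined tests come first, placement second, and copy last.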

The Copy Test Treadmill (and Why Placement Tests Win 3:1)