What this document shows

We asked the same AI system one strategic question three different ways and got three very different answers. This report lays out what the question was, what each answer said, and — most importantly — why the quality gap between them matters if you are using AI for serious strategic work.

The three versions were as follows (the same model, Sonnet 4.6, was used in all simulations):

  1. Vanilla AI: the AI answering the question directly, with no special instructions.
  2. Persona AI: the AI given a specific set of rules, built with Impersonato's own persona construction procedure, that force it to think like a forensic accountant investigating a failure that has already happened. The choice of persona matters little here; the experiment shows how different the output can be when a persona is properly constructed.
  3. Ablated Persona AI: the same persona with certain cognitive elements removed, keeping only the general approach. This is closer to typical, shallow "prompt engineering." It is also the control experiment: it tells us whether Impersonato's cognitive mechanisms are doing real work or are just decoration.

The business question

Rippling is a workplace software company. Instead of making one product, it builds several connected products ("clouds") that share data — if you hire someone in the HR cloud, that person automatically shows up in the IT cloud to get their laptop, and in the Spend cloud to get a corporate card. The pitch is that this integration makes the bundle stickier than any single-product competitor.

By early 2024, Rippling had three such clouds and was generating $570M in annual recurring revenue, growing more than 50% per year. Each new cloud had crossed $1M in revenue within 5–6 months of launch — strong evidence the model works.

The CEO then announced that fresh investor money would fund a fourth cloud in "a completely different area."

The question: should Rippling keep expanding into a fourth product line, or stop at three and focus on making the existing three more profitable first?

This is a classic strategic tension. On one side: the integrated-bundle story only works if you keep adding to the bundle. On the other side: most software companies that have succeeded at scale did so by going deep on one thing, not broad across many. There is no textbook answer.

Answer 1 — Vanilla AI

The vanilla AI produced a competent, balanced memo. It laid out the case for expanding (the integration moat is real, slowing down cedes ground to competitors like Deel and Ramp, cross-sell math compounds), laid out the case against (focus wins in software, the operating team is thin, public companies at this scale were already profitable), and landed on a middle path:

"Proceed with Cloud 4, but with structural guardrails: a dedicated Cloud 4 team… explicit financial targets on the three existing clouds… a pre-committed decision gate at 18 months."

This is the kind of answer you would expect from a capable generalist consultant after two days of work. It is not wrong. It is also not surprising, and it does not tell the CEO anything he has not already heard from his board.

Answer 2 — Persona AI (the full cognitive version)

The output opened with a dated terminal event:

"On 14 September 2027, Rippling's board accepted Parker Conrad's restructuring proposal: the fourth cloud… was spun out into a separately-capitalized subsidiary at a $180M post-money, down from the $1.1B internal carrying value."

It then identified three specific mechanisms that would have caused this — each with leading indicators that were theoretically visible, a decision point where an alternative path was available, and a dollar estimate for the cost of the mistake versus the cost of the fix.

Failure mechanism 1 — The cross-sell trap. Selling Cloud 4 into existing customers causes those customers to re-examine their whole contract at renewal. A re-examined bundle almost always loses items, because buyers look for things to cut. So the act of cross-selling the new product causes erosion in the old products. The standard metric Rippling's CRO watches (net revenue retention, measured in dollars) stays healthy because seat counts keep growing — it cannot detect that customers are dropping modules. The signal that would have caught it early: the phrase "let us look at the whole relationship" appearing in renewal call transcripts 3.2 times more often than before. The fix: a separate Cloud 4 sales team for 18 months, costing $14M, versus $240M of revenue erosion from the re-examination dynamic.
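
The blind spot described above can be illustrated with a small sketch. The account figures below are invented for illustration and are not Rippling data; the names Account, dollar_nrr, and module_retention are hypothetical. The point is only that dollar-based net revenue retention can stay above 100% while customers drop modules, provided seat growth on the remaining modules outpaces the loss.

```python
from dataclasses import dataclass

@dataclass
class Account:
    # All values are made-up illustration data, not real customer figures.
    start_arr: float     # annual recurring revenue at start of period, dollars
    end_arr: float       # annual recurring revenue at end of period, dollars
    start_modules: int   # products the customer held at start of period
    end_modules: int     # products the customer holds at end of period

def dollar_nrr(accounts):
    """Dollar-based net revenue retention: ending ARR over starting ARR."""
    start = sum(a.start_arr for a in accounts)
    end = sum(a.end_arr for a in accounts)
    return end / start

def module_retention(accounts):
    """Share of starting modules still held at period end (ignores new adds)."""
    start = sum(a.start_modules for a in accounts)
    kept = sum(min(a.end_modules, a.start_modules) for a in accounts)
    return kept / start

accounts = [
    Account(100_000, 115_000, 3, 2),  # seats grew, one module dropped
    Account(200_000, 220_000, 3, 3),  # seats grew, bundle intact
    Account(50_000, 56_000, 2, 1),    # seats grew, one module dropped
]

nrr = dollar_nrr(accounts)
mod = module_retention(accounts)
print(f"dollar NRR: {nrr:.2f}, module retention: {mod:.2f}")
# prints "dollar NRR: 1.12, module retention: 0.75"
```

In this toy cohort the dollar metric reads as healthy growth while a quarter of the starting modules have been cut, which is the pattern the persona output says the standard dashboard cannot see.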

Failure mechanism 2 — The integration tax. Building a financial-services cloud required entirely new infrastructure because money movement is regulated differently than HR data. This soaked up 41% of core engineering capacity, which meant the integration work that made the three existing clouds feel like one product never got built. Customer satisfaction on the existing clouds fell. The decision that caused this was a build-versus-buy call in February 2025 — Rippling could have acquired a small regulated fintech for $90M that already had the infrastructure; they chose to build it in-house for a budgeted $38M, which actually cost $147M.
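
The build-versus-buy numbers in this mechanism can be laid side by side. The sketch below uses only the three dollar figures quoted in the persona output (which are the model's estimates, not audited figures); the variable names are hypothetical, and the comparison deliberately ignores acquisition integration costs and the engineering capacity diverted from the existing clouds.

```python
# Figures in $M, taken from the persona output's scenario (estimates only).
budgeted_build = 38    # in-house build budget
actual_build = 147     # what the in-house build ended up costing
acquire_price = 90     # price of the small regulated fintech

overrun_abs = actual_build - budgeted_build        # dollars over budget
overrun_multiple = actual_build / budgeted_build   # actual cost vs budget
build_minus_buy = actual_build - acquire_price     # extra paid vs acquiring

# Overrun multiple at which building stops being cheaper than buying.
breakeven_multiple = acquire_price / budgeted_build

print(f"over budget by ${overrun_abs}M ({overrun_multiple:.2f}x budget)")
print(f"build cost ${build_minus_buy}M more than the acquisition")
print(f"building was cheaper only below {breakeven_multiple:.2f}x budget")
```

Read this way, the in-house option was the cheaper choice only if the build stayed under roughly 2.4 times its budget; in the scenario it came in at nearly 3.9 times.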

Failure mechanism 3 — The CEO's calendar got hijacked. Financial services requires the CEO personally in regulatory meetings — with state licensing boards, federal regulators, banking partners. Conrad's time shifted from 70% product / 30% external to 35% product / 65% external. During this same period, focused competitors shipped features that closed Rippling's lead in the original three clouds. Nobody at Rippling was tracking the CEO's time as a budgeted resource. A Chief Operating Officer role to absorb the regulatory work existed on paper, was costed at $1.8M, and was not filled.

Instead of a recommendation, the output ended with three specific instruments to install before starting: a weekly dashboard tracking how many products each customer has, a monthly report on where engineering time is actually going, and a quarterly review of the CEO's calendar sent to the lead board director. Each costs very little. Each would have caught the relevant failure months before it became irreversible.
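
One way to picture the three instruments is as a single periodic check. This is a minimal sketch, not the persona's actual specification: the function name check_instruments, its parameters, and the default limits (30% of engineering, 50% of CEO time) are assumptions chosen for illustration. The source output names the readings to watch but this report does not quote exact alert thresholds.

```python
def check_instruments(modules_per_customer, eng_share_new_cloud,
                      ceo_external_share, eng_limit=0.30, ceo_limit=0.50):
    """Return a list of alert strings for the three monitoring instruments.

    modules_per_customer: recent weekly readings, oldest first.
    eng_share_new_cloud:  fraction of core engineering time on the new cloud.
    ceo_external_share:   fraction of CEO calendar spent on external work.
    eng_limit, ceo_limit: hypothetical thresholds, not from the source.
    """
    alerts = []
    # Weekly dashboard: products per customer falling versus the prior reading.
    if (len(modules_per_customer) >= 2
            and modules_per_customer[-1] < modules_per_customer[-2]):
        alerts.append("modules per customer fell since last reading")
    # Monthly report: where engineering time is actually going.
    if eng_share_new_cloud > eng_limit:
        alerts.append("new cloud is consuming too much core engineering time")
    # Quarterly review: CEO calendar, sent to the lead board director.
    if ceo_external_share > ceo_limit:
        alerts.append("CEO time has tilted toward external work")
    return alerts

# Readings from the failure scenario: 41% of engineering, 65% external time.
alerts = check_instruments([2.8, 2.7], 0.41, 0.65)
for a in alerts:
    print(a)
```

With the scenario's readings all three checks fire. The limits would need to be set by whoever owns each report; the sketch only shows that each instrument reduces to a number and a comparison.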

Answer 3 — Persona AI with the cognitive architecture removed

This is the control experiment. The general idea of the persona — "imagine a failure and work backward" — was kept. But the cognitive mechanisms were removed.

The output opened: "If the fourth cloud underperforms, there is a risk that…"

It identified four risks, one of which it labeled as essentially uncontrollable (macroeconomic conditions). Each risk was described in general terms. The recommendations were to "track retention cohorts carefully," "hire a strong COO," and "set quantitative decision gates." The phrase "there is execution risk here" appeared verbatim — the exact language the cognitive architecture version was specifically designed to avoid.

The answer is, once you strip away the framing, essentially the same as the vanilla answer.

What the three answers actually differ on

Tense and stance. Vanilla and ablated both hedge: could, might, if, would. The full persona commits to a specific dated event and treats it as having already happened. This difference is not merely stylistic: it forces specificity. If you have to say "by October 2026, 41% of engineering was on Cloud 4" rather than "engineering resources may be strained," you have to either invent a plausible number or produce a real argument for why that number would emerge.

Instruments versus categories. Vanilla and ablated both recommend tracking categories of things: net retention, attach rates, cycle time, survey data. The full persona names specific signals with thresholds: this phrase in renewal calls, this percentage of engineering capacity, this ratio of CEO time. The difference is that the first set tells you what to care about; the second set tells you what to look at on Monday morning.
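
To make "this phrase in renewal calls" concrete, here is a minimal sketch of how such a signal could be computed. The function names and the sample transcripts are invented for illustration; a real pipeline would need transcript access, speaker filtering, and a far larger sample.

```python
def phrase_rate(transcripts, phrase):
    """Fraction of transcripts containing the phrase (case-insensitive)."""
    if not transcripts:
        return 0.0
    needle = phrase.lower()
    hits = sum(1 for t in transcripts if needle in t.lower())
    return hits / len(transcripts)

def rate_ratio(baseline, current, phrase):
    """How many times more often the phrase appears now than in the baseline.

    Returns None when the baseline rate is zero, since the ratio is undefined.
    """
    base = phrase_rate(baseline, phrase)
    if base == 0:
        return None
    return phrase_rate(current, phrase) / base

PHRASE = "let us look at the whole relationship"

# Made-up transcripts, one string per renewal call.
baseline = [
    "Happy to renew as is.",
    "Let us look at the whole relationship before signing.",
    "No changes needed this year.",
    "Renewing all modules.",
]
current = [
    "Let us look at the whole relationship.",
    "Before we add anything, let us look at the whole relationship.",
    "Renew as is.",
    "Fine with current terms.",
]

ratio = rate_ratio(baseline, current, PHRASE)
print(ratio)  # prints 2.0
```

A ratio like this is something a revenue operations team can chart weekly, which is the sense in which the full persona's signals say what to look at rather than what to care about.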

Dollars on the table. Vanilla and ablated do not name a single dollar figure on counterfactuals. The full persona names $90M (acquire), $147M (what the build actually cost), $14M (separate sales team), $240M (preserved revenue), $1.8M (unfilled COO). These numbers are estimates, not audited figures — but an estimate that can be argued with is more useful than a concern that cannot.

The output shape. Vanilla and ablated both end with the same recommendation: proceed carefully, hire a COO, set decision gates. The full persona refuses to give a go/no-go at all. It instead hands the CEO three cheap instruments to install before deciding, on the argument that the decision itself is fine and that it is the monitoring that was missing.

The finding that matters for executives using AI

The ablation is the whole point of the experiment. The general approach of the full persona — think like a forensic accountant, imagine the failure, reconstruct the chain — was preserved in the ablated version. That was not enough. Remove the cognitive mechanisms, and the AI drifts back to exactly the kind of balanced, hedged, category-level answer the vanilla version produced.

The lesson is that the AI's default mode, weighing pros and cons in conditional language, is extremely hard to escape without specific mechanisms that guide the model's "reasoning" process. Procedures alone ("follow these six steps") are not enough; the AI will follow the steps and still produce generic output.