Case story · finance · 24 people · 6 weeks
A 24-person finance ops firm. Their owner had been the bottleneck for every AI question for six months. We were brought in for a Layer 02 engagement. Six weeks later, she wasn't the bottleneck anymore.
The owner — call her R — had spent the first half of 2025 building agents in Notion. Three of them. One for client onboarding emails, one for invoice reconciliation, one for a "research analyst" that had never quite worked. By Q4, the senior team had quietly stopped using two of them.
Every new AI question landed on R's desk. "Should we try this for refunds?" "Can we use it for the new client intake?" She'd open Notion, see the three half-finished things, and feel the cost of every previous "yes."
The Tuesday she called us, she'd been counting. Eleven different AI surfaces were live in the firm, most of them ones she hadn't built and didn't know existed. A senior had a Claude project running invoice categorization. Two analysts were using a custom GPT for client summaries. The phantom layer was bigger than the official one.
We didn't pitch. We started with a half-day with R and three seniors, mapping every workflow that had an agent in it or near it. By end of week one we had a single page with eleven entries on it — every one of them named, with an owner penciled in.
Half-day with the owner and three seniors. Every agent, prompt, custom GPT, automation. Eleven surfaces, four owners, two phantom agents nobody could remember building.
Two agents had no owner and no users. We retired them. One had drift on its categorization that nobody had caught. We turned it off and wrote the postmortem with the senior who'd built it.
The refund agent was running unattended on a category that touched compliance. We installed a human-in-the-loop checkpoint, scoped its authority, and wrote the audit trail. Two-day build.
R stopped attending agent design sessions. The seniors picked the next workflow themselves — client intake, with two HITL gates baked in from the start. We reviewed; we didn't build.
A single page their CISO actually read. Layer 03 deliverable. The active agent count went from 11 to 3 — all named, owned, reviewed monthly. R stopped being CC'd on AI questions.
R got fourteen hours a week back — measured, not estimated. The senior team owns agent design now; they review the audit themselves at the monthly ops sync. After week four, exactly zero AI support tickets routed to R.
The agents that survived weren't more sophisticated than the ones we killed. They were owned. That was the difference.
From phantom layer to scoped, owned, reviewable.
Time spent answering AI questions. Now spent elsewhere.
Senior team owns the answers. R reviews, doesn't decide.
No pitch. No deck. No follow-up sequence. If we're not the fit, we say so in the call.