Shipping Software Into Regulated Finance

The constraint is proof, not code

12 Jun 2026

Write the same feature for a consumer app and for a trading system and the code looks almost identical. What differs is everything that happens before it runs in front of a real position, a real client balance, or a real regulator. In regulated finance the bottleneck was never the typing.

The constraint is proving a change is correct before it ships, not producing the change.

That is why “move fast and break things” is not a strategy you can import here. A bad deploy is not a bug to patch next sprint; it is an incident with a balance attached. The interesting question is not how to slow down to be safe. It is how to stay fast while making correctness cheap to prove.

What “regulated” actually changes

Three things change the engineering, and none of them is the programming language. The first is an audit trail: you have to be able to say who changed what, when, why, and who approved it — years later, to someone who was not there. The second is the cost of a wrong answer. A rounding error in a marketing dashboard is embarrassing; the same error in a margin calculation is a loss event and possibly a breach.

The third is that you rarely build greenfield. The systems are old, load-bearing, and surrounded by integrations nobody fully remembers. Most AI demos assume a blank repository. The real work is the opposite: changing a system that real users and real money already depend on, without taking it down.
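
The rounding point above is easy to reproduce. The sketch below is a minimal illustration, not a description of any particular system: the names `addCents` and `marginCents`, the basis-point rate, and the round-up policy are all assumptions made for the example. It shows why amounts are commonly held as integer minor units, with the rounding rule written down where a reviewer can see it.

```typescript
// Binary floating point cannot represent most decimal fractions exactly.
const floatTotal = 0.1 + 0.2;
const floatMatches = floatTotal === 0.3; // false: floatTotal is 0.30000000000000004

// Hypothetical convention: amounts are whole minor units (cents).
type Cents = number;

function assertWholeUnits(value: number): void {
  if (!Number.isSafeInteger(value)) {
    throw new Error(`expected a whole number of minor units, got ${value}`);
  }
}

// Integer addition is exact as long as the values stay within the safe integer range.
function addCents(a: Cents, b: Cents): Cents {
  assertWholeUnits(a);
  assertWholeUnits(b);
  return a + b;
}

// Hypothetical margin rule: rate in basis points, always rounded up to the next cent.
// The rounding policy is an assumption; the point is that it is explicit.
function marginCents(notional: Cents, rateBps: number): Cents {
  assertWholeUnits(notional);
  assertWholeUnits(rateBps);
  return Math.ceil((notional * rateBps) / 10000);
}

const centsTotal = addCents(10, 20); // 30
const margin = marginCents(1234567, 1250); // 12.5% of 12,345.67 is 1,543.20875, rounded up to 154321 cents
```

The float comparison fails silently, while the integer version either produces an exact figure or throws on bad input.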

The verification stack is the product

If proving correctness is the constraint, then the scaffolding that produces the proof is where the leverage is. We treat it as three layers, each catching a different class of mistake before it can reach a human, let alone production.

Static analysis runs first and fastest. Types, linting, and dependency checks reject whole categories of error — a null where a decimal was expected, a currency mismatch, an unhandled branch — in seconds, before a test even starts. It is the cheapest place to be wrong.
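
A minimal sketch of how a type checker catches the three errors named above, assuming TypeScript with `strict` mode enabled. Every type and function name here (`Usd`, `Eur`, `addUsd`, `notionalMinor`, `signedQuantity`) is hypothetical and chosen only for illustration.

```typescript
// Currency mismatch: each currency gets its own type, so mixing them fails to compile.
interface Usd { readonly currency: "USD"; readonly minorUnits: number }
interface Eur { readonly currency: "EUR"; readonly minorUnits: number }

function addUsd(a: Usd, b: Usd): Usd {
  return { currency: "USD", minorUnits: a.minorUnits + b.minorUnits };
}

// Null where a number was expected: under strictNullChecks the compiler
// refuses the multiplication until the null case is handled.
function notionalMinor(priceMinor: number | null, quantity: number): number {
  if (priceMinor === null) {
    throw new Error("price is missing");
  }
  return priceMinor * quantity;
}

// Unhandled branch: if a new OrderSide is added and not handled,
// the assignment to `never` stops type-checking.
type OrderSide = "buy" | "sell";

function signedQuantity(side: OrderSide, quantity: number): number {
  switch (side) {
    case "buy":
      return quantity;
    case "sell":
      return -quantity;
    default: {
      const unhandled: never = side;
      throw new Error(`unhandled order side: ${String(unhandled)}`);
    }
  }
}

const usd: Usd = { currency: "USD", minorUnits: 1050 };
const eur: Eur = { currency: "EUR", minorUnits: 900 };
const total = addUsd(usd, { currency: "USD", minorUnits: 250 });
// addUsd(usd, eur); // does not type-check: "EUR" is not assignable to "USD"
```

None of these checks needs a test run; the compiler reports them while the change is still on the author's machine.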

End-to-end tests prove the behaviour the regulator and the client actually care about: that a trade books correctly, that a statement reconciles, that a permission boundary holds. They are slower and more expensive to maintain, so they are spent on the paths where being wrong is an incident, not on exhaustively covering cosmetics.
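
As a rough sketch of what "a statement reconciles" means as an assertion: a real end-to-end test would drive the deployed system through a browser or API test runner, which is not shown here. The `Statement` shape, the `reconciliationGap` function, and the figures are all hypothetical.

```typescript
// Hypothetical statement shape; amounts are integer minor units (cents).
interface StatementEntry {
  readonly description: string;
  readonly amountMinor: number; // positive = credit, negative = debit
}

interface Statement {
  readonly openingMinor: number;
  readonly closingMinor: number;
  readonly entries: readonly StatementEntry[];
}

// Returns how far the statement is from reconciling; 0 means it reconciles.
function reconciliationGap(statement: Statement): number {
  const movement = statement.entries.reduce(
    (sum, entry) => sum + entry.amountMinor,
    0,
  );
  return statement.openingMinor + movement - statement.closingMinor;
}

const statement: Statement = {
  openingMinor: 100000,
  closingMinor: 87550,
  entries: [
    { description: "rent received", amountMinor: 120000 },
    { description: "repair invoice", amountMinor: -32450 },
    { description: "owner payout", amountMinor: -100000 },
  ],
};

// The same entries with a closing balance that is 50 cents too low.
const broken: Statement = { ...statement, closingMinor: 87500 };

const gap = reconciliationGap(statement); // 0
const brokenGap = reconciliationGap(broken); // 50
```

Returning the gap rather than a boolean means a failing test reports the size of the discrepancy, which is usually the first thing an investigator asks for.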

Review-friendly architecture is the layer people forget. Code is read far more often than it is written, and in a regulated shop it is read by an auditor under pressure. Small, single-purpose changes, explicit boundaries, and a clean diff are not aesthetic preferences. They are what makes a review fast enough that velocity survives the controls.

Verification is not the tax you pay for speed. It is what makes speed safe to keep.

Where AI fits, and where it does not

AI-assisted development genuinely helps here — but not in the way the marketing suggests. It accelerates generation, and generation was never the bottleneck. Point an agent at a regulated codebase with no verification scaffolding and you have simply made it faster to produce changes nobody can prove are safe. That is the code-factory problem: output is cheap, and a human gate on correctness is the part that does not automate away.

The verification-cost test from our piece on what AI is good for applies directly: if checking the AI's output costs more than the AI saves, AI is the wrong tool for that task. In regulated finance the cost of checking is set high by regulation, so AI pays off precisely where the verification stack already makes the check cheap, and not where it does not.

What this looks like in practice

Most of our experience here is institutional: investment banking and commodities-trading systems where a wrong number is a reportable event, and the work is to ship change into them without ever earning a place in the incident log. The same discipline runs through our own product.

MushRoom, our property-management platform, deploys to production every day — multi-tenant, with tax reporting and automated invoicing where a wrong figure has real consequences for a real landlord. It can deploy daily because the verification stack carries the proof, not a release-night ritual. The point is not the product. The point is that the same approach we apply to a bank is the one we run our own books on.

On the Radar

Tools this article names that we have shipped in production.

TypeScript, Cypress, Symfony
