B2B Commerce at Catalogue Scale

300,000 products is a data problem

A consumer shop with fifty products and a B2B platform with three hundred thousand are not the same thing at different sizes. They are different problems. The storefront is the part everyone sees and the part that matters least.

At catalogue scale the product is the data, and the storefront is just a view onto it.

We spent four years on the build and cloud migration of a B2B commerce platform carrying roughly 300,000 products in production on AWS. The lesson that survived is not about a framework. It is that everything hard about the system was a question of keeping a vast, messy, constantly-changing dataset correct.

Where catalogue scale actually bites

B2B pricing is not a number on a product. It is a function of who is asking: contract tiers, volume breaks, customer-specific agreements, regional tax, currency. The same part can have dozens of legitimate prices, and showing the wrong one to the wrong buyer is not a display bug — it is a quote you may be held to.
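To make "a function of who is asking" concrete, here is a minimal sketch of price resolution. The names (`PriceContext`, `VolumeBreak`, `resolve_unit_price`) and the precedence order (customer contract, then volume break, then list price) are illustrative assumptions, not the platform's actual rules; tax and currency conversion are left out.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class VolumeBreak:
    min_qty: int          # break applies from this quantity upwards
    unit_price: Decimal


@dataclass(frozen=True)
class PriceContext:
    customer_id: str
    currency: str
    quantity: int


def resolve_unit_price(
    ctx: PriceContext,
    list_price: Decimal,
    list_currency: str,
    contract_prices: dict[str, Decimal],  # hypothetical: customer_id -> agreed price
    volume_breaks: list[VolumeBreak],
) -> Decimal:
    """Return the unit price this buyer should see for this quantity."""
    if ctx.quantity <= 0:
        raise ValueError("quantity must be a positive integer")
    if ctx.currency != list_currency:
        # Refuse to quote rather than silently show a price in the wrong currency.
        raise ValueError(f"currency mismatch: {ctx.currency} != {list_currency}")

    # Assumed precedence: a customer-specific agreement wins over everything else.
    contract = contract_prices.get(ctx.customer_id)
    if contract is not None:
        return contract

    # Otherwise use the highest volume break the quantity qualifies for.
    applicable = [b for b in volume_breaks if ctx.quantity >= b.min_qty]
    if applicable:
        return max(applicable, key=lambda b: b.min_qty).unit_price

    return list_price
```

The same part, queried by three different buyers, can legitimately return three different numbers; the point of keeping it in one function is that the order of precedence is written down once.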

The data arrives from suppliers in feeds that disagree with each other and change without warning. Units, descriptions, compatibility, stock — all of it drifts. A single mislabelled supersession can route an order to a discontinued part across thousands of downstream listings. Search and filtering have to stay fast and correct over the whole catalogue at once, not over the page the user happens to see.
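As an illustration of the supersession risk, the sketch below follows a chain of replacements to the part that should actually be ordered, and refuses to resolve if the chain loops or ends on a discontinued part. The data shapes (`superseded_by`, `discontinued`) are assumptions made for the example.

```python
def resolve_supersession(
    sku: str,
    superseded_by: dict[str, str],  # hypothetical: old SKU -> replacement SKU
    discontinued: set[str],
) -> str:
    """Follow the supersession chain from `sku` to an orderable part."""
    seen: set[str] = set()
    current = sku
    while current in superseded_by:
        if current in seen:
            # A feed error such as A -> B -> A would otherwise loop forever.
            raise ValueError(f"supersession cycle detected at {current}")
        seen.add(current)
        current = superseded_by[current]

    if current in discontinued:
        # The chain ends on a part that cannot be ordered: fail, do not guess.
        raise ValueError(f"{sku} resolves to discontinued part {current}")
    return current
```

Running a check like this over every SKU after each feed import is one way to catch a mislabelled supersession before it reaches a listing.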

Nobody notices the 299,000 products that are right. Everybody notices the one that is wrong.

The verification stack carries the catalogue

You cannot eyeball three hundred thousand products. Correctness at this scale is something you have to assert in code and let the machine enforce, which is exactly what the verification stack is for.

Static analysis and a strict type system are the first line: a price that could be null, a currency that does not match the customer's, a quantity that is not a positive integer. Each is caught at the boundary where data enters, not three screens later when a buyer sees nonsense.
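A minimal sketch of that boundary check, assuming a raw row arrives as a dictionary with the hypothetical keys `sku`, `unit_price`, `currency` and `min_quantity`. Once a row has been parsed into `PriceRow`, the rest of the code can rely on its fields being present and valid.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation


@dataclass(frozen=True)
class PriceRow:
    sku: str
    unit_price: Decimal
    currency: str
    min_quantity: int


def parse_price_row(raw: dict, customer_currency: str) -> PriceRow:
    """Validate one untrusted row; raise ValueError instead of passing bad data on."""
    sku = raw.get("sku")
    if not isinstance(sku, str) or not sku:
        raise ValueError("sku is missing")

    price_raw = raw.get("unit_price")
    if price_raw is None:
        raise ValueError(f"{sku}: price is null")
    try:
        price = Decimal(str(price_raw))
    except InvalidOperation:
        raise ValueError(f"{sku}: price {price_raw!r} is not a number") from None
    if not price.is_finite() or price <= 0:
        raise ValueError(f"{sku}: price must be a positive amount")

    currency = raw.get("currency")
    if currency != customer_currency:
        raise ValueError(f"{sku}: currency {currency!r} does not match {customer_currency!r}")

    qty = raw.get("min_quantity")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(qty, bool) or not isinstance(qty, int) or qty <= 0:
        raise ValueError(f"{sku}: quantity must be a positive integer")

    return PriceRow(sku=sku, unit_price=price, currency=currency, min_quantity=qty)
```

A static type checker can then flag any code path that handles a bare dictionary where a `PriceRow` is expected.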

End-to-end tests assert the rules that span the whole catalogue: that a contract price always wins over a list price, that a discontinued part can never be ordered, that a feed import with bad rows fails loudly instead of silently corrupting live data. These are the invariants that, once broken, break thousands of listings at once.
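The "fails loudly" rule can be sketched as an all-or-nothing import: every row is validated into a staging area first, and the live data is only replaced if no row failed. `import_feed`, `FeedImportError` and the row keys `sku` and `price` are hypothetical names for this example.

```python
from decimal import Decimal, InvalidOperation


class FeedImportError(Exception):
    """Raised when any row in a feed is invalid; carries every row error."""

    def __init__(self, errors: list[str]):
        super().__init__(f"{len(errors)} bad row(s); nothing was imported")
        self.errors = errors


def import_feed(rows: list[dict], live: dict[str, Decimal]) -> dict[str, Decimal]:
    """Return a new price table, or raise without touching `live`."""
    staged: dict[str, Decimal] = {}
    errors: list[str] = []
    for n, row in enumerate(rows, start=1):
        try:
            sku = row["sku"]
            price = Decimal(str(row["price"]))
            if not price.is_finite() or price <= 0:
                raise ValueError("price must be a positive amount")
            staged[sku] = price
        except (KeyError, ValueError, InvalidOperation) as exc:
            errors.append(f"row {n}: {exc!r}")

    if errors:
        # Reject the whole feed rather than apply the rows that happened to parse.
        raise FeedImportError(errors)
    return {**live, **staged}
```

An end-to-end test for the invariant then feeds in one deliberately bad row and asserts both that the import raises and that the live table is unchanged.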

Review-friendly architecture keeps the pricing and catalogue logic in explicit, single-purpose modules instead of scattered through templates. When a contract rule changes, the diff is small and the reviewer can see exactly what moved. That is how you change pricing on a live platform without holding your breath.

AI helps at the edges, not at the core

Catalogue work has obvious AI-shaped jobs: normalising messy supplier descriptions, matching duplicate parts, flagging suspicious price changes for a human to confirm. These pay off because a person was always going to check the result anyway, so the verification-cost test comes out in their favour.

What does not pay off is letting a probabilistic model set a binding price unsupervised. A confident-but-wrong answer there is not a typo; it is a commitment. AI belongs on the suggestions and the cleanup, with the deterministic rules — and the tests that guard them — owning the number a buyer is actually charged.
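One way to encode that split is sketched below, under assumed names (`Suggestion`, `PROTECTED_FIELDS`, `apply_suggestions`): model output is stored only as a suggestion, it is applied only after a named person approves it, and price fields cannot be changed through this path at all.

```python
from dataclasses import dataclass
from typing import Optional

# Fields that only the deterministic pricing rules may write (assumed list).
PROTECTED_FIELDS = frozenset({"unit_price", "currency"})


@dataclass(frozen=True)
class Suggestion:
    sku: str
    field: str
    proposed_value: str
    approved_by: Optional[str] = None  # set only after a human confirms


def apply_suggestions(
    catalogue: dict[str, dict[str, str]],
    suggestions: list[Suggestion],
) -> tuple[dict[str, dict[str, str]], list[Suggestion]]:
    """Apply approved, non-price suggestions; return (new catalogue, rejected)."""
    updated = {sku: dict(fields) for sku, fields in catalogue.items()}
    rejected: list[Suggestion] = []
    for s in suggestions:
        if (
            s.field in PROTECTED_FIELDS   # a model never sets a binding price
            or s.approved_by is None      # nothing is applied unreviewed
            or s.sku not in updated
        ):
            rejected.append(s)
            continue
        updated[s.sku][s.field] = s.proposed_value
    return updated, rejected
```

The model's confidence never enters the decision; only the approval and the field name do.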

Use AI to propose; use the verified rules to decide.

The same discipline, our own books

The 300,000-product platform is the client proof: a four-year engagement, Upwork-verified, still running in production on AWS. We also practise the approach on our own product, not just describe it.

MushRoom runs multi-tenant invoicing and tax reporting where, again, the data has to be exactly right for every tenant or someone files the wrong figure. It deploys daily on the same stack — types, end-to-end tests, small reviewable changes — that lets us touch live financial data without a release-night ritual. The product is incidental. The discipline is the point.

Scaling a catalogue past the point you can eyeball it?

We build B2B commerce where correctness is enforced in code — pricing rules, feed integrity, and the tests that keep three hundred thousand products honest.