Build vs. Buy: The Loyalty Ledger Decision

What this covers

1. Why a correct loyalty ledger is genuinely hard
2. The failure modes that bite you in production
3. The GDPR problem nobody plans for
4. The true cost of building in-house
5. What “buying” actually buys you
6. An honest framework: when to build, when to buy

A loyalty programme looks, from the outside, like a spreadsheet. A customer places an order, they earn some points, the points become a balance, and eventually they redeem that balance for credit. How hard can it be? You could prototype the happy path in an afternoon: an orders table, a points integer on the customer row, an UPDATE customers SET points = points + 100 on each order. Ship it.

That prototype is also exactly how loyalty programmes end up in the news for the wrong reasons — balances that drift, points that vanish, customers who notice the discrepancy before you do, and a finance team that cannot reconcile what was issued against what was redeemed. The gap between the afternoon prototype and a system you can stake a brand and a balance sheet on is enormous, and almost all of it is invisible until you are already in production with real money on the line.

This article is the case for taking that gap seriously. It is written from the perspective of having built the thing — not to scare you off building, but to make sure that if you build, you build with your eyes open, and if you buy, you know precisely what you are buying.

1. Why a correct loyalty ledger is genuinely hard

Loyalty points are a liability. The moment you issue them, you owe something. That single fact moves the problem out of the domain of “CRUD app” and into the domain of accounting systems — and accounting systems have rules that web developers rarely internalise until they get burned.

The balance must be derived, not stored

The naive design keeps a mutable balance field and increments it. This is the original sin of loyalty engineering. A mutable balance has no history: when it is wrong — and it will be wrong, because of a race condition, a partial failure, a buggy migration, or a manual fix gone sideways — you have no way to know why, and no way to reconstruct what it should have been.

The correct model is an immutable, append-only ledger. Every event — an earn, a points-to-credit conversion, a redemption, an expiry, a void, a manual adjustment — writes one or more rows that are never updated and never deleted. Each row records a signed amount, a type, a reference to what caused it, and a timestamp. The balance is not a field you trust; it is a query — the sum of the ledger, optionally accelerated by periodic snapshots so you are not summing billions of rows on every read. If the balance is ever disputed, you replay the ledger. The truth is always recoverable because the truth was never overwritten.

This is the same discipline double-entry bookkeeping has enforced for five centuries: you do not erase a wrong number, you post a correcting entry. A loyalty ledger that allows UPDATE on a balance is a loyalty ledger that cannot be audited.
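As a minimal sketch of this model, assuming SQLite as a stand-in store, the example below keeps an append-only table, rejects UPDATE and DELETE with triggers, and derives the balance with a query. The table name `ledger_entries`, its columns, and the helpers `open_ledger`, `post`, and `balance` are illustrative assumptions, not a prescribed schema; snapshots are left out for brevity.

```python
import sqlite3


def open_ledger() -> sqlite3.Connection:
    """Create an in-memory ledger whose rows can be inserted but never changed."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE ledger_entries (
            id          INTEGER PRIMARY KEY AUTOINCREMENT,
            customer_id TEXT    NOT NULL,
            amount      INTEGER NOT NULL,  -- signed: positive earns, negative spends
            entry_type  TEXT    NOT NULL,  -- 'earn', 'redeem', 'adjust', ...
            reference   TEXT    NOT NULL,  -- what caused the entry
            created_at  TEXT    NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        -- Enforce immutability in the database itself, not by convention.
        CREATE TRIGGER ledger_no_update BEFORE UPDATE ON ledger_entries
        BEGIN SELECT RAISE(ABORT, 'ledger entries are immutable'); END;
        CREATE TRIGGER ledger_no_delete BEFORE DELETE ON ledger_entries
        BEGIN SELECT RAISE(ABORT, 'ledger entries are immutable'); END;
        """
    )
    return conn


def post(conn, customer_id: str, amount: int, entry_type: str, reference: str) -> None:
    """Append one signed entry; this is the only way the ledger ever changes."""
    with conn:
        conn.execute(
            "INSERT INTO ledger_entries (customer_id, amount, entry_type, reference) "
            "VALUES (?, ?, ?, ?)",
            (customer_id, amount, entry_type, reference),
        )


def balance(conn, customer_id: str) -> int:
    """The balance is a query over the ledger, not a stored field."""
    row = conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM ledger_entries WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0]


conn = open_ledger()
post(conn, "cust-1", 100, "earn", "order-1001")
post(conn, "cust-1", 50, "earn", "order-1002")    # posted in error
post(conn, "cust-1", -50, "adjust", "ticket-77")  # corrected by a new entry, not an edit
print(balance(conn, "cust-1"))
```

The mistaken earn stays in the history alongside its correction, so a later dispute can be answered by replaying the rows.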

Ingestion has to be idempotent

Orders arrive over a network, and networks are unreliable. Your commerce platform's webhook fires, your endpoint processes it and awards points, but the response times out before the acknowledgement gets back. The commerce platform, having seen no acknowledgement, retries. Now you have awarded points twice for one order. At scale, with batch ingestion and at-least-once delivery semantics on the event bus, this is not an edge case — it is a daily occurrence.

The only defence is idempotent ingestion keyed on a stable order identifier. The first time you see orderId, you process it; every subsequent time, you recognise it and return the original result without writing to the ledger again. This sounds trivial and is not: the idempotency check and the ledger write must be atomic with respect to each other, or two concurrent retries will both pass the “have I seen this?” check before either writes. Getting this right under concurrency is where a lot of in-house implementations quietly fail.
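One common way to make the check and the write atomic is to let a uniqueness constraint do the checking, so there is no separate read that two retries can both pass. The sketch below assumes SQLite and a hypothetical `ingest_order` function; the schema and names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ledger_entries (
        id          INTEGER PRIMARY KEY,
        customer_id TEXT    NOT NULL,
        amount      INTEGER NOT NULL,
        entry_type  TEXT    NOT NULL,
        order_id    TEXT    NOT NULL,
        UNIQUE (entry_type, order_id)  -- at most one earn per order
    );
    """
)


def ingest_order(conn, order_id: str, customer_id: str, points: int) -> int:
    """Award points for an order at most once; return the points on record."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            conn.execute(
                "INSERT INTO ledger_entries (customer_id, amount, entry_type, order_id) "
                "VALUES (?, ?, 'earn', ?)",
                (customer_id, points, order_id),
            )
        return points
    except sqlite3.IntegrityError:
        # Duplicate delivery: the constraint rejected the write, so nothing
        # was added. Return what the first delivery recorded.
        row = conn.execute(
            "SELECT amount FROM ledger_entries "
            "WHERE entry_type = 'earn' AND order_id = ?",
            (order_id,),
        ).fetchone()
        return row[0]


first = ingest_order(conn, "order-1001", "cust-1", 100)
retry = ingest_order(conn, "order-1001", "cust-1", 100)  # webhook retried
total = conn.execute("SELECT SUM(amount) FROM ledger_entries").fetchone()[0]
print(first, retry, total)
```

Because the database enforces the constraint at write time, two concurrent retries cannot both succeed: one insert wins and the other takes the duplicate path.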

Points-to-credit conversion must be atomic

Most programmes do not redeem points directly; points accrue, and once they cross a threshold they convert into spendable credit (a currency-equivalent balance). That conversion is the most dangerous operation in the system because it spans two ledgers: it debits points and credits currency in the same logical breath. If the points debit commits and the credit grant fails — process crash, deadlock, network blip — the customer has lost points and gained nothing. If it happens the other way, you have given away credit for free.

This demands a real transaction boundary with complete-unit semantics: you convert whole units only, both legs commit together or neither does, and the operation is itself idempotent so a retry after an ambiguous failure cannot double-convert. You are now doing distributed-transaction reasoning inside what your PM still thinks is a marketing feature.
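The sketch below illustrates those three properties together, assuming both ledgers live in one SQLite database so a single local transaction can cover both legs. The conversion rate, the table names, the `convert` function, and the `crash_between_legs` test hook are all assumptions made for illustration. If the two ledgers lived in separate systems, a local transaction like this would not be enough.

```python
import sqlite3

POINTS_PER_UNIT = 500  # assumed rate: 500 points make one unit of credit
CREDIT_PER_UNIT = 500  # assumed value of one unit, in minor currency units

# isolation_level=None: transactions are opened and closed explicitly below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript(
    """
    CREATE TABLE points_ledger (customer_id TEXT NOT NULL, amount INTEGER NOT NULL,
                                reference TEXT NOT NULL);
    CREATE TABLE credit_ledger (customer_id TEXT NOT NULL, amount_minor INTEGER NOT NULL,
                                reference TEXT NOT NULL);
    CREATE TABLE conversions   (idempotency_key TEXT PRIMARY KEY,
                                customer_id TEXT NOT NULL, units INTEGER NOT NULL);
    """
)


def convert(conn, customer_id: str, key: str, crash_between_legs: bool = False) -> int:
    """Convert whole units of points into credit; return the units converted."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before reading the balance
    try:
        row = conn.execute(
            "SELECT units FROM conversions WHERE idempotency_key = ?", (key,)
        ).fetchone()
        if row is not None:
            units = row[0]  # retry of a conversion that already committed
        else:
            (points,) = conn.execute(
                "SELECT COALESCE(SUM(amount), 0) FROM points_ledger WHERE customer_id = ?",
                (customer_id,),
            ).fetchone()
            units = points // POINTS_PER_UNIT  # whole units only
            conn.execute("INSERT INTO conversions VALUES (?, ?, ?)", (key, customer_id, units))
            if units > 0:
                conn.execute(
                    "INSERT INTO points_ledger VALUES (?, ?, ?)",
                    (customer_id, -units * POINTS_PER_UNIT, key),
                )
                if crash_between_legs:
                    raise RuntimeError("simulated crash after the points debit")
                conn.execute(
                    "INSERT INTO credit_ledger VALUES (?, ?, ?)",
                    (customer_id, units * CREDIT_PER_UNIT, key),
                )
        conn.execute("COMMIT")
        return units
    except BaseException:
        conn.execute("ROLLBACK")  # neither leg survives a failure
        raise


def points_balance(conn, customer_id: str) -> int:
    return conn.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM points_ledger WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()[0]


def credit_balance(conn, customer_id: str) -> int:
    return conn.execute(
        "SELECT COALESCE(SUM(amount_minor), 0) FROM credit_ledger WHERE customer_id = ?",
        (customer_id,),
    ).fetchone()[0]


conn.execute("INSERT INTO points_ledger VALUES ('cust-1', 1200, 'order-1001')")
```

Calling `convert(conn, "cust-1", "conv-1")` twice converts two units once: the second call finds the recorded key and returns the original result without touching either ledger.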

Cancellations and refunds must reverse cleanly

An order that earned points gets refunded. You must claw back the points — but the customer may have already spent them. Now what? You cannot delete the original earn (the ledger is immutable, correctly). You post a compensating reversal entry, proportional to the refunded amount, and you have to decide and encode policy for what happens when the reversal would drive the balance negative: do you allow a negative balance, claw back from future earnings, or write it off? Each choice has accounting consequences, and the ledger has to represent all of them faithfully and reversibly.
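To make the policy question concrete, here is a small sketch of how a proportional reversal might be computed and how two of the policies differ in what they post. Everything in it is an assumption for illustration: the `Entry` shape, the `NegativePolicy` names, the floor rounding, and the choice to track refunds cumulatively so that several partial refunds never reverse more than the order earned.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class NegativePolicy(Enum):
    ALLOW_NEGATIVE = "allow_negative"  # balance may go below zero
    WRITE_OFF = "write_off"            # the business absorbs the shortfall


@dataclass(frozen=True)
class Entry:
    amount: int       # signed points
    entry_type: str
    reference: str


def reversal_entries(
    earned_points: int,
    order_total_minor: int,
    refunded_to_date_minor: int,  # includes the refund being processed now
    already_reversed: int,        # points reversed by earlier partial refunds
    balance: int,                 # customer's current points balance
    policy: NegativePolicy,
    reference: str,
) -> List[Entry]:
    """Return the compensating entries to append; the original earn is untouched."""
    # Work from the cumulative refund so rounding cannot over-reverse.
    target = earned_points * refunded_to_date_minor // order_total_minor
    to_reverse = target - already_reversed
    if to_reverse <= 0:
        return []
    entries = [Entry(-to_reverse, "refund_reversal", reference)]
    shortfall = to_reverse - max(balance, 0)
    if policy is NegativePolicy.WRITE_OFF and shortfall > 0:
        # Record the full reversal and the write-off as separate entries,
        # so the ledger shows both what was owed and what was forgiven.
        entries.append(Entry(shortfall, "write_off", reference))
    return entries


# Order of 100.00 earned 100 points; 25.00 is refunded; the customer has 10 points left.
negative = reversal_entries(100, 10_000, 2_500, 0, 10, NegativePolicy.ALLOW_NEGATIVE, "refund-1")
written_off = reversal_entries(100, 10_000, 2_500, 0, 10, NegativePolicy.WRITE_OFF, "refund-1")
print(negative)
print(written_off)
```

Under the first policy the balance ends at -15; under the second it ends at 0, with the 15 forgiven points visible as their own entry rather than silently dropped.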

2. The failure modes that bite you in production

Every shortcut in the design above maps to a specific, named way the system fails once real traffic hits it. These are the ones that generate support tickets, finance escalations, and the occasional regulator letter:

Double-counting. Non-idempotent ingestion plus retries equals points awarded multiple times for one order. Customers rarely report being over-credited, so this inflates your liability silently until an audit finds it.
Lost points. A conversion or redemption that debits one side and fails to credit the other. Customers absolutely report this, loudly, and it erodes trust in the programme faster than almost anything else.
Balance drift. A mutable balance that, through accumulated race conditions and partial failures, no longer equals the sum of what was issued and spent. By the time you notice, you cannot tell which transactions were wrong.
Unreconcilable books. Finance asks “how much loyalty liability is outstanding, and does it tie out to what we issued?” and there is no clean answer because the data model was never designed to answer it.
Phantom redemptions under concurrency. A customer with exactly enough balance fires two redemptions at once; both read the balance, both see sufficient funds, both succeed. You just let them spend the same points twice.
Refund reversal gaps. Points clawed back inconsistently — or not at all — when orders are cancelled, leaving liability on the books for revenue that no longer exists.

Notice that every one of these is a correctness failure, not a performance failure. You cannot test your way out of them with load testing; you have to design them out from the start with the right data model, transaction boundaries, and idempotency guarantees. That design work is the hard, unglamorous core of a loyalty ledger, and it is most of what you are actually deciding to build or buy.

Reconciliation and audit are not optional

If points are a liability, someone in finance owns that liability, and at some point an auditor will ask to see it substantiated. A loyalty system that cannot produce, on demand, a complete and tamper-evident history of every entry (who earned what, when, why, and how it was spent or expired) is a system that cannot be audited, which in a regulated context means it cannot be used. Retrofitting this onto a mutable-balance design after the fact is a project in its own right. Building it in from day one, as a write-once audit trail with legal-hold semantics, is the only sane path, and it is real engineering.
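One well-known technique for tamper evidence is a hash chain, in which each entry's hash covers the hash of the entry before it. The sketch below shows the idea only; it is not a claim about how any particular platform implements its audit trail, and the names `append`, `verify`, and `GENESIS` are illustrative.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, entry: dict) -> str:
    """Hash the entry together with the previous hash, using a stable encoding."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append(chain: list, entry: dict) -> None:
    """Add an entry whose hash commits to everything that came before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"entry": entry, "prev_hash": prev, "hash": entry_hash(prev, entry)})


def verify(chain: list) -> Optional[int]:
    """Return the index of the first broken link, or None if the chain is intact."""
    prev = GENESIS
    for index, link in enumerate(chain):
        if link["prev_hash"] != prev or link["hash"] != entry_hash(prev, link["entry"]):
            return index
        prev = link["hash"]
    return None


chain: list = []
append(chain, {"customer": "c-9f3a", "amount": 100, "type": "earn", "ref": "order-1001"})
append(chain, {"customer": "c-9f3a", "amount": -30, "type": "redeem", "ref": "redeem-1"})
append(chain, {"customer": "c-9f3a", "amount": 20, "type": "earn", "ref": "order-1002"})
print(verify(chain))
```

A chain like this only detects edits if the latest hash is also recorded somewhere the ledger's own writers cannot change; otherwise someone with write access could recompute the whole chain.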

3. The GDPR problem nobody plans for

Here is the constraint that turns a hard problem into a genuinely subtle one, and the one in-house builds almost never anticipate: the immutable ledger collides head-on with the right to erasure.

GDPR Article 17 gives an EU customer the right to have their personal data deleted. Your loyalty ledger is, by deliberate design, append-only and immutable — you must never delete entries, because deleting them destroys the financial audit trail and breaks every balance derived from it. These two requirements appear flatly contradictory. Many teams discover the contradiction only when the first erasure request lands, and then they either break their ledger or fail their compliance obligation.

The resolution is cryptographic erasure, and it is worth understanding because it is the single best example of why this domain is harder than it looks. You separate two kinds of data:

Personal data (name, email, phone, date of birth) is encrypted at rest with a per-customer data encryption key (DEK), which is itself wrapped by a per-tenant key encryption key (KEK) held in a key vault.
The ledger contains no personal data — only pseudonymous identifiers, signed amounts, types, and timestamps.

When an erasure request arrives, you destroy the customer's DEK. The encrypted personal data instantly becomes unrecoverable ciphertext — erased, for every legal and practical purpose — while the ledger is never touched. Balances still reconcile, audits still pass, the financial history remains intact and tamper-evident, and the person is genuinely gone. You have satisfied Article 17 without violating the immutability that makes the ledger trustworthy.
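The flow can be sketched in a few lines. To stay self-contained, the example below uses a toy cipher built from HMAC-SHA256 as a stand-in for a real authenticated cipher such as AES-GCM, and an in-memory dictionary as a stand-in for the key vault. Neither is suitable for production, and the `PiiVault` class and its methods are hypothetical names used only to show the shape of envelope encryption and key destruction.

```python
import hashlib
import hmac
import secrets
from typing import Optional


def _toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """ILLUSTRATION ONLY: XOR with an HMAC-SHA256 counter keystream.

    Stand-in for a vetted authenticated cipher. Applying it twice with the
    same key and nonce returns the original bytes.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        out.extend(b ^ k for b, k in zip(data[offset:offset + 32], block))
    return bytes(out)


class PiiVault:
    """Personal data encrypted under a per-customer DEK, itself wrapped by a KEK."""

    def __init__(self, kek: bytes) -> None:
        self._kek = kek          # assumed to live in a key vault in a real system
        self._wrapped_deks = {}  # customer_ref -> (nonce, wrapped DEK)
        self._records = {}       # customer_ref -> (nonce, ciphertext)

    def store(self, customer_ref: str, personal_data: str) -> None:
        dek = secrets.token_bytes(32)
        wrap_nonce = secrets.token_bytes(16)
        self._wrapped_deks[customer_ref] = (wrap_nonce, _toy_cipher(self._kek, wrap_nonce, dek))
        data_nonce = secrets.token_bytes(16)
        ciphertext = _toy_cipher(dek, data_nonce, personal_data.encode("utf-8"))
        self._records[customer_ref] = (data_nonce, ciphertext)

    def read(self, customer_ref: str) -> Optional[str]:
        wrapped = self._wrapped_deks.get(customer_ref)
        if wrapped is None:
            return None  # no key: the ciphertext can no longer be decrypted
        wrap_nonce, wrapped_dek = wrapped
        dek = _toy_cipher(self._kek, wrap_nonce, wrapped_dek)
        data_nonce, ciphertext = self._records[customer_ref]
        return _toy_cipher(dek, data_nonce, ciphertext).decode("utf-8")

    def erase(self, customer_ref: str) -> None:
        """Handle an erasure request by destroying the key, not the data rows."""
        self._wrapped_deks.pop(customer_ref, None)


# The ledger holds only a pseudonymous reference and amounts.
ledger = [("c-9f3a", 100, "earn"), ("c-9f3a", -30, "redeem")]

vault = PiiVault(kek=secrets.token_bytes(32))
vault.store("c-9f3a", "Ada Example <ada@example.com>")
before = vault.read("c-9f3a")
vault.erase("c-9f3a")
after = vault.read("c-9f3a")
ledger_total = sum(amount for _, amount, _ in ledger)
print(before, after, ledger_total)
```

The ledger total is the same before and after erasure because the ledger was never involved. In practice the key would also have to be destroyed wherever copies of it exist, such as backups and caches, for the erasure to hold.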

This is the kind of requirement that does not appear in the original ticket, is impossible to bolt on later without re-architecting, and is non-negotiable if you operate in or sell to the EU. It is also a perfect litmus test: ask any “we'll just build it” team how they will handle a right-to-erasure request against an immutable ledger. The quality of the answer tells you whether they have understood the problem.

Adjacent to this sits a whole family of privacy obligations that are equally easy to underestimate: keeping personal data off the event bus entirely (so it is not silently copied into every downstream consumer and log), scrubbing PII from payloads on a schedule, fulfilling data-subject-access requests, and producing a defensible data-portability export. None of these are exotic. All of them are real engineering that has to be designed in, not sprinkled on.

4. The true cost of building in-house

Engineers are optimists about scope, which is exactly why build-vs-buy decisions go wrong. The cost of building a loyalty ledger is not the cost of the happy path; it is the cost of correctness, compliance, and operating the thing for years. Let us be concrete about where the time actually goes.

The build

The ledger core — immutable append-only model, snapshotting for balance-query performance, atomic transaction grouping, idempotent ingestion. This is months, not weeks, to get correct under concurrency, with the test suite to prove it.
Points-to-credit conversion — atomic, idempotent, complete-unit, with the reversal and refund logic that goes with it.
A rules layer — because the business will, on day two, want earn rates that vary by product, by tier, by campaign, by day of week. Hard-coding these means a deploy for every promotion. A safe, governed rules engine is itself a substantial subsystem.
The GDPR architecture — envelope encryption, cryptographic erasure, a PII-free event bus, DSAR and export tooling.
Reporting — finance needs liability reports, marketing needs programme analytics, both need it without hammering the transactional system.
The operational surface — admin tooling for manual adjustments (with the audit trail and approval controls that make manual adjustments safe), webhook delivery with signing and retries, idempotency-key storage, monitoring for balance drift and reconciliation breaks.

Realistically, a competent team is looking at many engineering-months to reach a first production-grade version, and “production-grade” here specifically means it survives concurrency, retries, refunds, audits, and an erasure request. The afternoon prototype is genuinely an afternoon. The trustworthy system is a serious project owned by people who understand financial-grade data integrity.

The part that never ends

The build is the cheap half. The expensive half is everything after launch:

Maintenance and on-call. A financial system that runs 24/7 needs people who can be woken up when reconciliation breaks at 2am, and who understand the ledger well enough to fix it without making it worse.
Compliance drift. GDPR interpretation evolves, you expand into new regions with new data-residency rules, SOC 2 or PCI scope questions arrive with your first enterprise customer. Each is ongoing work, not a one-time box-tick.
Audit support. Every audit cycle consumes engineering time to produce evidence, answer questions, and remediate findings — against a system that has to keep running while you do it.
The opportunity cost. Every one of those engineering-months is a month your best people are not building the thing that actually differentiates your business. Loyalty ledger correctness is table stakes; it wins you nothing competitively. It is pure cost-of-correctness, and you are paying it forever.

5. What “buying” actually buys you

Buying a loyalty platform is not buying a feature; it is buying away an entire category of risk and a permanent maintenance liability. What you are actually purchasing is:

Correctness as a guarantee — an immutable ledger, idempotent ingestion, atomic conversion, and clean refund reversal that someone else has already battle-tested across many tenants and many edge cases you have not thought of yet.
Compliance as a product — cryptographic erasure, a PII-free event bus, DSAR tooling, data-residency options, and the audit evidence that comes with them, maintained as the regulatory landscape shifts.
Time-to-value — provisioning in minutes against months of build, which for most businesses is the difference between launching this quarter and launching next year.
Someone else's on-call — the 2am reconciliation break is the vendor's problem, backed by an SLA, not your engineer's.

The honest catch is that buying means accepting a platform's model of the world and, with a headless platform, still building your own customer-facing experience on top of its API. You trade some control for a great deal of risk and time. For most teams, most of the time, that is a trade worth making — but not all teams, and not all the time, which is the whole point of the next section.

6. An honest framework: when to build, when to buy

Anyone selling you a platform who tells you to always buy is not being straight with you. There are real cases for building. Here is the honest version.

Lean toward building when…

The loyalty ledger is your core product, not a supporting feature. If you are a loyalty-tech company, a payments network, or a coalition operator whose entire value proposition is the ledger, you should own it. It is your moat, and outsourcing your moat is a mistake.
You have genuinely unusual requirements that no platform models — an exotic multi-party settlement scheme, a regulatory regime with no off-the-shelf answer, or integration constraints that make any external system a non-starter.
You have the team and the appetite — engineers who understand financial-grade data integrity, the organisational willingness to fund maintenance and compliance for years, and a clear-eyed acceptance that this is a permanent cost centre, not a one-off project.

Lean toward buying when…

Loyalty is a means, not the end. You are a retailer, a brand, a marketplace — loyalty drives retention but it is not what you sell. Every engineering-month spent on ledger correctness is a month not spent on your actual product.
You need to ship this quarter. The business wants a programme live before the holiday season, not a multi-quarter platform project.
You operate in or sell to the EU. The GDPR-versus-immutable-ledger problem alone is a strong signal: if you have to ask how cryptographic erasure works, you are better served by a platform that has already solved it.
Audit and compliance matter and you do not want to own them. A bought platform that already carries the controls and evidence is dramatically cheaper than building and maintaining your own.

The one-line test: if a correct, immutable, GDPR-compliant ledger is a competitive advantage for you, build it. If it is merely a correctness obligation you cannot afford to get wrong, buy it — and spend the months you save on the thing that actually makes you money.

LoyaltyOS exists because we believe most teams are in the second camp and have been quietly paying the first camp's costs by accident. The ledger model described in this article — immutable, append-only, snapshot-accelerated, with idempotent ingestion, atomic points-to-credit conversion, clean refund reversal, and cryptographic erasure that satisfies Article 17 without breaking the audit trail — is not a roadmap for us. It is what the platform is. If you want to compare the two paths in numbers, or just see what you would have had to build, the links below are the next step.

See what you would have had to build — already built.
