Merchant Trust Layer · for Vince · confidential

The trust layer for agentic commerce.

An honest read: what we've built, what we've proven, what AI agents can and can't do yet, and the two hard problems left. One of them is yours.

The shift

Commerce is moving to agents.

Soon the agent does the buying, not the human. Faced with a thousand merchants for one product, it has to answer one question:

You (say what you want) → Your agent (does the buying) → 1,000 merchants (one product) → Which to trust? (the question that decides the sale)

Right now

The rails for agent commerce are being built - by everyone.

Discovery

Google UCP (Universal Commerce Protocol, Jan 2026) - the open standard for agents to find products and talk to merchants. Shopify, Stripe, PayPal, Visa & Mastercard in the ecosystem.

Payment

Google AP2 + OpenAI & Stripe's ACP - how an agent gets authorised to pay, and checkout inside ChatGPT.

Agent identity

Visa Trusted Agent · Mastercard Agent Pay - verifying the buyer's agent is legitimate, and the merchant's identity.

Billions in backing, shipping now. Every layer answers "who is the buyer, and can they pay?" - none answers "will this merchant actually deliver?"

One-sided

They verified the buyer - not the seller.

The buyer's agent - verified
Every protocol checks it - Google UCP, Visa TAP, Mastercard Agent Pay, OpenAI / Stripe ACP. "Is this bot allowed to pay?" Answered.
The seller - unanswered
"Will this merchant actually deliver?" None of them ask. The other half of trust simply isn't built.

Verifying the buyer's agent is necessary - and they've done it well. But trust takes two: the agent and the merchant - and only one side is built. "None of the protocols address merchant reliability or fulfilment" - confirmed across all six, from their own specs.

The gap

Trust is handled today - by proxies.

Each is a stand-in for the thing that actually matters, and each breaks the moment an agent can check:

Reviews & star ratings: gameable sentiment
A recognised brand name: a name - and eroding
"Google Merchant" / "Stripe" badges: identity, not delivery
Measured reliability (does it actually ship, refund, resolve?): still empty

They wired the whole house - discovery, payment, agent-identity - and left one socket with nothing in it: does the merchant actually deliver.

Why there's room

The giants are centralising trust - and siloing it.

Every dominant player is building its own walled garden of merchant data and locking it inside its own stack. That leaves a gap right in the middle:

Inside the silos
Google, Amazon: each platform scores merchants for its own agents, inside its own walls. No rival agent trusts a competitor's say-so - and none is neutral.
The grey in between
A neutral, cross-platform layer no silo owns - the one trust signal every agent can read, precisely because no platform controls it.

Neutrality isn't a nice-to-have - it's the position the giants structurally can't occupy. That's the room we're built for.

Why merchants want this

We earned a reputation and Google deleted it.

Google ties a Business Profile - and its reviews - to a physical storefront. KAAL went online-only, and both vanished with it.

KAAL
Lingerie store · Cape Town
★★★★★
4.8 · 127 Google reviews
● Visible · trusted · choosable
closed the shopfront
KAAL
Online store · no address
★★★★★
4.8 · 127 reviews
✕ No physical location → profile & reviews gone

Trust was pinned to a shopfront, not to whether KAAL delivers - so closing the shop erased it, and a whole sales channel with it. Merchants want reputation tied to performance: owned, portable, impossible to revoke.

The build

A reputation the merchant owns.

Merchant Reliability Record · signed & verifiable
99 · reliability
Order fulfilment: 97.96% measured
Dispute rate: 0.063% ✓ Stripe
Refund rate: 1.8% measured
Order history: 42 months
Ed25519-signed · issued by independent sources · owned by the merchant

Built from real sales, deliveries, refunds and disputes over time - measured from what actually happened, not declared by the merchant. Hard to game: the facts are attested by independent sources - you can't fake what Stripe signs. Graded, so an agent can choose between merchants.

What we did

We didn't theorise it. We tested it.

Built it on a real store

An engine on KAAL's real data - 42 months, 3,183 orders, 97.96% fulfilment, 0.06% disputes - turning real outcomes into a signed, tamper-evident record. No mockups.

Tested it adversarially

Put it in front of real AI agents (Claude, GPT-4o, Gemini) choosing between four real merchants. Pre-registered. Each test harder and more adversarial than the last.

Six rounds, then a seventh: show the signal, make them find it, make them verify it, test whether they care who signed it - and whether a verifiable nobody beats a trusted name.

What we found

Agents can't tell looks-verified from is-verified.

The experiments landed on one thing: agents reward whatever looks independently verified - they can't yet tell a real credential from a convincing fake. That cuts both ways, and it's the whole game:

The problem
A fake that looks legit works nearly as well as the real thing - so trust is gameable, and cheap badges still win.
The opening
Be the one that's actually verifiable - and win the moment agents can tell the difference. The capable models already reject the fake; every generation gets stricter.

The whole bet: build the trust that survives the day agents stop being fooled - and that day is coming fast.

What we proved · result

We put it in front of real agents. Here's what they did.

The appearance moves them

A signal that looks independently verified took the pick from 0% to 92%; the same numbers, self-declared, got 3%. It's the appearance of verification doing the work.

They read the content

Bad numbers repelled every model - 0%. Not blind badge-love; they read what it says.

They can't tell real from fake

A forgery with the best numbers fooled the weak model 93% of the time; only the capable models, given a way to check, rejected it (0%).

But the curve is real

Each generation discriminates harder - a verifiable fact-chain beat a recognised name 100% vs 31% on the models that matter.

Honest scope: a controlled sandbox, the LLMs behind real agents as proxies, the counterparty signing simulated. Directional and pre-registered - not live agents on the open web.

Per agent

The average hides a spectrum.

"0 to 92%" is the pooled number - but the three models behaved very differently (Claude 75%, GPT-4o and Gemini 100%), and the gap between them is the capability curve, visible right now:

Claude · hardest to fool

Rewarded only what it could verify. Gave an unknown name, a forgery, and a shiny-but-irrelevant badge all 0%. The bellwether for where agents are heading.

Gemini · in transition

Mixed - fell for a shiny badge, but on the structural test rejected the name and the forgery (0%) and trusted only the verifiable chain (100%).

GPT-4o · easiest to fool

Trusted a made-up name 93% and a forgery 93%. The "nominal trust" foil - it shows what looks-verified gets you on a weak model.

Read them left to right and you're looking at the trend: every new generation moves toward Claude - rewarding verifiable substance, refusing fakes. We build for that model.
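
The pooled 92% quoted above is consistent with a plain average of the three per-model rates. A quick check, assuming each model ran the same number of trials (the deck does not state the trial counts, so that equal weighting is an assumption):

```python
from statistics import mean

# Per-model pick rates for the verified-signal arm, as quoted above.
per_model = {"Claude": 0.75, "GPT-4o": 1.00, "Gemini": 1.00}

# Assumption: equal trial counts per model, so pooling reduces to an
# unweighted mean of the three rates.
pooled = mean(per_model.values())

print(f"pooled pick rate: {pooled:.1%}")  # rounds to the quoted 92%
```

If the trial counts differed per model, the pooled figure would need a weighted mean instead.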

What agents can do - today

More than you'd think.

Fetch and read a published signal themselves: unprompted, with no instruction to care about trust
Reward independent verification over a merchant's own claim: with the same numbers, verified moved the pick to 92%; self-declared got 3%
Reject fakes and bad numbers - the capable models, every time: a forgery with the best numbers in the test still got 0%
Corroborate a fact by checking the original public source: fetch the courier's own tracking, the registry, the cert log

What they can't do - yet

And some things they genuinely can't.

Verify a cryptographic proof in their head: that's exact maths - a language model would guess, and might say "valid" wrongly
Find a signal blind: blind discovery was 0% - it has to live where they already look, i.e. schema
Trust a name they don't recognise: an unknown issuer scored 47%, and 0% on the frontier model - brand is a game a newcomer loses

Each limit shapes the product: live in schema, lean on what they can check themselves, and never bet on being a recognised brand.

What we predict

And where they're heading.

The capability curve

Every model generation trusts a name less and demands verifiable proof more. The frontier model is already there - it trusts nothing it can't check.

The rails are arriving

The tools to verify signed facts are already shipping - for payments and agent-identity (Shopify, Cloudflare, Visa, Mastercard). Merchant reliability is the next slot.

We build for where agents are going, not where they are. The same thing that makes brand-trust fail is what makes verifiable trust necessary - and it compounds in our favour.

What that demands

So the system has to be three things.

Tool-less

It works with the one tool every agent already has - fetch - not a verifier they don't have yet. Tools are coming; we don't wait for them.

In the schema

It lives where agents always look - in the page's schema. Blind discovery was 0%; off that path, it's invisible.

Self-verifying

It communicates verification - points the agent at what it can check itself by fetching the source. Never "please believe this is verified."

Today that means corroboration the agent runs itself; as agents get verifier tools, the same record's signature pays off. Built for both - facts the agent can verify, and an identity the merchant can't shed.
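
A minimal sketch of "corroboration the agent runs itself", under stated assumptions: the record names a public source URL, a field, and a value, and the agent's only tool is fetch. The claim layout, the `corroborate` helper, and the courier URL are all hypothetical; nothing here is a published format.

```python
from typing import Callable

def corroborate(claim: dict, fetch: Callable[[str], dict]) -> bool:
    """True only if the original source itself reports the claimed value."""
    try:
        source = fetch(claim["source_url"])
    except Exception:
        return False  # an unreachable source means "unverified", never "trusted"
    return source.get(claim["field"]) == claim["value"]

# Hypothetical claim as it might appear in a published record.
claim = {
    "source_url": "https://courier.example/track/ABC123",
    "field": "status",
    "value": "delivered",
}

# Stub standing in for the agent's real fetch tool (no network access here).
def stub_fetch(url: str) -> dict:
    return {"status": "delivered"} if url == claim["source_url"] else {}

print(corroborate(claim, stub_fetch))
```

The design choice is that the record never asserts "verified"; it only points at something the agent can fetch and compare for itself.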

The ecosystem

A trust file on every merchant - thin by default, rich when claimed.

Every store starts with a thin file - a Wikidata-style record built from public data. When the merchant connects their store, we ingest the real numbers with a web-proof and it becomes a thick, measured file - published as a live, signed credential agents read and verify.

Thin file
every store, public data
Connect
merchant links Shopify / Stripe
Web-proof
real data, proven from Stripe
Thick file
measured + verified
Live credential
signed, in schema, agents read it

The thin file means the bureau is never empty - every merchant has a record before they ask, and a reason to claim it. Claiming + connecting is what turns a guess into proof.
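
One way the thin and thick files could be published in a page's schema is sketched below. `reliabilityRecord` and its field names are hypothetical, not part of schema.org, and the merchant URL is a placeholder; the measured values are the ones shown on the record card earlier in the deck.

```python
import json
from typing import Optional

def trust_file_tag(name: str, url: str, measured: Optional[dict] = None) -> str:
    """Render a merchant's trust file as a JSON-LD <script> tag."""
    record = {"tier": "thin", "source": "public data"}
    if measured is not None:
        # Claimed and connected: the thin file becomes a thick, measured one.
        record = {"tier": "thick", "source": "connected store", **measured}
    doc = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "reliabilityRecord": record,  # hypothetical property, not in schema.org
    }
    return '<script type="application/ld+json">' + json.dumps(doc) + "</script>"

thin = trust_file_tag("KAAL", "https://merchant.example")
thick = trust_file_tag("KAAL", "https://merchant.example", {
    "orderFulfilmentPct": 97.96,
    "disputeRatePct": 0.063,
    "refundRatePct": 1.8,
    "orderHistoryMonths": 42,
})
print(thick)
```

The sketch leaves out the signature and the web-proof; it only shows where the record would sit so that an agent reading the page schema can find it.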

What's required next

Two problems.

1 · Web proofs

We pull from Stripe/Shopify and prove it came from the real source, unaltered - via zkTLS / TLS notarization. No cooperation from Stripe, no trusting us. The agent verifies provenance itself. The verifiability pillar - "not a bureau."

2 · Persistent identity

Bind the record to a KYC'd entity + beneficial owners so it can't be shed or faked (anti-phoenix), ZK-private. Inherits the rails' KYC + MATCH (Mastercard's terminated-merchant list). The non-replicability pillar - the moat.

Web proofs make it trustable without a name. Persistent identity makes it stick and un-copyable. Together: an agent can trust a merchant it's never heard of.

The mechanism

The chain that replaces "trust the bureau."

Provenance
web-proof: really from Stripe, unaltered
Integrity
append-only log: not changed after
Computation
open score: agent recomputes itself
Freshness
live, re-proven on a cadence

The agent trusts "Stripe's real records, unaltered, scored by a function I ran myself" - more than "BureauX says 97%," because it checked every link instead of believing a name.
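
The Integrity and Computation links can be sketched with a simple hash chain: each log entry commits to the one before it, and the agent recomputes the metric from the logged facts rather than taking a published number. This is an illustration only. The entry layout and function names are assumptions, the facts are toy data, and a real log would also need its head hash anchored somewhere public plus the provenance proof, both of which this sketch omits.

```python
import copy
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def entry_hash(prev_hash: str, fact: dict) -> str:
    # Canonical JSON so the same fact always hashes the same way.
    payload = json.dumps(fact, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode()).hexdigest()

def append(log: list, fact: dict) -> None:
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"fact": fact, "hash": entry_hash(prev, fact)})

def verify(log: list) -> bool:
    """Integrity: every entry must hash correctly against its predecessor."""
    prev = GENESIS
    for entry in log:
        if entry["hash"] != entry_hash(prev, entry["fact"]):
            return False
        prev = entry["hash"]
    return True

def fulfilment_rate(log: list) -> float:
    """Computation: the agent recomputes the metric from the logged facts."""
    orders = [e["fact"] for e in log if e["fact"]["type"] == "order"]
    return sum(f["fulfilled"] for f in orders) / len(orders)

# Toy data, not KAAL's records.
log = []
for fulfilled in (True, True, True, False):
    append(log, {"type": "order", "fulfilled": fulfilled})

tampered = copy.deepcopy(log)
tampered[3]["fact"]["fulfilled"] = True  # rewrite history after the fact

print(verify(log), fulfilment_rate(log), verify(tampered))
```

Editing a past fact changes its hash, so `verify` fails on the tampered copy unless every later hash is rewritten too, which is what a publicly anchored head hash would expose.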

An open fork

Where the canonical truth lives - a choice we're making in the open.

All three options below share one spine: counterparty-signed facts an agent checks for itself, never trusting us. They differ only on where the record lives and how it's owned - a deliberate, still-open decision:

1 · Public ledger

Bitcoin-style. Records on a public chain - history can't be rewritten, anyone reads it. Owned via a token. Maximal neutrality; highest complexity and crypto / regulatory baggage.

2 · Signed credentials

Off-chain crypto. API-signed facts + zkTLS web-proofs + a KYC'd identity. Security is economic, not on-chain. Fast and uses today's rails; leaves "who hosts the log" open.

3 · Hybrid

Signed + a public transparency log. Fork 2's rails and economics, anchored to a publicly auditable log so no one - including us - can rewrite history. More to stand up; token optional later.

Whichever wins, the rule is the same: trust the math and the economics, not the name. We'd rather show the fork than pretend it's settled - the spine holds in all three.

The economics

A faked merchant is worthless in days.

The deepest defence isn't the check - it's that being legit is the only move that pays. A verified identity with real signed history is slow and expensive to build, and the moment it's abused the same signed record exposes it:

Costly to build

A trusted record is months of real, counterparty-signed outcomes - deliveries, low disputes, settled orders. No shortcut: the facts come from Stripe and the courier, not the merchant.

Worthless in days

Start scamming and the signed facts turn against you - disputes and failures are logged by the same counterparties. Reputation that took months to build is burned in a week.

Can't re-spawn

Identity is bound to a KYC'd entity + beneficial owners (anti-phoenix). You can't shed the burned record and re-appear clean - the rails' MATCH list follows you.

So the economics do the enforcement, not a policeman: a few days of fraud, then the asset is dead and un-rebuildable. For any real merchant, staying honest is worth more than cheating - which is what makes the signal trustworthy.

Where we've got to

The proof problem is mostly solved. The interesting part isn't.

Tractable: the web proof

For a connected merchant, off-the-shelf tools already prove the real Stripe / Shopify data is genuine and untampered - with no help from the platform. We checked: it is days of engineering work, not open-ended research.

The real invention: the record

A signed merchant reliability record that rides the same rails agents already check - and an identity that makes it un-fakeable. Nobody has built this. It's the seat.

Honest: the deep data needs the merchant's consent, and it's trust-minimised, never trustless - the agent trusts a neutral witness, never the merchant and never us.

What we still need to figure out

Four open problems.

Persistent identity
The big one. A reputation that can't be shed or faked when a bad merchant reopens under a new name - bound to a real, KYC'd identity, but privacy-preserving. This is what makes the whole thing un-copyable.
The verifier rail
Ride the signing standards agents already check (so verification is free), rather than ship a verifier nobody installs.
Coverage
A thin, observed record on every merchant before they opt in - the cold-start solver and the longitudinal moat ("watched since 20XX").
The timing bet
Agents don't fully weigh reliability yet. We're deliberately early - the wedge pays while the data moat accretes.

How we find out

The rest is experiments - and most we can run now.

The open questions aren't theory, they're tests. And the real test each time isn't "does it move the pick" - agents reward anything that looks verified - it's "does the agent reject the fake."

We can run now (sandbox)
Does each mechanism - a verifiable proof, a staked bond, multi-source corroboration - move agents AND survive faking? Does a thin, observed-only file move them at all? How does each model behave in a simulated rail? Days, not research.
Needs the real world
Each agent inside its live rail (Gemini/UCP, ChatGPT/ACP) - that's an integration, not a test. Does the signal get ingested once it's published live? A paying non-KAAL merchant. Does it hold at scale, now - the timing bet.

The behaviour tests are fully in our hands today. The live-rail and at-scale questions need integration and time - not another sandbox run.

What we do next

The lead bet is a stack - built together.

Not one thing, and not four competing things - three layers you build together, plus a trend. The competing options live inside the layers, mostly the mechanism.

1A · Mechanism

How the data's made verifiable. Variants: direct read + sign (cheapest, built) → counterparty-signed → self-hosted zkTLS → TEE. The real choices live here.

1B · Identity

The merchant's passport. did:web (built) → + KYC'd entity (anti-phoenix) → + cross-platform resolution.

1C · Delivery

Where the signal lives, per rail. schema / JSON-LD → .well-known → UCP manifest (Gemini) → ACP context (ChatGPT).

1D · Capability curve

Not a build - measured by running 1A/1B/1C across model versions and charting skepticism vs capability.

Mechanism + identity + delivery, built together; the curve is the trend they ride.

The first move

One experiment tests the whole stack.

Stripe key
a restricted read key
Real credential
direct read + sign (1A-i)
Publish
in schema (1C) + did:web (1B)
Run agents
rail-legibility + control arms, across model versions (1D)

One setup measures mechanism + delivery + identity + the curve at once. Unlocks the entire lead tier with: a Stripe read key + one real test page + agent access.
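
The readout of that experiment is a pick rate per model and arm. A minimal tally, assuming each trial is logged as a (model, arm, picked merchant) tuple; the arm names, model labels, and trials below are illustrative placeholders, not results:

```python
from collections import defaultdict

def pick_rates(trials, target):
    """Share of trials per (model, arm) in which the agent chose `target`."""
    counts = defaultdict(lambda: [0, 0])  # (model, arm) -> [hits, total]
    for model, arm, picked in trials:
        counts[(model, arm)][0] += picked == target
        counts[(model, arm)][1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}

# Illustrative trials only, not data from the experiments.
trials = [
    ("model-a", "signal_in_schema", "merchant-1"),
    ("model-a", "signal_in_schema", "merchant-1"),
    ("model-a", "control", "merchant-2"),
    ("model-a", "control", "merchant-1"),
    ("model-b", "signal_in_schema", "merchant-3"),
    ("model-b", "control", "merchant-2"),
]

rates = pick_rates(trials, target="merchant-1")
for (model, arm), rate in sorted(rates.items()):
    print(f"{model:8} {arm:18} {rate:.0%}")
```

Comparing the signal arm against the control arm per model version gives the lift, and repeating it across versions gives the curve in 1D.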

The seat

An agent should trust a merchant it's never heard of.

By checking, not believing. We've shown the agents already want that, and that the proof problem is tractable. What's missing is the identity that makes it stick - and a second founder to own it. That's the conversation.
