What is a capability surface in the context of agentic commerce?

A capability surface is a typed algebra of business operations the platform agrees to perform or answer under defined rules. It is distinct from a protocol or an endpoint — it defines what the business can do before any external interface is involved, so that protocol adapters remain thin translators rather than decision-making logic.

Why is starting with a protocol the wrong approach for agentic commerce?

A protocol can expose named operations but cannot define what those operations mean. Without a coherent capability layer behind the adapter, the adapter is forced to assemble business meaning at the edge — inventing pricing semantics, buyer authority rules, and reservation behavior on the spot. That logic belongs inside the platform, not in a protocol translation layer.

What does F[_] represent in a Scala capability trait?

F[_] is a higher-kinded type parameter that stands for the effect context in which each operation runs: the database reads, service calls, policy checks, and audit records that the real world requires. Functional programming does not eliminate those effects; it keeps them at the boundary so the business operation retains a visible shape. Different interpreters can choose a different F for production, testing, or simulation.

How does a capability algebra support multiple protocol surfaces?

When each protocol adapter delegates to the same capability algebra rather than implementing business behavior independently, the platform stays consistent as surfaces proliferate. A web checkout, a customer service tool, an MCP server, and a procurement adapter can all call the same evaluatePurchase capability — the algebra describes what can be done, and each adapter handles the translation.

How do you test agentic commerce behavior without running the full platform?

Because the capability surface is independent of the protocol adapter, a test interpreter can exercise the same business operations without requiring live services. A simulation interpreter goes further — it lets teams understand how the platform behaves under supply shortages, fulfillment outages, expired contracts, or unclear buyer authority before granting agents real transaction authority.

Learning Scala: How to Model Capability Surfaces for Agentic Commerce

Jun 8

Written By Tony Moores

Operators preparing to support agentic commerce will eventually have to talk about protocols. That may mean MCP, UCP, a marketplace-specific interface, a procurement network, a partner API, or something that has not yet settled into a standard. If they start with a protocol, the first questions tend to be mechanical. Which tools do we expose? Which existing endpoints map to those tools? What schema does this protocol expect? How do we authenticate the caller? How do we serialize the response? Those are all necessary questions, but they are boundary questions. They deal with the outside edge of the system. The deeper question is whether there is a coherent business capability behind that edge.

New to this series?

Catch up on earlier posts to follow along with the Functional Programming Isn’t Just for Academics series:

Post 1: Why Functional Programming Matters for the Systems We Build Today
Post 2: Immutability by Default and the Foundation of Reliable Systems
Post 3: Pure Functions: Your First Step Toward Bug-Free Concurrency

Each post in this series explores how teams use Scala to build applications that stay clean, testable, and easy to scale.

Starting with a Protocol Is Starting in the Wrong Place

A protocol can expose a tool called evaluatePurchase, createCart, checkInventory, or submitOrder. It cannot decide what those operations mean. It cannot decide whether the platform is making an estimate, a quote, a reservation, a recommendation, or a commitment. It cannot decide whether the same rules are applied when the caller is a storefront, a procurement agent, a marketplace, a customer service representative, or an automated replenishment job. That is platform work.

Functional programming is useful here because it gives us a way to describe that platform work before we turn it into an adapter. It allows us to define a capability surface as a small, typed algebra of business operations: inputs with clear meaning, outputs with known outcomes, and effects kept visible at the boundary. In less formal terms, it gives us a disciplined way to write down what the business can do before arguing about how a protocol should call it.

What a Capability Actually Is

A capability is not an endpoint with a nicer name. It is not a bundle of endpoints hidden behind another endpoint. It is not a protocol operation by itself. A capability is a business question the platform agrees to answer, or a business action the platform agrees to perform, under defined rules.

Consider the question: given this buyer, acting under this authority, requesting this quantity of this item, for this destination, under these constraints, is there an actionable offer the platform is prepared to stand behind? That is not a product lookup, an inventory check, or a price query. It may depend on all of them, but it is not reducible to any one of them. That is the capability.

In code, the idea can be sketched simply:

Scala

PurchaseIntent => PurchaseEvaluation

This is the shape the business decision should have: structured intent in, structured evaluation out. The live system may need effects to gather facts, apply policy, record the decision, or reserve inventory, but the business meaning should still be expressible as a clear transformation. PurchaseIntent captures who is asking, under what authority, for what item, in what quantity, and with what constraints. PurchaseEvaluation is the platform's commercial answer. The purchase may be possible with an offer. It may be possible only with approval. It may be blocked for specific reasons. It may not be determinable because a required fact is missing or a dependent system cannot answer. The capability is the promise to turn one into the other.
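One way to make that shape concrete is to model both sides as algebraic data types. The sketch below is illustrative only: every type, field, and case name (BuyerId, Authority, Offer, the four evaluation cases, and the summarize helper) is an assumption made for this example, not a definition taken from a real platform.

```scala
// Hypothetical domain types; names and fields are assumptions for illustration.
final case class BuyerId(value: String)
final case class Sku(value: String)

enum Authority:
  case SelfService
  case DelegatedAgent(onBehalfOf: BuyerId, spendLimitCents: Long)

// Who is asking, under what authority, for what, how much, and with what constraints.
final case class PurchaseIntent(
  buyer: BuyerId,
  authority: Authority,
  sku: Sku,
  quantity: Int,
  destination: String,
  constraints: List[String]
)

final case class Offer(sku: Sku, quantity: Int, totalCents: Long)

// The four outcomes described above, as a closed set the compiler can check.
enum PurchaseEvaluation:
  case Offered(offer: Offer)
  case NeedsApproval(offer: Offer, approver: String)
  case Blocked(reasons: List[String])
  case Undetermined(missingFact: String)

// Callers handle every outcome; the compiler warns when a case is left out.
def summarize(evaluation: PurchaseEvaluation): String =
  evaluation match
    case PurchaseEvaluation.Offered(o)          => s"offer for ${o.totalCents} cents"
    case PurchaseEvaluation.NeedsApproval(_, a) => s"needs approval from $a"
    case PurchaseEvaluation.Blocked(rs)         => "blocked: " + rs.mkString(", ")
    case PurchaseEvaluation.Undetermined(f)     => s"cannot determine: missing $f"
```

Because the evaluation is a closed enum, "blocked" and "not determinable" are ordinary values rather than exceptions or null fields, which is what lets a caller treat them as part of the commercial answer.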

Modeling a Capability Surface in Scala

In Scala, that capability surface might look like this:

Scala

trait CommerceCapabilities[F[_]]:
  def evaluatePurchase(intent: PurchaseIntent):  F[PurchaseEvaluation]
  def reserveOffer(command: ReserveOffer):        F[ReservationOutcome]
  def commitOrder(command: CommitOrder):          F[OrderCommitment]
  def explainDecision(id: DecisionId):            F[DecisionExplanation]

This is not REST, MCP, or UCP. It is the platform saying: we know how to answer these business questions.

Why F[_] Matters

The F[_] matters because the real world is involved. Evaluating a purchase may require database reads, service calls, policy checks, pricing calculations, inventory snapshots, fulfillment estimates, and audit records. Functional programming does not pretend those effects disappear. It asks us to keep them at the boundary so the business operation still has a visible shape. That visible shape is the valuable part. Product, inventory, pricing, fulfillment, entitlement, and policy can remain separate internally. The caller does not have to become the orchestrator of those services. The platform accepts the responsibility for turning internal facts into an external business answer.
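A minimal sketch of what "effects at the boundary" can look like follows. It assumes heavily simplified domain types, two hypothetical fact sources (Inventory and Pricing), and a small local Monad trait standing in for a library type class such as the one in cats, so the example compiles on its own. None of these names come from a real platform.

```scala
// Stand-in for a library Monad; defined locally so the sketch is self-contained.
trait Monad[F[_]]:
  def pure[A](a: A): F[A]
  extension [A](fa: F[A])
    def flatMap[B](f: A => F[B]): F[B]
    def map[B](f: A => B): F[B] = fa.flatMap(a => pure(f(a)))

// Hypothetical, heavily simplified domain types.
final case class PurchaseIntent(sku: String, quantity: Int)
enum PurchaseEvaluation:
  case Offered(totalCents: Long)
  case Blocked(reason: String)

// Effectful fact sources: each answer lives in F because it touches the outside world.
trait Inventory[F[_]]:
  def available(sku: String): F[Int]
trait Pricing[F[_]]:
  def unitPriceCents(sku: String): F[Long]

// The business rule itself is a pure function of the gathered facts.
def decide(intent: PurchaseIntent, available: Int, unitPriceCents: Long): PurchaseEvaluation =
  if intent.quantity <= available then PurchaseEvaluation.Offered(unitPriceCents * intent.quantity)
  else PurchaseEvaluation.Blocked(s"only $available units available")

// Effects are sequenced at the boundary; the decision stays visible and testable.
def evaluatePurchase[F[_]: Monad](inventory: Inventory[F], pricing: Pricing[F])(
    intent: PurchaseIntent
): F[PurchaseEvaluation] =
  for
    stock <- inventory.available(intent.sku)
    price <- pricing.unitPriceCents(intent.sku)
  yield decide(intent, stock, price)

// One possible F: Either carries a failure message instead of throwing.
type Result[A] = Either[String, A]
given resultMonad: Monad[Result] = new Monad[Result]:
  def pure[A](a: A): Result[A] = Right(a)
  extension [A](fa: Result[A])
    def flatMap[B](f: A => Result[B]): Result[B] = fa match
      case Right(a) => f(a)
      case Left(e)  => Left(e)

// Stub fact sources with invented data, standing in for real services.
val stubInventory: Inventory[Result] = new Inventory[Result]:
  def available(sku: String): Result[Int] =
    if sku == "A1" then Right(5) else Left(s"unknown sku $sku")
val stubPricing: Pricing[Result] = new Pricing[Result]:
  def unitPriceCents(sku: String): Result[Long] = Right(1000L)
```

The caller of evaluatePurchase never sees Inventory or Pricing. It sees one question and one typed answer, and swapping Result for a production effect type changes how the facts are gathered without touching decide.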

The Four Capabilities and Why They're in That Order

The operations in the capability surface are not random verbs. They are stages of trust:

evaluatePurchase answers whether the platform can make an offer. It prevents every caller from reconstructing a purchase decision from product, price, inventory, fulfillment, and policy fragments.
reserveOffer gives the platform a way to stabilize an answer that might otherwise go stale. It answers whether the platform can hold that offer long enough for the caller to act.
commitOrder answers whether the reserved offer can become an order under the caller's authority and payment instruction. This is where the platform enforces idempotency, authority, and order creation rules.
explainDecision acknowledges that delegated execution needs accountability. Success and failure messages are not enough when an agent is acting on someone's behalf with real money and goods.

The platform first evaluates. Then it may reserve. Then it may commit. Later, it may explain.
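One way to make those stages visible in code is to let each command carry evidence from the stage before it. The sketch below is an assumption about how the command and outcome types might be shaped; OfferId, ReservationId, IdempotencyKey, the outcome cases, and nextCommand are all hypothetical names.

```scala
// Hypothetical identifiers handed out by earlier stages.
final case class OfferId(value: String)
final case class ReservationId(value: String)
final case class IdempotencyKey(value: String)

// Reserving needs an offer that an evaluation produced.
final case class ReserveOffer(offer: OfferId, key: IdempotencyKey)

// Committing needs a reservation, plus the caller's payment instruction.
final case class CommitOrder(reservation: ReservationId, paymentRef: String, key: IdempotencyKey)

enum ReservationOutcome:
  case Reserved(id: ReservationId, expiresAtEpochSeconds: Long)
  case OfferExpired
  case InsufficientStock

// In this sketch a CommitOrder is only ever built from a Reserved outcome,
// so the evaluate -> reserve -> commit order shows up in the types.
def nextCommand(
    outcome: ReservationOutcome,
    paymentRef: String,
    key: IdempotencyKey
): Option[CommitOrder] =
  outcome match
    case ReservationOutcome.Reserved(id, _)   => Some(CommitOrder(id, paymentRef, key))
    case ReservationOutcome.OfferExpired      => None
    case ReservationOutcome.InsufficientStock => None
```

The idempotency key travels with each command so that a retried reserve or commit can be recognized by the platform rather than by the adapter.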

The Adapter Should Be Boring

If a team starts with MCP or UCP or any other protocol and maps protocol operations directly onto the services it already has, it may produce something callable without producing something coherent. The agent can reach the system, but the adapter is forced to assemble the business meaning at the edge. It becomes the place where pricing semantics, buyer authority, reservation behavior, retry rules, and explanation logic are discovered or improvised. That is backwards.

The adapter should be boring. A request arrives through a protocol. The adapter decodes it into a domain request or command. The capability evaluates or performs the operation. The adapter encodes the result back into the protocol response.

Scala

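import cats.Monad
import cats.syntax.all.*  // map and flatMap syntax for the for-comprehension

// Assumes the cats library supplies Monad[F], and that decodePurchaseIntent
// (AgentInput => F[PurchaseIntent]) and encodePurchaseEvaluation
// (PurchaseEvaluation => AgentResult) are defined alongside the adapter.
final class AgentProtocolAdapter[F[_]: Monad](
  capabilities: CommerceCapabilities[F]
):
  def evaluatePurchaseTool(input: AgentInput): F[AgentResult] =
    for
      intent <- decodePurchaseIntent(input)           // protocol payload -> domain request
      result <- capabilities.evaluatePurchase(intent) // the platform decides
    yield encodePurchaseEvaluation(result)            // domain result -> protocol response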

The adapter is not deciding what a purchase means. It is not inventing policy. It is not coordinating half the platform by itself. It is translating between an external protocol and an internal capability that already exists. That is the right kind of boring.

One Capability Surface, Multiple Protocol Surfaces

That separation has a practical advantage beyond cleanliness. It means the same capability can be exposed through more than one surface. A web checkout, a customer service tool, a scheduled replenishment job, a REST API, an MCP server, and a commerce-specific protocol adapter may all need to evaluate purchases, reserve offers, commit orders, and explain decisions. If each surface implements those behaviors separately, the platform drifts. If each surface delegates to the same capability algebra, the platform has a better chance of remaining consistent, even as implementations evolve. The algebra describes what can be done. Different interpreters decide how it is done in different contexts.
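As a rough illustration, here are two surfaces delegating to one algebra. The sketch assumes simplified domain types, fixes the effect to a plain Either for brevity, and invents both adapters (RestAdapter and ToolAdapter), their input formats, and the pricing rule in the shared interpreter. A production setup would differ in all of those details.

```scala
// Simplified stand-ins; real types would carry far more detail.
final case class PurchaseIntent(sku: String, quantity: Int)
enum PurchaseEvaluation:
  case Offered(totalCents: Long)
  case Blocked(reason: String)

trait CommerceCapabilities[F[_]]:
  def evaluatePurchase(intent: PurchaseIntent): F[PurchaseEvaluation]

// For brevity the effect is fixed to Either; a production F would differ.
type Result[A] = Either[String, A]

// Hypothetical REST-style surface: query parameters in, status code and body out.
final class RestAdapter(capabilities: CommerceCapabilities[Result]):
  def get(params: Map[String, String]): (Int, String) =
    val decoded: Result[PurchaseIntent] =
      for
        sku <- params.get("sku").toRight("missing sku")
        qty <- params.get("qty").flatMap(_.toIntOption).toRight("invalid qty")
      yield PurchaseIntent(sku, qty)
    decoded.flatMap(capabilities.evaluatePurchase) match
      case Right(PurchaseEvaluation.Offered(total))  => (200, s"total=$total")
      case Right(PurchaseEvaluation.Blocked(reason)) => (409, reason)
      case Left(error)                               => (400, error)

// Hypothetical agent-tool surface: positional arguments in, one text line out.
final class ToolAdapter(capabilities: CommerceCapabilities[Result]):
  def call(args: List[String]): String =
    val decoded: Result[PurchaseIntent] = args match
      case sku :: qty :: Nil => qty.toIntOption.map(PurchaseIntent(sku, _)).toRight("invalid qty")
      case _                 => Left("expected sku and qty")
    decoded.flatMap(capabilities.evaluatePurchase) match
      case Right(PurchaseEvaluation.Offered(total))  => s"OFFER $total"
      case Right(PurchaseEvaluation.Blocked(reason)) => s"BLOCKED $reason"
      case Left(error)                               => s"ERROR $error"

// One interpreter shared by both surfaces; the rule here is invented.
val shared: CommerceCapabilities[Result] = new CommerceCapabilities[Result]:
  def evaluatePurchase(intent: PurchaseIntent): Result[PurchaseEvaluation] =
    if intent.quantity <= 10 then Right(PurchaseEvaluation.Offered(500L * intent.quantity))
    else Right(PurchaseEvaluation.Blocked("quantity exceeds limit"))
```

Each adapter owns only decoding and encoding. The quantity limit lives in one place, so a REST caller and an agent tool cannot receive different business answers for the same intent.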

Testing Capability Behavior Without Running the Full Stack

A production interpreter will need to perform real work, coordinate real services, and produce the typed business results promised by the capability surface. A test interpreter can exercise the same capability without requiring the entire platform to be running. Agent-facing behavior should not only be tested through protocol calls and integration environments. The business promise should be testable directly.

If the platform says a reservation command is idempotent, that behavior should be tested against the reservation capability, not inferred from a happy-path demo. If the platform says every purchase evaluation has an explanation, that should be tested against the capability itself.
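A sketch of such a test, under stated assumptions, might look like the following. The reservation capability is reduced to one operation, the in-memory interpreter and its stock model are invented for the example, and Id is a local alias meaning "no effect", which is enough for a test that needs no live services.

```scala
import scala.collection.mutable

// Hypothetical command and outcome types for the reservation stage.
final case class OfferId(value: String)
final case class IdempotencyKey(value: String)
final case class ReserveOffer(offer: OfferId, quantity: Int, key: IdempotencyKey)

enum ReservationOutcome:
  case Reserved(reservationId: String)
  case InsufficientStock

trait ReservationCapability[F[_]]:
  def reserveOffer(command: ReserveOffer): F[ReservationOutcome]

// Test effect: no effect at all, results are returned directly.
type Id[A] = A

// In-memory test interpreter; remembers outcomes by idempotency key.
final class InMemoryReservations(initialStock: Int) extends ReservationCapability[Id]:
  private var stock = initialStock
  private val seen = mutable.Map.empty[IdempotencyKey, ReservationOutcome]

  def remainingStock: Int = stock

  def reserveOffer(command: ReserveOffer): Id[ReservationOutcome] =
    seen.get(command.key) match
      case Some(previous) => previous // replay: return the recorded outcome, change nothing
      case None =>
        val outcome =
          if command.quantity <= stock then
            stock -= command.quantity
            ReservationOutcome.Reserved(s"res-${seen.size + 1}")
          else ReservationOutcome.InsufficientStock
        seen.update(command.key, outcome)
        outcome

// The idempotency promise, stated as a check against the capability itself.
def reservingTwiceIsIdempotent(command: ReserveOffer, initialStock: Int): Boolean =
  val reservations = new InMemoryReservations(initialStock)
  val first = reservations.reserveOffer(command)
  val stockAfterFirst = reservations.remainingStock
  val second = reservations.reserveOffer(command)
  first == second && reservations.remainingStock == stockAfterFirst
```

The check exercises the promise itself: the same command sent twice yields the same outcome and moves stock only once. A production interpreter would be expected to pass the same check against its own storage.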

A simulation interpreter may be even more valuable. Before allowing agents to execute real transactions, teams may want to understand how the platform behaves under a supply shortage, a regional fulfillment outage, an expired contract, a missing certification, or unclear buyer authority. If the capability surface is independent of the protocol adapter, simulation becomes a natural extension. The same business question can be asked without changing the world.
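A simulation interpreter can be sketched as one more implementation of the same algebra, parameterized by a scenario. Everything specific here is an assumption made for illustration: the Scenario cases, the simplified domain types, the fixed unit price, and in particular the outcome chosen for each scenario, which a real platform would derive from its own policy.

```scala
// Simplified stand-ins for the domain types.
final case class PurchaseIntent(sku: String, quantity: Int, region: String)

enum PurchaseEvaluation:
  case Offered(totalCents: Long)
  case NeedsApproval(reason: String)
  case Blocked(reason: String)
  case Undetermined(missingFact: String)

trait CommerceCapabilities[F[_]]:
  def evaluatePurchase(intent: PurchaseIntent): F[PurchaseEvaluation]

// Hypothetical scenarios a team might rehearse before granting agents authority.
enum Scenario:
  case Normal
  case SupplyShortage(availableUnits: Int)
  case RegionalOutage(region: String)
  case ExpiredContract
  case UnclearAuthority

// No effect: the simulation reads nothing and writes nothing.
type Id[A] = A

// Simulation interpreter: answers from the scenario alone and changes nothing.
final class SimulatedCommerce(scenario: Scenario, unitPriceCents: Long)
    extends CommerceCapabilities[Id]:
  def evaluatePurchase(intent: PurchaseIntent): Id[PurchaseEvaluation] =
    scenario match
      case Scenario.Normal =>
        PurchaseEvaluation.Offered(unitPriceCents * intent.quantity)
      case Scenario.SupplyShortage(available) if intent.quantity > available =>
        PurchaseEvaluation.Blocked(s"only $available units available")
      case Scenario.SupplyShortage(_) =>
        PurchaseEvaluation.Offered(unitPriceCents * intent.quantity)
      case Scenario.RegionalOutage(region) if region == intent.region =>
        PurchaseEvaluation.Undetermined(s"fulfillment estimate for $region")
      case Scenario.RegionalOutage(_) =>
        PurchaseEvaluation.Offered(unitPriceCents * intent.quantity)
      case Scenario.ExpiredContract =>
        PurchaseEvaluation.NeedsApproval("contract expired")
      case Scenario.UnclearAuthority =>
        PurchaseEvaluation.Undetermined("buyer authority")

// Ask the same business question under several scenarios.
def rehearse(intent: PurchaseIntent, scenarios: List[Scenario]): List[(Scenario, PurchaseEvaluation)] =
  scenarios.map { scenario =>
    val simulation = new SimulatedCommerce(scenario, unitPriceCents = 1000L)
    scenario -> simulation.evaluatePurchase(intent)
  }
```

Because SimulatedCommerce implements the same trait as a production interpreter would, any adapter written against CommerceCapabilities can be pointed at it unchanged, and an agent's requests can be rehearsed without reserving stock or creating orders.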

Agent readiness is not just about letting agents act. It is also about letting teams understand what agents would do before granting them more authority. No one wants to be surprised when real money and goods are being exchanged, which is why nailing down the interaction model matters more than agreeing on a language or protocol for those interactions. Modeling capability surfaces within a functional programming paradigm is a convenient and extensible way to do it.

This is Part 15 in an ongoing series. If you found this useful, Part 14 looks at why composed commerce stacks struggle to explain themselves when something goes wrong, and how functional programming and event-driven architecture give your system the ability to remember why it did what it did, not just what it did. Read "The Hidden Cost of Composed Commerce Systems"
