Learning Scala: Large Action Models Need Small Action Surfaces

A language model can answer a question. An action model can do something about it. That distinction is going to matter more than most commerce platforms are ready for. Whether the term is "agent," "large action model," "task-specific AI," or something else that marketing teams have not finished sanding down yet, the direction is clear enough: software will not merely describe commercial options to buyers, operators, partners, and employees. It will increasingly select actions, call tools, negotiate constraints, and attempt to move workflows forward.

New to this series?

This post is part of the Functional Programming Isn’t Just for Academics series. Each post in the series explores how teams use Scala to build applications that stay clean, testable, and easy to scale.

In commerce, "move the workflow forward" is not a harmless phrase. It might mean reserving inventory. It might mean accepting a substitution. It might mean committing an order, issuing a refund, changing a shipping promise, requesting a quote, applying a contract price, cancelling a shipment, or escalating a return. These are not generic software actions. They move money, create obligations, expose data, affect inventory, and alter customer trust, and they should leave evidence behind for someone else to explain later.

Why a Broad Action Surface Is a Platform Problem

If a model is going to act inside a commerce platform, the platform should not expose a broad pile of endpoints and hope the model chooses wisely. It should expose a deliberately constrained set of commercial capabilities whose meanings are explicit and whose use is governed before side effects occur. Ideally, that use should be governed by policy: who may do what, under which authority, in which context, and with what obligations.

There is a temptation to think of this as an AI safety problem first, but it is a platform design problem that AI makes harder to ignore. A procurement agent may be allowed to search and compare products but not place orders. It may be allowed to place orders below a threshold but not above it. It may be allowed to buy from approved suppliers but not onboard new sellers. It may be allowed to request quotes but not accept substitutions. It may be allowed to share a shipping address but not employee contact data. It may be allowed to recommend a return disposition but not issue a refund.

Put aside AI and automation for a moment. Consider what kinds of decisions you might trust to a human procurement agent and which you would prefer driven by corporate policy. Where is that policy stated? How often does it change? How do you keep your workforce compliant? Driving actions through a governed capability service is useful whether or not you care about AI. If the platform already has vague actions, scattered policy, ambiguous failures, and poorly separated side effects, humans learn how to work around the system. Action models will industrialize those workarounds. They will try more paths, faster, with more confidence, and then generate a persuasive explanation even if the system underneath cannot defend the outcome.

Authentication Is Not Authority

Authentication gives the platform a fact about the caller. Authorization, in the commercial sense, requires a conclusion about the action. That sounds obvious until you look at how many systems blur the distinction. The gateway identifies the caller. A token is valid. A tenant claim exists. A scope says something like orders:write. A role check passes. The request enters the application, and from there the real business judgment starts leaking into whatever service happens to need it first.

A scope like orders:write may still be useful at the technical boundary, but it is too blunt to represent the full decision. The same order operation may be permitted for one buyer, blocked for another, allowed under one contract, escalated under another, permitted below one threshold, and denied above it. Those distinctions are not just token scopes. They are commercial authorities. If those distinctions live inside the incidental code paths of each adapter, interface, or service, consistency becomes accidental.
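To make the contrast concrete, here is a minimal sketch. Every name in it (Token, Delegation, OrderRequest, hasScope, withinAuthority) is invented for illustration, and the spend limit is arbitrary. It reduces the commercial question to a Boolean only to keep the comparison small; a real decision needs to carry more than yes or no.

```scala
// Hypothetical types for illustration; none of these come from a real library.
final case class Token(subject: String, scopes: Set[String])
final case class Delegation(spendLimit: BigDecimal, approvedSellers: Set[String])
final case class OrderRequest(seller: String, total: BigDecimal)

// Technical boundary: a fact about the caller.
def hasScope(token: Token, scope: String): Boolean =
  token.scopes.contains(scope)

// Commercial boundary: a conclusion about this action under this delegation.
def withinAuthority(delegation: Delegation, order: OrderRequest): Boolean =
  delegation.approvedSellers.contains(order.seller) &&
    order.total <= delegation.spendLimit

val agentToken = Token("procurement-agent", Set("orders:write"))
val delegation = Delegation(BigDecimal(500), Set("acme-supplies"))
val smallOrder = OrderRequest("acme-supplies", BigDecimal(120))
val largeOrder = OrderRequest("acme-supplies", BigDecimal(5000))
```

Both orders arrive with the same orders:write scope. Only the small one falls within the delegation, and the token alone cannot tell them apart.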

Coordinating policy, authority, and functionality across composed systems is already difficult. Leaving that coordination to an unsupervised action model is worse.

The Shape of a Governed Intent

The platform has to define the world the model is allowed to act within, regardless of whether the operators are human or not. That world should be a governed capability surface: a set of named commercial operations, each with explicit inputs, explicit outcomes, explicit authority checks, explicit obligations, and recorded evidence. The model requests an action. The platform decides whether that action is allowed. Execution follows only after the decision permits it. That sequencing matters.

In a commerce system, deciding and doing are not the same thing. Evaluating whether an order can be placed is not the same as placing it. Calculating whether a refund is allowed is not the same as issuing it. Determining that a substitution is eligible is not the same as committing the customer to it. A safe action surface should make that distinction hard to skip.

Functional programming gives us a useful way to think about the shape of that boundary because it encourages us to model the decision before we perform the effect. A starting shape might look like this:

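scala
// A request to act, carrying everything the platform needs to judge it.
final case class GovernedIntent[A](
  principal: BuyerPrincipal,     // who is acting
  authority: DelegatedAuthority, // the authority the actor claims
  context:   CommercialContext,  // the facts that change the answer
  action:    A                   // the thing being attempted
)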

The names are doing most of the work. BuyerPrincipal is the actor: a person, an agent, an internal system, or some combination. DelegatedAuthority describes the authority under which that actor claims to operate. CommercialContext contains the facts that change the answer: account, contract, cost center, budget, seller eligibility, geography, payment terms, fulfillment constraints, regulatory concerns, risk signals, and whatever else the business needs to judge the action. The action is the thing being attempted.

This is already more honest than a token and a scope. It says the commercial decision is not merely "who called?" It is: who is acting, on whose behalf, under what authority, in which context, attempting what?
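As a rough illustration of what filling in that shape might look like, the sketch below builds one intent. The definitions of BuyerPrincipal, DelegatedAuthority, CommercialContext, and CommerceAction are assumptions made up for this example; a real platform would carry far richer context.

```scala
// Supporting types are assumptions for this sketch; only GovernedIntent
// mirrors the shape shown above.
enum BuyerPrincipal:
  case Person(userId: String)
  case Agent(agentId: String, onBehalfOf: String)

final case class DelegatedAuthority(grantedBy: String, spendLimit: BigDecimal)

final case class CommercialContext(
  accountId:  String,
  contractId: Option[String],
  costCenter: String
)

enum CommerceAction:
  case RequestQuote(sku: String, quantity: Int)
  case PlaceOrder(sku: String, quantity: Int, total: BigDecimal)

final case class GovernedIntent[A](
  principal: BuyerPrincipal,
  authority: DelegatedAuthority,
  context:   CommercialContext,
  action:    A
)

// One intent: an agent, acting for a buyer, under a delegated spend limit,
// in a specific account and contract, attempting to place an order.
val intent: GovernedIntent[CommerceAction] = GovernedIntent(
  principal = BuyerPrincipal.Agent("procurement-agent-7", onBehalfOf = "buyer-42"),
  authority = DelegatedAuthority(grantedBy = "account-admin-3", spendLimit = BigDecimal(500)),
  context   = CommercialContext("acct-100", Some("contract-2025-A"), "cc-ops"),
  action    = CommerceAction.PlaceOrder("SKU-1", 10, BigDecimal(250))
)
```

Nothing here decides anything yet. The value only states the question in full, which is the point.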

A Decision That Can Say More Than Yes or No

The decision cannot be a Boolean without throwing away meaning. Commerce decisions are rarely just yes or no.

scala
enum GovernedDecision:
  case Permit(obligations: List[Obligation])
  case Deny(reasons: List[PolicyReason])
  case RequireApproval(route: ApprovalRoute, reasons: List[PolicyReason])
  case RequireQuote(reasons: List[PolicyReason])
  case Escalate(reasons: List[PolicyReason])
  case CannotDetermine(gaps: List[EvidenceGap])

A permitted action may carry obligations. A denied action should carry reasons. Approval may be the correct next step rather than an error. A quote may be required before the platform can commit to terms. Escalation may be appropriate when the request is unusual but not invalid. CannotDetermine is the honest answer when the platform lacks enough trustworthy information to decide.

That last case matters more than it looks. A system that cannot determine whether an action is allowed should not pretend the action is approved. But it also should not always pretend the action is forbidden. Missing certification evidence, stale contract data, unavailable freight quotes, unresolved authority claims, or conflicting seller information may require more information rather than a denial. Naming that state gives both the platform and the caller a better path. This is the kind of distinction that disappears when policy is treated as scattered conditionals and message strings.
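One way to see the difference in practice is a small pure policy function plus an exhaustive match over its result. The sketch below assumes simple placeholder shapes for Obligation, PolicyReason, ApprovalRoute, and EvidenceGap, and the policy itself (a spend limit and a seller flag) is invented for illustration.

```scala
// Placeholder shapes for the supporting types (assumptions for this sketch).
final case class Obligation(description: String)
final case class PolicyReason(code: String)
final case class ApprovalRoute(approverRole: String)
final case class EvidenceGap(missing: String)

enum GovernedDecision:
  case Permit(obligations: List[Obligation])
  case Deny(reasons: List[PolicyReason])
  case RequireApproval(route: ApprovalRoute, reasons: List[PolicyReason])
  case RequireQuote(reasons: List[PolicyReason])
  case Escalate(reasons: List[PolicyReason])
  case CannotDetermine(gaps: List[EvidenceGap])

import GovernedDecision.*

// Hypothetical policy: a missing spend limit is a gap in evidence, not a denial.
def decidePurchase(
  total:          BigDecimal,
  spendLimit:     Option[BigDecimal],
  sellerApproved: Boolean
): GovernedDecision =
  spendLimit match
    case None =>
      CannotDetermine(List(EvidenceGap("spend limit not available")))
    case Some(_) if !sellerApproved =>
      Deny(List(PolicyReason("SELLER_NOT_APPROVED")))
    case Some(limit) if total > limit =>
      RequireApproval(ApprovalRoute("budget-owner"), List(PolicyReason("OVER_DELEGATED_LIMIT")))
    case Some(_) =>
      Permit(List(Obligation("record purchase against cost center")))

// Every case names a different next step; the compiler warns if one is missed.
def nextStep(decision: GovernedDecision): String = decision match
  case Permit(obligations)       => s"execute, then satisfy ${obligations.size} obligation(s)"
  case Deny(_)                   => "stop"
  case RequireApproval(route, _) => s"route to ${route.approverRole}"
  case RequireQuote(_)           => "request a quote"
  case Escalate(_)               => "hand off to a human reviewer"
  case CannotDetermine(gaps)     => s"gather evidence: ${gaps.map(_.missing).mkString(", ")}"
```

A Boolean version of decidePurchase would have to fold the missing-limit case into either true or false, and both are wrong.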

The Capability That Respects the Decision Boundary

The capability that performs an action should respect the decision boundary. The exact design can vary, but the architecture should make it natural to decide before executing:

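scala
trait GovernedCommerceCapabilities[F[_]]:
  // Evaluate the intent against authority, context, and policy.
  def decide[A](intent: GovernedIntent[A]): F[GovernedDecision]
  // Takes a Permit, so the signature asks for a permitting decision
  // before any effect runs.
  def execute[A](
    intent:   GovernedIntent[A],
    decision: GovernedDecision.Permit
  ): F[ExecutionResult]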

That is one shape, not the shape. A real system might use workflow states, signed policy decisions, approval records, refined command types, or separate capabilities for different commercial actions. The load-bearing idea is the sequencing: execution follows a governed decision, and application code should never confuse a known caller with an authorized action.
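To show the sequencing rather than just assert it, here is a small in-memory interpreter of that trait. It is a sketch under several assumptions: the intent and decision types are simplified stand-ins, scala.util.Try plays the role of F, and PlaceOrder, InMemoryCapabilities, and decideThenExecute are hypothetical names.

```scala
import scala.util.{Success, Try}

// Simplified stand-ins for the article's types (assumptions for this sketch).
final case class GovernedIntent[A](principal: String, spendLimit: BigDecimal, action: A)
final case class PlaceOrder(sku: String, total: BigDecimal)
final case class ExecutionResult(summary: String)

enum GovernedDecision:
  case Permit(obligations: List[String])
  case Deny(reasons: List[String])

trait GovernedCommerceCapabilities[F[_]]:
  def decide[A](intent: GovernedIntent[A]): F[GovernedDecision]
  def execute[A](intent: GovernedIntent[A], decision: GovernedDecision.Permit): F[ExecutionResult]

// In-memory interpreter that records effects so the ordering can be observed.
final class InMemoryCapabilities extends GovernedCommerceCapabilities[Try]:
  private var effects: List[String] = Nil
  def performed: List[String] = effects.reverse

  def decide[A](intent: GovernedIntent[A]): Try[GovernedDecision] = Success(
    intent.action match
      case PlaceOrder(_, total) if total <= intent.spendLimit =>
        GovernedDecision.Permit(List("notify cost center owner"))
      case PlaceOrder(_, _) =>
        GovernedDecision.Deny(List("over delegated limit"))
      case _ =>
        GovernedDecision.Deny(List("unsupported action"))
  )

  def execute[A](intent: GovernedIntent[A], decision: GovernedDecision.Permit): Try[ExecutionResult] =
    Try {
      effects = s"executed ${intent.action}" :: effects
      ExecutionResult(s"done with ${decision.obligations.size} obligation(s)")
    }

// Decide first. Only a Permit reaches execute; anything else comes back
// as a Left and no side effect is recorded.
def decideThenExecute[A](
  caps:   GovernedCommerceCapabilities[Try],
  intent: GovernedIntent[A]
): Try[Either[GovernedDecision, ExecutionResult]] =
  caps.decide(intent).flatMap {
    case permit: GovernedDecision.Permit => caps.execute(intent, permit).map(Right(_))
    case other                           => Success(Left(other))
  }

val caps    = InMemoryCapabilities()
val allowed = decideThenExecute(caps, GovernedIntent("agent-7", BigDecimal(500), PlaceOrder("SKU-1", BigDecimal(120))))
val blocked = decideThenExecute(caps, GovernedIntent("agent-7", BigDecimal(500), PlaceOrder("SKU-2", BigDecimal(900))))
```

After both calls, the interpreter has recorded exactly one effect. The type of execute does not make misuse impossible, since a caller could still construct a Permit by hand, which is one reason a real system might prefer signed or otherwise unforgeable decisions.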

Why a Small Action Surface Is an Operational Strategy

This is not FP decoration. The model may be large. The action surface should be small, and that is an operational choice as much as a technical one.

The larger and looser the action surface, the more the model has to infer. Inference is useful when writing prose. It is dangerous when moving money or creating obligations. A commerce platform should not ask a model to infer whether it may refund an order, accept a substitution, bypass an approval, expose a contract price, or ship to a restricted location. It should give the model a narrow vocabulary of actions and make the platform responsible for deciding whether each action is allowed.

This is where functional programming and commerce architecture meet cleanly. Closed action vocabularies are easier to test than open-ended endpoint access. Explicit outcomes are easier to reason about than exceptions and strings. Deterministic decision logic is easier to simulate than workflows whose behavior depends on hidden state and incidental side effects. Policy records are easier to audit than logs assembled after a customer, partner, or regulator asks a question.

A large action model may be able to plan across a broad commercial task. The platform should still require each meaningful step to pass through a governed capability because search is not reserve, reserve is not commit, commit is not refund, refund is not appeasement, and explain is not expose-the-rulebook. Each action has different risk, authority, evidence, and obligations. Treating them as one broad commerce:write permission is exactly the kind of shortcut that feels fine until software starts acting faster than humans can supervise.
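A closed vocabulary can be sketched as an enum whose cases each map to a distinct kind of authority. The action names echo the examples above; the RequiredAuthority levels and the mayAttempt helper are assumptions for illustration, not a proposed taxonomy.

```scala
// Hypothetical closed vocabulary; the case names echo the article's examples.
enum CommerceAction:
  case Search(query: String)
  case Reserve(sku: String, quantity: Int)
  case Commit(reservationId: String)
  case Refund(orderId: String, amount: BigDecimal)

// Hypothetical authority levels, one per kind of risk.
enum RequiredAuthority:
  case Browse, HoldInventory, SpendMoney, ReturnMoney

// Adding a new action without assigning it an authority makes this match
// non-exhaustive, which the compiler reports.
def requiredAuthority(action: CommerceAction): RequiredAuthority = action match
  case CommerceAction.Search(_)     => RequiredAuthority.Browse
  case CommerceAction.Reserve(_, _) => RequiredAuthority.HoldInventory
  case CommerceAction.Commit(_)     => RequiredAuthority.SpendMoney
  case CommerceAction.Refund(_, _)  => RequiredAuthority.ReturnMoney

// A coarse pre-check only; the full decision still depends on context.
def mayAttempt(granted: Set[RequiredAuthority], action: CommerceAction): Boolean =
  granted.contains(requiredAuthority(action))

val researchAgent: Set[RequiredAuthority] =
  Set(RequiredAuthority.Browse, RequiredAuthority.HoldInventory)
```

With this shape, a single commerce:write permission has nowhere to live. An agent granted Browse and HoldInventory can search and reserve, and that says nothing about whether it can commit or refund.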

Protocol Enthusiasm Will Not Save You

This is also where protocol enthusiasm can mislead. MCP adapters, REST routes, storefront calls, scheduled jobs, procurement integrations, and customer service tools may all become entry points into the platform. That does not mean each interface should rebuild its own interpretation of business authority. The adapter should translate. The governed capability should decide. The implementation should execute only after the decision permits it.

A request arrives through some interface. The platform authenticates the caller. The adapter turns the request into a governed intent. The capability evaluates that intent against authority, context, and policy. The result is encoded back to the caller. If execution is permitted, the platform performs the action and records what happened. That design gives every interface the same business spine. A storefront, a buyer agent, a support tool, and a scheduled process may have different presentation layers and different permissions, but they should not each invent a private theory of commercial authority.
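The division of labor in that flow can be sketched in a few lines. Everything below is hypothetical: the ToolCall shape, the tool names, the simplified intent and decision types, and the toy policy in decide. The point is only where each responsibility sits.

```scala
// Simplified, hypothetical types for this sketch.
enum CommerceAction:
  case RequestQuote(sku: String, quantity: Int)
  case CancelShipment(shipmentId: String)

final case class GovernedIntent[A](
  principal: String,
  authority: String,
  context:   Map[String, String],
  action:    A
)

enum GovernedDecision:
  case Permit
  case Deny(reason: String)

// A raw tool call as an adapter might receive it (shape is assumed).
final case class ToolCall(name: String, args: Map[String, String])

// Adapter responsibility: translate only. Unknown tools and malformed
// arguments are rejected here and never reach policy.
def translate(caller: String, call: ToolCall): Either[String, GovernedIntent[CommerceAction]] =
  val action: Either[String, CommerceAction] = call.name match
    case "request_quote" =>
      for
        sku <- call.args.get("sku").toRight("missing sku")
        qty <- call.args.get("quantity").flatMap(_.toIntOption).toRight("missing or invalid quantity")
      yield CommerceAction.RequestQuote(sku, qty)
    case "cancel_shipment" =>
      call.args.get("shipmentId").toRight("missing shipmentId").map(CommerceAction.CancelShipment(_))
    case other =>
      Left(s"unknown tool: $other")
  action.map(a => GovernedIntent(caller, "delegated-by-account-admin", Map("channel" -> "tool-adapter"), a))

// Capability responsibility: decide. Every adapter shares this one function.
def decide(intent: GovernedIntent[CommerceAction]): GovernedDecision = intent.action match
  case CommerceAction.RequestQuote(_, _) => GovernedDecision.Permit
  case CommerceAction.CancelShipment(_)  => GovernedDecision.Deny("cancellation requires a human operator")

def handle(caller: String, call: ToolCall): Either[String, GovernedDecision] =
  translate(caller, call).map(decide)
```

A REST route or a scheduled job would get its own translate function and reuse decide unchanged. That reuse is the business spine.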

This matters especially for action models because the adapter layer will be tempting. It will be easy to make an MCP tool or similar adapter that calls an internal endpoint directly because the demo looks good. The demo will not show the missing policy boundary. It will not show the missing evidence. It will not show the support case six months later when someone asks why the system accepted a substitution or issued a refund. The demo never pays the interest. Operations does.

A Governed Action Should Leave Evidence Behind

Digital commerce produces disputes, escalations, approvals, reversals, and questions. A governed action needs to produce a record:

scala
import java.time.Instant

final case class PolicyDecisionRecord(
  decisionId: DecisionId,
  principal:  BuyerPrincipal,
  authority:  DelegatedAuthority,
  context:    CommercialContextSnapshot,
  action:     RequestedAction,
  decision:   GovernedDecision,
  decidedAt:  Instant
)

This record is not logging. It is evidence. It preserves who acted, on whose behalf, under which authority, against which context, producing which decision, and when. Execution records can then connect that decision to the effects that followed: reservation, approval request, order creation, cancellation, refund, shipment change, notification, or whatever else the workflow required.
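Linking decisions to effects can be as simple as making the execution record carry the decision's identifier. The sketch below uses flattened String fields in place of the richer types, and ExecutionRecord and EvidenceLedger are hypothetical names; a real store would be durable and append-only rather than an in-memory Vector.

```scala
import java.time.Instant
import java.util.UUID

// Flattened stand-ins for the richer record types (assumptions for this sketch).
final case class DecisionId(value: UUID)

final case class PolicyDecisionRecord(
  decisionId: DecisionId,
  principal:  String,
  authority:  String,
  context:    Map[String, String],
  action:     String,
  decision:   String,
  decidedAt:  Instant
)

// Hypothetical execution record: it points back at the decision that allowed it.
final case class ExecutionRecord(decisionId: DecisionId, effect: String, executedAt: Instant)

// In-memory ledger for illustration only.
final class EvidenceLedger:
  private var decisions  = Vector.empty[PolicyDecisionRecord]
  private var executions = Vector.empty[ExecutionRecord]

  def recordDecision(record: PolicyDecisionRecord): Unit =
    decisions = decisions :+ record

  // Refuse to record an effect that has no decision behind it.
  def recordExecution(record: ExecutionRecord): Either[String, Unit] =
    if decisions.exists(_.decisionId == record.decisionId) then
      executions = executions :+ record
      Right(())
    else Left(s"no decision recorded for ${record.decisionId.value}")

  def effectsOf(id: DecisionId): Vector[String] =
    executions.filter(_.decisionId == id).map(_.effect)

val decisionId = DecisionId(UUID.fromString("00000000-0000-0000-0000-000000000001"))
val orphanId   = DecisionId(UUID.fromString("00000000-0000-0000-0000-000000000002"))
val decidedAt  = Instant.parse("2026-01-15T10:00:00Z")

val ledger = EvidenceLedger()
```

Given a decision identifier, effectsOf answers the question a support case eventually asks: what did the platform do because of this decision?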

A platform that cannot produce this evidence is asking the organization to trust outcomes it cannot defend. That may be tolerable for low-risk interactions. It is a poor foundation for delegated purchasing, contract pricing, regulated goods, multi-seller commerce, high-value refunds, or any workflow likely to produce disputes.

The Explanation Problem Changes When Agents Enter the System

A caller may need to know why an action was denied, why approval is required, why a quote is needed, or what obligations attach to a permitted action. That does not mean every caller should see the full rulebook. A weakly authorized agent should not be able to probe the surface until it learns exact spend thresholds, seller risk scores, fraud signals, pricing floors, or internal approval logic. A buyer administrator may be entitled to more detail. An internal compliance team may be entitled to still more.

Those differences are easier to manage when the system separates the decision, the evidence, the internal reasoning, the external explanation, and the next action. If the only output is a message string, teams will either reveal too much, reveal too little, or create a new convention for every integration.
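One possible shape for that separation is an explanation value with layers, rendered differently per audience. The Audience cases, the field names, and the sample rule identifier are all assumptions for illustration.

```scala
// Hypothetical audiences, ordered from least to most entitled.
enum Audience:
  case ExternalAgent, BuyerAdmin, Compliance

// The decision's explanation kept as structured layers, not one message string.
final case class DecisionExplanation(
  outcome:        String,
  publicReason:   String,
  accountDetail:  Option[String], // safe for the buyer's own administrators
  internalDetail: Option[String], // thresholds, rule ids, risk signals
  nextAction:     Option[String]
)

// Each audience sees the base explanation plus only the layers it is entitled to.
def explainTo(audience: Audience, e: DecisionExplanation): List[String] =
  val base = List(e.outcome, e.publicReason) ++ e.nextAction
  audience match
    case Audience.ExternalAgent => base
    case Audience.BuyerAdmin    => base ++ e.accountDetail
    case Audience.Compliance    => base ++ e.accountDetail ++ e.internalDetail

val explanation = DecisionExplanation(
  outcome        = "RequireApproval",
  publicReason   = "This order needs approval before it can be placed.",
  accountDetail  = Some("Order total exceeds the delegated limit for this cost center."),
  internalDetail = Some("limit=500; total=900; rule=SPEND-007"),
  nextAction     = Some("Approval requested from the budget owner.")
)
```

An external agent learns that approval is required and what happens next. It does not learn the limit, so it cannot find the threshold by probing.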

What You Should Be Able to Prove About Your Action Surface

Testing a governed action surface is not the same as proving an authenticated caller can reach an endpoint. A platform that claims delegated purchasing, governed ordering, quote negotiation, or return management should be able to prove the commercial scenarios directly:

  • Can an agent evaluate a purchase without committing it?
  • Can it request a quote without accepting one?
  • Can it place an order below its delegated threshold and be routed for approval above it?
  • Can it buy from approved sellers and be denied for unapproved ones?
  • Can it handle expired contracts, missing evidence, restricted categories, stale pricing, and retry after temporary failure?
  • Can it prove that policy precedes execution?
  • Can it prove that decision records exist and link to the side effects that followed?
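A few of the scenarios above can be written down as a table and checked against a pure decision function, with no endpoint or token involved. The sketch below is deliberately small: the Decision enum is a parameterless stand-in, the decide policy is invented, and the thresholds are arbitrary.

```scala
// Parameterless stand-in for the richer decision type (assumption for this sketch).
enum Decision:
  case Permit, Deny, RequireApproval, CannotDetermine

final case class Scenario(
  name:           String,
  total:          BigDecimal,
  limit:          Option[BigDecimal],
  sellerApproved: Boolean,
  expected:       Decision
)

// Hypothetical policy under test: pure, so it can be exercised without side effects.
def decide(total: BigDecimal, limit: Option[BigDecimal], sellerApproved: Boolean): Decision =
  limit match
    case None                     => Decision.CannotDetermine
    case Some(_) if !sellerApproved => Decision.Deny
    case Some(max) if total > max => Decision.RequireApproval
    case Some(_)                  => Decision.Permit

val scenarios = List(
  Scenario("order below delegated threshold", BigDecimal(100), Some(BigDecimal(500)), true, Decision.Permit),
  Scenario("order above delegated threshold", BigDecimal(900), Some(BigDecimal(500)), true, Decision.RequireApproval),
  Scenario("unapproved seller", BigDecimal(100), Some(BigDecimal(500)), false, Decision.Deny),
  Scenario("missing delegation evidence", BigDecimal(100), None, true, Decision.CannotDetermine)
)

// Names of the scenarios whose actual decision differs from the expected one.
def failures(cases: List[Scenario]): List[String] =
  cases.collect {
    case s if decide(s.total, s.limit, s.sellerApproved) != s.expected => s.name
  }
```

The table reads like the commercial contract it checks, and a policy change shows up as a named scenario failing rather than as a surprise in an end-to-end run.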

Those are not QA cases. They are the contract of the action surface.

Where Scala and FP Are Practically Useful Here

Scala and functional programming do not answer governance questions for the business. Someone still has to decide who authors policies, how they are reviewed, how conflicts are resolved, how emergency exceptions work, and who is accountable when policy and business reality disagree.

What they can do is keep the software boundary honest. Authority, context, action, decision, obligation, and evidence become explicit values rather than informal assumptions. Policy evaluation becomes a named step rather than a side effect of whichever service happened to receive the request. Execution follows the decision. Explanations are grounded in recorded facts. Tests target commercial scenarios directly instead of discovering policy through accidental end-to-end behavior.

The industry has a long history of naming the future before it knows how to operate the present. Some agent projects will fail. Some will be cancelled. Some will turn out to be automation wearing a new hat. Some will matter a lot. But the underlying shift is not speculative: software is being asked to act with more discretion. When software acts, authority matters. A governed capability surface is how you give it a place to stand.

For an earlier look at how capability surfaces work in agentic commerce contexts, see Part 15 on capability surfaces and agentic commerce. The modeling patterns for typed error handling are also worth revisiting in the context of the GovernedDecision enum above.

This is Part 16 in an ongoing series. If you found this useful, Part 15, "Capability Surfaces for Agentic Commerce," looks at how to design capability surfaces for agentic commerce systems, defining what a model is allowed to do before it acts rather than after.

Frequently Asked Questions

What is a governed action surface in a commerce platform?

A governed action surface is a deliberately constrained set of named commercial operations that a model or agent can invoke. Each operation has explicit inputs, explicit outcomes, authority checks, and recorded evidence. The platform decides whether the action is allowed before execution occurs.

Why is authentication not the same as authorization in commerce systems?

Authentication gives the platform a fact about the caller. Authorization requires a conclusion about the action. A valid token and a passing scope check do not tell the platform whether the specific action is permitted under the relevant contract, threshold, or context. Those distinctions require commercial authority evaluation, not just identity verification.

Why should deciding and executing be separate steps in a commerce system?

Evaluating whether an action is allowed and performing that action are different operations with different risk profiles. Keeping them separate ensures policy is evaluated before side effects occur, makes the decision auditable, and prevents application code from conflating a known caller with an authorized action.

What does CannotDetermine mean in a governed commerce decision?

CannotDetermine is the honest answer when the platform lacks sufficient trustworthy information to decide. Missing certification evidence, stale contract data, unresolved authority claims, or unavailable freight quotes may require more information rather than a blanket denial. Naming this state explicitly gives the platform and the caller a path forward.

How does functional programming help enforce a governed action surface?

Functional programming encourages modeling the decision before performing the effect. With Scala, authority, context, action, decision, obligation, and evidence become explicit typed values rather than informal assumptions scattered across services. Policy evaluation becomes a named step, execution follows the decision, and tests can target commercial scenarios directly.

 