How Execution Infrastructure Works — Architecture and Execution Model

Architectural Definition

Execution infrastructure operates through a layered architecture: an interaction layer receives input, a router determines the execution path, a chain of stateless agents performs sequential operations, integration adapters act on external systems, and a confirmation layer verifies and communicates outcomes. A decision engine evaluates business rules between agent steps. Each layer is independently scalable, testable, and replaceable. The architecture separates the mechanism of execution from the definition of business logic.

1. Architectural Overview

Execution infrastructure is organized as a five-layer stack. Every interaction traverses the full stack from top to bottom. Each layer has a single responsibility, communicates with adjacent layers through a standardized execution context object, and is stateless relative to other interactions.

The five layers operate in sequence during every interaction:

Execution Infrastructure — Layer Architecture

Layer 1 Interaction Layer — receives voice, SMS, API, webhook, platform events

↓ normalized execution context

Layer 2 Router & State Machine — determines execution path, manages state transitions

↓ activated agent chain

Layer 3 Agent Chain — stateless agents execute in sequence, each with single responsibility

↓ business rule evaluation (interleaved)

Layer 4 Integration Adapters — connectors to CRM, payments, scheduling, messaging

↓ execution results

Layer 5 Confirmation Layer — multi-channel notification, verification, audit logging

The critical design property is that business logic is expressed as configuration, not code. The layers provide the execution mechanism. A tenant's specific services, pricing rules, coverage areas, and routing preferences are defined in a configuration file that the execution engine interprets at runtime.

This separation of mechanism and policy means the same execution engine can serve a home services company, a healthcare practice, and a logistics provider without code changes. Only configuration differs.

2. The Interaction Layer

The interaction layer is protocol-agnostic. It receives input from any supported channel and normalizes it into a standard execution context before passing it to the router. The layer accepts the following input types:

Voice calls — received via telephony APIs (e.g., Twilio). Audio is streamed through a real-time speech-to-text pipeline (e.g., Deepgram) and converted to text. Responses are generated as text and converted to speech (e.g., ElevenLabs) before being streamed back to the caller.
SMS and messaging — received via messaging APIs (Twilio, WhatsApp Business API, Telegram Bot API). Text input is passed directly to the router.
API requests — received via REST webhooks. Structured payloads are parsed and mapped to the execution context schema.
Platform events — received from third-party platforms (e.g., Meta Lead Ads via CAPI, form submission webhooks). Event payloads are normalized into the standard context format.

Each interaction is normalized into a standard execution context containing:

| Field | Description | Example |
| --- | --- | --- |
| `caller_id` | Unique identifier for the initiating party | `+1-555-123-4567` |
| `channel` | Input protocol | `voice`, `sms`, `webhook` |
| `tenant_id` | Target tenant namespace | `vossome` |
| `intent_signal` | Initial classification of purpose | `service_request` |
| `raw_input` | Unprocessed input payload | Transcribed text or JSON body |
| `timestamp` | Interaction start time (UTC) | `2026-02-10T14:14:00Z` |

The normalization step is essential. By converting all input types to a uniform execution context, every downstream layer operates identically regardless of whether the interaction started as a phone call, a text message, or an API request. This is what makes the architecture protocol-agnostic.
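
The normalization step can be sketched roughly as follows. This is a minimal illustration, not a description of any real implementation: the field names come from the table above, but the payload shapes passed to `normalize_sms` and `normalize_webhook`, the function names themselves, and the `"unknown"` default intent are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class ExecutionContext:
    """Uniform context handed to every downstream layer."""
    caller_id: str
    channel: str        # "voice", "sms", or "webhook"
    tenant_id: str
    intent_signal: str
    raw_input: Any      # transcribed text or a parsed JSON body
    timestamp: str      # ISO 8601, UTC


def _utc_now() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def normalize_sms(payload: dict, tenant_id: str) -> ExecutionContext:
    # Assumed SMS payload shape: {"from": "<number>", "body": "<text>"}
    return ExecutionContext(
        caller_id=payload["from"],
        channel="sms",
        tenant_id=tenant_id,
        intent_signal="unknown",  # classified later by the agent chain
        raw_input=payload["body"],
        timestamp=_utc_now(),
    )


def normalize_webhook(payload: dict, tenant_id: str) -> ExecutionContext:
    # Assumed webhook shape: {"contact": {"phone": "<number>"}, "intent": "..."}
    return ExecutionContext(
        caller_id=payload["contact"]["phone"],
        channel="webhook",
        tenant_id=tenant_id,
        intent_signal=payload.get("intent", "unknown"),
        raw_input=payload,
        timestamp=_utc_now(),
    )
```

Both functions return the same type, so code downstream of the interaction layer never needs to branch on the original channel.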

3. The Router and State Machine

The router evaluates the execution context and determines which agent chain to activate. It functions as a deterministic state machine with defined transitions. The state machine governs the complete lifecycle of every interaction.

A typical state machine defines the following transitions:

State Machine — Execution Path

call_start → greeting → service_detection → coverage_validation → quoting → contact_capture → booking → dispatch → confirmation

Each state transition is deterministic. The state machine enforces that every interaction follows a complete execution path from input to resolution. This means no interaction can terminate in an undefined state. If an agent encounters a condition it cannot handle or a required resource is unavailable, the state machine reroutes to a defined fallback path rather than failing silently.

The state machine also provides:

Branching — Different agent chains activate for different intent signals. A service request follows a different path than an existing customer status check.
Looping — If a required input is missing or invalid, the state machine can loop back to a previous state to re-collect information.
Escalation — At any point, the state machine can transition to an escalation state that transfers the interaction to a human operator with full context.
Timeout handling — If an interaction stalls (e.g., caller goes silent), the state machine transitions through a defined timeout sequence.
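
One way to picture these transition rules is as a lookup table keyed by the current state and an event. The sketch below is illustrative only. The main-path state names follow the execution path shown earlier, while the event names (`ok`, `missing_input`, `out_of_area`, `escalate`, `timeout`) and the states `out_of_area_response`, `escalation`, and `timeout_sequence` are assumptions introduced for the example.

```python
# Transition table: (current_state, event) -> next_state.
TRANSITIONS = {
    ("call_start", "ok"): "greeting",
    ("greeting", "ok"): "service_detection",
    ("service_detection", "ok"): "coverage_validation",
    ("service_detection", "missing_input"): "service_detection",  # loop
    ("coverage_validation", "ok"): "quoting",
    ("coverage_validation", "out_of_area"): "out_of_area_response",  # branch
    ("quoting", "ok"): "contact_capture",
    ("contact_capture", "ok"): "booking",
    ("contact_capture", "missing_input"): "contact_capture",  # loop
    ("booking", "ok"): "dispatch",
    ("dispatch", "ok"): "confirmation",
}

# Events that are handled the same way from every state.
GLOBAL_EVENTS = {"escalate": "escalation", "timeout": "timeout_sequence"}

# Any (state, event) pair not listed above lands here instead of failing.
FALLBACK_STATE = "escalation"


def next_state(state: str, event: str) -> str:
    """Return the next state; the result is always a defined state."""
    if event in GLOBAL_EVENTS:
        return GLOBAL_EVENTS[event]
    return TRANSITIONS.get((state, event), FALLBACK_STATE)


def run(events: list[str], start: str = "call_start") -> list[str]:
    """Replay a list of events and return the path of states visited."""
    path = [start]
    for event in events:
        path.append(next_state(path[-1], event))
    return path
```

Because the table is plain data and `next_state` has no side effects, the same state and event always produce the same transition, and an unexpected event resolves to the fallback rather than to an undefined state.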

4. The Agent Chain

The agent chain is the core mechanism of execution infrastructure. It consists of a sequence of stateless agents, each with a single operational responsibility. The router activates the appropriate chain, and agents execute in order.

Each agent has four defining properties:

Stateless — An agent receives the execution context, performs its operation, and returns an updated context. It retains no memory between invocations. All state lives in the execution context object.
Purpose-specific — Each agent does one thing. One agent detects service type. Another validates geographic coverage. Another generates quotes. No agent performs multiple unrelated operations.
Composable — Agents can be reordered, added, or removed via configuration. A tenant that does not require geographic validation can have the coverage agent removed from their chain without modifying any code.
Independent — Agents do not depend on each other's internal state. They depend only on fields in the execution context. This means agents can be tested in isolation and replaced independently.
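
The four properties above can be illustrated with a small sketch in which each agent is a pure function from context to context and the chain is a list of agent names read from configuration. Everything here is hypothetical: the `REGISTRY` mapping, the `build_chain` and `run_chain` helpers, the context keys `zip` and `service_zips`, and the toy keyword check in `detect_service` stand in for whatever a real system would use.

```python
from typing import Callable

Context = dict
Agent = Callable[[Context], Context]


def greeting(ctx: Context) -> Context:
    # Returns a new context; the input is left untouched.
    return {**ctx, "intent_signal": "service_request"}


def detect_service(ctx: Context) -> Context:
    # Toy keyword check standing in for real classification.
    text = ctx["raw_input"].lower()
    service = "gutter_cleaning" if "gutter" in text else "unknown"
    return {**ctx, "service_type": service}


def city_gate(ctx: Context) -> Context:
    covered = ctx.get("zip") in ctx.get("service_zips", ())
    return {**ctx, "coverage": "approved" if covered else "out_of_area"}


# Name-to-agent lookup so a chain can be described as configuration.
REGISTRY: dict[str, Agent] = {
    "greeting": greeting,
    "detect_service": detect_service,
    "city_gate": city_gate,
}


def build_chain(names: list[str]) -> list[Agent]:
    return [REGISTRY[name] for name in names]


def run_chain(chain: list[Agent], ctx: Context) -> Context:
    for agent in chain:
        ctx = agent(ctx)  # all state travels in the context
    return ctx
```

Dropping `"city_gate"` from the list of names removes coverage validation for that tenant without touching any agent, which is the composability property in miniature.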

Example Agent Chain

greeting → detect_service → city_gate → quote_agent → contact_capture → closer → dispatch

Agent Specifications

Greeting agent. The first agent in the chain. It establishes the interaction context, identifies the caller if possible (via caller ID lookup against the CRM), and makes an initial determination of intent. The greeting agent's output includes a preliminary intent classification and any existing customer data.

Service detection agent. Classifies the requested service from unstructured natural language input. The agent maps the caller's description to a service from the tenant's defined service catalog. For example, "I need someone to clean out my gutters" maps to gutter_cleaning. The classification is based on the tenant's configuration, not a global model, which means each tenant's service catalog can use different terminology.

Coverage validation agent (city gate). Checks whether the tenant services the caller's geographic location. The agent compares the caller's ZIP code, city, or address against the tenant's defined service areas. If the location is outside coverage, the state machine transitions to an out-of-area response rather than continuing to quoting.

Quote agent. Calculates pricing based on the detected service type and applicable business rules. The quote agent reads pricing configuration (which may include tiers, property type multipliers, seasonal adjustments, or custom rules) and generates one or more price points. Pricing logic is defined in tenant configuration as structured data, not as code.
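
As a rough illustration of pricing expressed as structured data, the sketch below looks up a base price by property type and applies an optional multiplier. The configuration shape, the key names, and the `quote` function are assumptions made for this example. The two `gutter_cleaning` prices are the ones used in the example trace later in this document, and the seasonal multiplier value is hypothetical.

```python
from decimal import Decimal

# Hypothetical pricing configuration. Prices are stored as strings so
# they can be parsed into exact Decimal values.
PRICING = {
    "gutter_cleaning": {
        "base_by_property_type": {"single_story": "149", "two_story": "249"},
        "seasonal_multiplier": "1.00",
    }
}


def quote(service_type: str, property_type: str, pricing: dict = PRICING) -> Decimal:
    """Compute a price from configuration data alone."""
    rules = pricing[service_type]
    base = Decimal(rules["base_by_property_type"][property_type])
    multiplier = Decimal(rules.get("seasonal_multiplier", "1"))
    # Round to whole cents.
    return (base * multiplier).quantize(Decimal("0.01"))
```

Changing a price or adding a property type means editing the `PRICING` data; the `quote` function stays the same.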

Contact capture agent. Collects and validates customer information: name, phone number, email, and address. The agent applies format validation (e.g., phone number normalization, ZIP code verification) and checks for duplicate records in the CRM via the integration adapter. Validated contact data is written to the execution context for downstream agents.
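
The format validation mentioned above might look something like the following. This is a minimal sketch that assumes US phone numbers and US ZIP codes only; the function names are hypothetical, and the checks confirm format, not that the number or ZIP code actually exists.

```python
import re


def normalize_us_phone(raw: str) -> str:
    """Return a US number in E.164 form (+1XXXXXXXXXX) or raise ValueError."""
    digits = re.sub(r"\D", "", raw)  # drop spaces, dashes, parentheses, "+"
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # strip the country code
    if len(digits) != 10:
        raise ValueError(f"not a 10-digit US number: {raw!r}")
    return "+1" + digits


def is_valid_zip(raw: str) -> bool:
    """Accept a 5-digit ZIP code or the ZIP+4 form."""
    return re.fullmatch(r"\d{5}(-\d{4})?", raw.strip()) is not None
```

A production system would more likely rely on a dedicated phone-number library and an address verification service; the regular expressions here only show where such checks sit in the flow.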

Booking agent (closer). Creates an appointment or work order based on the accumulated execution context. The agent checks availability via the scheduling integration adapter, proposes available time slots, and confirms the selection. The booking agent writes the confirmed appointment details to the execution context and to the CRM.

Dispatch agent. Notifies the operations team that a new job has been booked. The dispatch agent sends a structured notification (via Telegram, SMS, email, or webhook) containing the full interaction summary: customer name, service type, address, scheduled time, quoted price, and any special instructions. The notification includes all information needed to fulfill the job without requiring the team to look up additional records.

5. The Decision Engine

Between agents, a decision engine evaluates business rules to determine the correct execution path. The decision engine is not a separate layer — it operates as an interleaved evaluation mechanism between agent transitions.

The decision engine answers questions such as:

Does this tenant's service area cover the caller's location?
What pricing tier applies to this service and property type?
Is the requested time slot available, and if not, what alternatives exist?
Should this interaction be handled automatically or escalated to a human operator?
Has this caller interacted before, and if so, what is their history?
Does this service require a deposit, and if so, what amount?

These decisions are defined in configuration, not code. A tenant's business rules are expressed as structured data (typically JSON) that the decision engine interprets at runtime. This means modifying a business rule — changing a coverage area, updating a price, adding a new service — requires a configuration change, not a deployment.

The decision engine evaluates rules against the current execution context and returns a determination that the state machine uses to select the next transition. For example, if the coverage validation returns out_of_area, the state machine transitions to a polite decline response rather than continuing to quoting.
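
A heavily simplified sketch of rules-as-data follows. The rule schema (`field`, `op`, `value`, `if_true`, `if_false`), the operator names, the second ZIP code, and the $200 deposit threshold are all assumptions invented for illustration; the source describes only that rules are structured data, typically JSON, interpreted at runtime.

```python
import json

# Hypothetical rule set for one tenant, as it might appear in configuration.
RULES_JSON = """
[
  {"name": "coverage", "field": "zip", "op": "in",
   "value": ["75001", "75002"],
   "if_true": "approved", "if_false": "out_of_area"},
  {"name": "deposit", "field": "quoted_price", "op": "gte",
   "value": 200,
   "if_true": "deposit_required", "if_false": "no_deposit"}
]
"""

OPERATORS = {
    "in": lambda actual, expected: actual in expected,
    "eq": lambda actual, expected: actual == expected,
    "gte": lambda actual, expected: actual >= expected,
}


def evaluate(rule: dict, ctx: dict) -> str:
    """Evaluate one rule against the execution context."""
    if rule["field"] not in ctx:
        # Assumption for this sketch: a missing field counts as a failed check.
        return rule["if_false"]
    passed = OPERATORS[rule["op"]](ctx[rule["field"]], rule["value"])
    return rule["if_true"] if passed else rule["if_false"]


def evaluate_all(rules: list[dict], ctx: dict) -> dict[str, str]:
    return {rule["name"]: evaluate(rule, ctx) for rule in rules}
```

For a context such as `{"zip": "75001", "quoted_price": 249}`, `evaluate_all(json.loads(RULES_JSON), ctx)` yields a determination per rule that a state machine could use to pick the next transition.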

6. Integration Adapters

Each external system is accessed through an adapter — a standardized interface that translates between the internal execution context and the external system's API. The adapter pattern provides a consistent abstraction regardless of the underlying service.

Adapter	Function	External Systems
CRM adapter	Creates and updates customer records, queries existing data, manages lead lifecycle	Firestore, Salesforce, HubSpot
Payments adapter	Creates checkout sessions, processes charges, issues refunds, manages payment holds	Stripe
Scheduling adapter	Books appointments, queries availability, manages cancellations and rescheduling	Google Calendar, custom scheduling
Communications adapter	Sends SMS, initiates calls, delivers messages across channels	Twilio, Telegram, WhatsApp
Notification adapter	Alerts team members, pushes updates to dashboards, sends operational summaries	Telegram, email, webhooks
Voice adapter	Manages real-time speech-to-text and text-to-speech pipelines	Deepgram, ElevenLabs

The adapter architecture enforces an important constraint: agents never interact directly with external APIs. All external access flows through an adapter. This means adding a new integration requires writing an adapter that conforms to the standard interface — it does not require modifying the core execution engine, any agent, or the state machine.

Adapters are loaded via a factory pattern. At startup, the system reads the tenant configuration, identifies which adapters are required, and instantiates them with the tenant's credentials. If one tenant uses Firestore for CRM and another uses Salesforce, both get the same CRM adapter interface; the factory selects the correct implementation based on configuration.
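
The factory idea can be sketched as below. To avoid implying anything about real vendor SDKs, both CRM classes here are in-memory stubs: `FirestoreCRM` and `SalesforceCRM` make no network calls, and the `create_customer` method, the configuration shape, and `make_crm_adapter` are hypothetical names chosen for the example.

```python
from abc import ABC, abstractmethod


class CRMAdapter(ABC):
    """Standard interface that agents code against."""

    @abstractmethod
    def create_customer(self, record: dict) -> str:
        """Store a customer record and return its ID."""


class _StubCRM(CRMAdapter):
    """In-memory stand-in; a real adapter would call the vendor API here."""

    prefix = "stub"

    def __init__(self, credentials: dict):
        self.credentials = credentials
        self._records: dict[str, dict] = {}

    def create_customer(self, record: dict) -> str:
        record_id = f"{self.prefix}-{len(self._records) + 1}"
        self._records[record_id] = dict(record)
        return record_id


class FirestoreCRM(_StubCRM):
    prefix = "fs"


class SalesforceCRM(_StubCRM):
    prefix = "sf"


CRM_IMPLEMENTATIONS = {"firestore": FirestoreCRM, "salesforce": SalesforceCRM}


def make_crm_adapter(tenant_config: dict) -> CRMAdapter:
    """Pick the CRM implementation named in the tenant's configuration."""
    crm = tenant_config["crm"]
    try:
        implementation = CRM_IMPLEMENTATIONS[crm["provider"]]
    except KeyError:
        raise ValueError(f"unsupported CRM provider: {crm['provider']!r}") from None
    return implementation(crm["credentials"])
```

Callers only ever see `CRMAdapter`, so supporting another CRM would mean adding one class and one entry in `CRM_IMPLEMENTATIONS`.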

7. The Confirmation Layer

Every execution ends with a confirmation phase. The confirmation layer ensures that all parties — the customer, the operations team, and the system of record — have been notified of the outcome.

Confirmation includes four mandatory operations:

Customer confirmation — The customer receives a confirmation of the action taken via their original channel (or a specified alternate). For a phone call, this may be a verbal confirmation before the call ends plus an SMS summary. For an API request, this is a structured response payload.
Operations notification — The operations team receives a notification containing the full interaction context: customer information, service details, scheduling, pricing, and any special instructions. This notification contains all information needed to fulfill the work.
CRM record update — The customer record in the CRM is updated with the interaction outcome, including a transcript or summary, the service booked, the price quoted, and the appointment time.
Execution trace logging — The system logs the complete execution trace for audit purposes: every state transition, every agent invocation, every external API call, and every decision engine evaluation. This trace provides a complete, reproducible record of the execution.
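
As an illustration of the last operation, an execution trace can be modeled as an append-only list of events serialized one JSON object per line. The `ExecutionTrace` class, its method names, and the event field names (`seq`, `at`, `kind`) are assumptions for this sketch and do not describe an actual log format.

```python
import json
from datetime import datetime, timezone


class ExecutionTrace:
    """Append-only record of what happened during one interaction."""

    def __init__(self, interaction_id: str, tenant_id: str):
        self.interaction_id = interaction_id
        self.tenant_id = tenant_id
        self.events: list[dict] = []

    def record(self, kind: str, **details) -> None:
        # kind might be "state_transition", "agent_invocation",
        # "external_call", or "decision".
        self.events.append({
            "seq": len(self.events) + 1,
            "at": datetime.now(timezone.utc).isoformat(),
            "kind": kind,
            **details,
        })

    def to_jsonl(self) -> str:
        """Serialize the trace as one JSON object per line."""
        header = {"interaction_id": self.interaction_id, "tenant_id": self.tenant_id}
        return "\n".join(
            json.dumps({**header, **event}, sort_keys=True) for event in self.events
        )
```

Tagging every line with the tenant and interaction identifiers keeps each entry meaningful on its own when traces from many interactions are stored together.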

The confirmation layer is not optional. The state machine does not transition to a terminal state until all confirmation operations have completed. If a confirmation operation fails (e.g., the SMS fails to send), the system retries with exponential backoff or escalates to an alternate channel.
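
The retry behavior could be sketched as follows. This version combines the two strategies by retrying the primary channel with exponential backoff and then trying an alternate channel once; that ordering, the attempt count, the base delay, and the function name are assumptions for the example. The `sleep` parameter is injectable so the logic can be exercised without real waiting.

```python
import time
from typing import Callable


def confirm_with_retry(
    send_primary: Callable[[], bool],
    send_alternate: Callable[[], bool],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return "primary", "alternate", or "failed"."""
    for attempt in range(max_attempts):
        if send_primary():
            return "primary"
        if attempt < max_attempts - 1:
            # The delay doubles after each failed attempt: 0.5s, 1s, 2s, ...
            sleep(base_delay * 2 ** attempt)
    # Primary channel exhausted; try the alternate channel once.
    if send_alternate():
        return "alternate"
    return "failed"
```

A caller would treat a `"failed"` result as a reason to keep the interaction out of its terminal state, for example by escalating to a human operator.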

8. Multi-Tenancy Architecture

Execution infrastructure is inherently multi-tenant. The architecture is designed to serve multiple independent businesses (tenants) on a shared execution engine. Each tenant receives:

Complete namespace isolation — Each tenant's data, configuration, and execution traces are isolated. No tenant can access another tenant's customer records, configuration, or execution history.
Independent configuration — Each tenant defines its own services, pricing rules, coverage areas, agent scripts, escalation policies, and notification preferences. These are stored as structured configuration data.
Shared execution engine — All tenants run on the same agent chain, the same state machine, the same decision engine, and the same integration adapters. The behavior differs because the configuration differs, not the code.
Per-tenant integration credentials — Each tenant provides its own API keys and credentials for external systems. A tenant's Stripe account, Twilio number, and CRM instance are configured independently.

Adding a new tenant requires creating a configuration file. The configuration specifies the tenant's services, pricing, coverage areas, integration credentials, notification preferences, and any custom agent parameters. No code is deployed. No infrastructure is provisioned. The execution engine reads the new configuration and begins serving the tenant immediately.
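
A toy model of this onboarding flow is shown below. The `TenantRegistry` and `NamespacedStore` classes, their methods, and the second tenant name `acme_health` are hypothetical; the sketch only shows how registering configuration and keying records by tenant could work, under the assumption that configuration arrives as a JSON document containing a `tenant_id`.

```python
import json


class TenantRegistry:
    """Holds one configuration document per tenant."""

    def __init__(self):
        self._configs: dict[str, dict] = {}

    def register(self, config_json: str) -> str:
        config = json.loads(config_json)
        tenant_id = config["tenant_id"]
        if tenant_id in self._configs:
            raise ValueError(f"tenant already registered: {tenant_id}")
        self._configs[tenant_id] = config
        return tenant_id

    def config_for(self, tenant_id: str) -> dict:
        return self._configs[tenant_id]


class NamespacedStore:
    """Records are keyed by (tenant_id, record_id), never by record_id alone."""

    def __init__(self):
        self._data: dict[tuple[str, str], dict] = {}

    def put(self, tenant_id: str, record_id: str, record: dict) -> None:
        self._data[(tenant_id, record_id)] = record

    def list_for(self, tenant_id: str) -> dict[str, dict]:
        # Only the named tenant's records are ever returned.
        return {
            record_id: record
            for (owner, record_id), record in self._data.items()
            if owner == tenant_id
        }
```

Registering a second tenant is a single `register` call with a new JSON document, and two tenants can reuse the same record ID without seeing each other's data.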

This is the same architectural pattern used by cloud infrastructure providers: the platform is shared, the configuration is isolated, and each tenant experiences the system as though it were purpose-built for their business.

9. Execution Trace — A Complete Example

The following trace illustrates a complete execution from interaction receipt to confirmation. Each step shows the layer and agent involved.

Interaction received

Customer calls +1-972-XXX-XXXX at 2:14 PM CST. The interaction layer receives the inbound call via Twilio, initiates the Deepgram speech-to-text stream, and constructs the execution context: channel: voice, tenant_id: vossome, caller_id: +1-555-XXX-XXXX.

Router activates

The state machine enters call_start and immediately transitions to greeting. The router loads the tenant configuration for vossome and selects the appropriate agent chain.

Greeting agent executes

The greeting agent introduces the service and identifies the caller's intent from the initial transcription: "I need gutter cleaning." The intent signal is updated to service_request. The caller ID is checked against the CRM, and no existing record is found.

Service detection

The service detection agent classifies the request against the tenant's service catalog. The unstructured input "gutter cleaning" is mapped to the service identifier gutter_cleaning. The execution context is updated with service_type: gutter_cleaning.

Coverage validation

The city gate agent asks for the caller's ZIP code and checks it against the tenant's defined service areas. ZIP code 75001 is within the Dallas-Fort Worth coverage zone. Decision engine returns coverage: approved. State machine transitions to quoting.

Quote generation

The quote agent reads the pricing configuration for gutter_cleaning. Pricing rules specify: single-story residence $149, two-story residence $249. The agent asks the caller for property type, receives "two-story," and generates the quote: $249. The execution context is updated with quoted_price: 249.

Contact capture

The contact capture agent collects the caller's name, confirms their phone number, and requests their street address. All fields are validated: phone number is normalized to E.164 format, ZIP code matches the previously validated coverage area. A duplicate check against the CRM confirms this is a new customer.

Booking

The closer agent queries the scheduling adapter for available time slots. The next available slot is Thursday at 10:00 AM. The caller confirms. The scheduling adapter creates the appointment. The execution context is updated with appointment_id and appointment_time.

Dispatch

The dispatch agent sends a Telegram notification to the operations team containing: customer name, phone number, address, service type (gutter cleaning, two-story), quoted price ($249), appointment time (Thursday 10:00 AM), and a link to the full CRM record. The notification includes all information needed to prepare for the job.

Customer confirmation

The confirmation layer sends an SMS to the customer: "Your gutter cleaning appointment is confirmed for Thursday at 10:00 AM. Quoted price: $249. Reply HELP for assistance." The verbal confirmation is also delivered before the call ends.

CRM record creation

The CRM adapter creates a new customer record containing: contact information, interaction transcript, service requested, quote provided, appointment details, and disposition (booked). The record is tagged with the tenant ID for namespace isolation.

Execution complete

The state machine transitions to terminal. Total elapsed time: 3 minutes 22 seconds. Agents invoked: 7. External integrations accessed: 7, across six providers (Twilio voice, Deepgram, ElevenLabs, Firestore, Google Calendar, Telegram, Twilio SMS). Decision engine evaluations: 4. All confirmation operations completed successfully.

This trace demonstrates the key property of execution infrastructure: a single interaction (one phone call) results in coordinated action across seven external integrations, with every step governed by deterministic state transitions and tenant-specific configuration. No human intervention was required at any point in the execution.

Proxis is one implementation of this architectural pattern, currently deployed in production environments for service businesses.