Runtime
A deterministic execution engine for AI workflows.
The AINL runtime, implemented in runtime/engine.py, takes compiled graphs and executes them with strict state discipline, capability grants, and clear audit trails. The runner service wraps this engine behind policy-gated APIs like /run and /capabilities.
1. GET /capabilities — discover adapters, verbs, privilege tiers
2. POST /run — submit AINL source or compiled IR
3. Policy gate — optional policy object validated before execution
4. RuntimeEngine — executes graph nodes deterministically
5. Record / replay — optional call logging for audit & debugging
Engine
RuntimeEngine: graph-first semantics.
The core runtime is the RuntimeEngine in runtime/engine.py. It owns step execution, state updates, adapter calls, and record/replay — separate from any particular server or UI.
Deterministic graph execution
The engine executes nodes in the compiled graph IR in a predictable order, with explicit jumps and control flow. Given the same graph, inputs, and adapter configuration, you get the same behavior every time.
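As a sketch, deterministic execution can be pictured as a loop over IR nodes in which the only control flow is an explicit jump. The node shapes and op names below are illustrative, not the actual AINL IR:

```python
# Illustrative sketch only: node fields ("id", "op", "var", "target")
# are hypothetical, not the compiled graph IR used by runtime/engine.py.
def run_graph(nodes, state):
    """Execute nodes in order, following explicit jumps only."""
    index = {node["id"]: i for i, node in enumerate(nodes)}
    i = 0
    while i < len(nodes):
        node = nodes[i]
        if node["op"] == "set":
            state[node["var"]] = node["value"]
            i += 1
        elif node["op"] == "jump_if":
            # Control flow is explicit: the jump target is named in the IR,
            # so the same graph and inputs always take the same path.
            i = index[node["target"]] if state.get(node["cond"]) else i + 1
        elif node["op"] == "halt":
            break
        else:
            raise ValueError(f"unknown op: {node['op']}")
    return state
```

Because every transition is spelled out in the graph, running the same node list against the same starting state yields the same final state every time.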
Tiered state discipline
AINL distinguishes between in-graph variables, cache, persistent storage, and coordination state (queues/mailboxes). Adapters expose these tiers explicitly so you can reason about where your data lives.
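The four tiers could be modeled roughly like this; the class and field names are illustrative, and the real adapter interfaces in runtime/engine.py may differ:

```python
from dataclasses import dataclass, field
from collections import deque

# Hypothetical model of the four state tiers described above.
@dataclass
class StateTiers:
    variables: dict = field(default_factory=dict)   # in-graph, per-run
    cache: dict = field(default_factory=dict)       # ephemeral, evictable
    persistent: dict = field(default_factory=dict)  # survives across runs
    mailbox: deque = field(default_factory=deque)   # coordination (queues)
```

Keeping the tiers as separate, named structures is what lets an adapter declare up front which tier it reads or writes.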
Record and replay
The runner can optionally record adapter calls and results, then replay them against the same graph for debugging, audits, or regression tests — documented in the integration and architecture docs.
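A minimal sketch of the idea, assuming a simple call/result adapter interface; the runner's actual log format is defined in the integration and architecture docs:

```python
# Hedged sketch: "call(verb, **args)" is an assumed adapter interface,
# not the documented one.
class RecordingAdapter:
    """Wraps a real adapter and logs every call and result."""
    def __init__(self, adapter):
        self.adapter = adapter
        self.log = []

    def call(self, verb, **args):
        result = self.adapter.call(verb, **args)
        self.log.append({"verb": verb, "args": args, "result": result})
        return result

class ReplayAdapter:
    """Replays a recorded log instead of touching the real adapter."""
    def __init__(self, log):
        self._entries = iter(log)

    def call(self, verb, **args):
        entry = next(self._entries)
        # A replayed run must issue the same calls in the same order.
        assert entry["verb"] == verb and entry["args"] == args
        return entry["result"]
```

Replaying the log against the same graph reproduces the original run without re-invoking any external systems, which is what makes it useful for audits and regression tests.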
Runner service
/run and /capabilities, as real APIs.
The FastAPI runner in scripts/runtime_runner_service.py exposes the runtime over HTTP: synchronous execution with /run, queued workloads with /enqueue and /result, capability discovery with /capabilities, and health/metrics endpoints.
| Endpoint | Verb | Purpose |
|---|---|---|
| /capabilities | GET | Return supported adapters, verbs, effect defaults, privilege tiers, and policy support. |
| /run | POST | Compile (if needed), validate policy, execute a workflow synchronously, and return structured output. |
| /enqueue | POST | Submit a workflow for async execution; returns an ID for polling. |
| /result/{id} | GET | Fetch the result of an async run by ID. |
| /health, /ready | GET | Simple liveness and readiness probes for orchestration and load-balancers. |
| /metrics | GET | Prometheus-style metrics for latency, errors, and adapter usage. |
Typical integration pattern
- Query GET /capabilities to understand which adapters and privilege tiers a given runtime instance supports.
- Compile or author an AINL program, then submit it to POST /run with an optional policy and adapter allowlist.
- For higher-throughput or long-running workflows, use /enqueue and poll /result.
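The submission step amounts to building a JSON body for POST /run. The field names below (source, policy, allow_adapters, forbid_tiers) are assumptions for illustration, not the documented request schema:

```python
import json

# Hypothetical /run request builder; adjust field names to the real schema.
def build_run_request(source, allowed_adapters=None, forbid_tiers=None):
    policy = {}
    if allowed_adapters is not None:
        policy["allow_adapters"] = sorted(allowed_adapters)
    if forbid_tiers is not None:
        policy["forbid_tiers"] = sorted(forbid_tiers)
    body = {"source": source}
    if policy:
        # The policy is optional; omit it entirely when nothing is restricted.
        body["policy"] = policy
    return json.dumps(body)
```

The same body shape would be POSTed to /enqueue for async execution, with the returned ID polled via /result/{id}.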
Capability grants
Named security profiles at startup.
At startup, each runtime surface loads a server-level capability grant from a named security profile, such as local_minimal or sandbox_network_restricted. Requests can only tighten these rules, never widen them.
Server grants
The runner reads AINL_SECURITY_PROFILE at startup and loads a named profile from tooling/security_profiles.json using the capability grant model. This grant defines which adapters, effect tiers, and privilege tiers are even possible.
Restrictive-only merge
When callers attach a policy object to /run, the host merges that policy with the server grant using a restrictive-only rule: callers can forbid more adapters or tiers, but can never escape the server's base restrictions.
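A minimal sketch of the restrictive-only rule, with illustrative field names (the real grant model lives in docs/operations/CAPABILITY_GRANT_MODEL.md):

```python
# Sketch: a request policy can only shrink what the server grant allows,
# never widen it. Field names here are assumptions.
def merge_policies(server_grant, request_policy):
    allowed = set(server_grant["allow_adapters"])
    forbidden = set(request_policy.get("forbid_adapters", ()))
    # Take the lower of the two tier ceilings: callers can lower the
    # ceiling but never raise it above the server grant.
    max_tier = min(
        server_grant["max_privilege_tier"],
        request_policy.get("max_privilege_tier",
                           server_grant["max_privilege_tier"]),
    )
    return {
        "allow_adapters": sorted(allowed - forbidden),
        "max_privilege_tier": max_tier,
    }
```

Set difference and min are both monotone in the restrictive direction, which is exactly the property the merge needs: no combination of caller inputs can produce a grant broader than the server's.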
Privilege-aware capabilities
Adapter metadata (e.g. destructive, network_facing, sandbox_safe, privilege tier) is exposed via /capabilities. Orchestrators can construct policies like "forbid all destructive adapters" without hard-coding adapter names.
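For example, a policy like "forbid all destructive adapters" could be derived from the metadata directly. The flag keys mirror those mentioned above, but the exact /capabilities response schema here is an assumption:

```python
# Hypothetical shape of a /capabilities response: a map from adapter name
# to its metadata flags.
def forbid_by_flag(capabilities, flag):
    """Build a policy forbidding every adapter whose metadata sets `flag`."""
    return {
        "forbid_adapters": sorted(
            name
            for name, meta in capabilities["adapters"].items()
            if meta.get(flag)
        )
    }
```

Because the policy is computed from flags rather than names, it stays correct when new adapters are added to the runtime.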
For full details, see docs/operations/CAPABILITY_GRANT_MODEL.md and docs/operations/SANDBOX_EXECUTION_PROFILE.md.
Runtime cost profile
In AINL benchmarks, complex workflows compile once (tens of thousands of tokens) and then run at roughly fixed cost per execution. Case studies like HOW_AINL_SAVES_MONEY highlight scenarios where moving orchestration into the runtime produces order-of-magnitude savings compared to re-running prompt loops for every invocation.
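As a back-of-envelope illustration with made-up numbers (not taken from the benchmarks): a one-time compile cost is amortized across runs, while a prompt loop pays its full cost on every invocation.

```python
# Hypothetical figures for illustration only: a 50k-token compile with
# 500-token runs, versus a 20k-token prompt loop paid on every run.
def total_tokens(one_time, per_run, runs):
    return one_time + per_run * runs

compiled = total_tokens(50_000, 500, runs=1_000)     # 550_000 tokens
prompt_loop = total_tokens(0, 20_000, runs=1_000)    # 20_000_000 tokens
```

Under these assumed figures the compiled workflow is roughly 36x cheaper over a thousand runs, and the gap keeps widening as the run count grows.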
Put simply: the runtime keeps your workflows cheap and predictable once they're compiled, while still enforcing strict capability and policy boundaries.
