Public · Technical Marketing

Architecture. The AST compiler that ate the SWE-agent.

The Theory of Complete Compute (TOCC) — Atomadic's compiler core — replaces text-generation with structured data-flow compilation. Five mathematically-verified gates, collision-free Leech-lattice routing, deterministic emission, hash-chained lineage. This is the architecture procurement teams build a contract on.

● report:tocc:trust_and_integrity_gates_v1 ● report:tocc:comparative_analysis_v1 ● compiler_hardening_road_map updated · 2026-06-12

Contents

§1 The paradigm shift
§2 SOTA comparative matrix
§3 The five verification gates
§4 Lineage & antibody quarantine
§5 Current benchmark receipt
§6 Hardening roadmap

§1 The paradigm shift

Modern SOTA software-engineering agents — Devin, SWE-agent, AutoGPT, typical multi-agent orchestrators — operate on a generative text-writing model: prompt → LLM → text → file → test → repair-loop. Slow, expensive, and probabilistically unreliable.

TOCC changes the paradigm. Logic is specified as structured JSON-IR; the local AST emitter produces source code deterministically; an auto-gate verifies the result before any byte touches disk. The LLM never sees source. The artifact is a projection of a contract, not the output of a generator.

[SOTA SWE-Agent Flow]
  User Request -> LLM -> Text Generation (LoC) -> File Write -> Test Run -> Repair Loop
                                                                            (slow / expensive)

[TOCC Compiler Flow]
  User Request -> LLM -> Logic Specification (JSON-IR) -> Local AST Emitter -> Compiled Source + Auto-gate
                                                                              (fast / cheap)
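
To make the second flow concrete, here is a minimal sketch of deterministic emission using Python's standard `ast` module. The JSON-IR field names (`name`, `params`, `compose`, `imports`) and the example atom are hypothetical, chosen only for illustration; the actual TOCC schema is not shown on this page.

```python
import ast

# Hypothetical JSON-IR for one atom. Field names are assumptions for this
# sketch, not the real TOCC schema.
SPEC = {
    "name": "compute_total_pure",
    "params": ["unit_price", "quantity"],
    "imports": {"operator": ["mul"]},
    "compose": {"call": "mul", "args": ["unit_price", "quantity"]},
}


def build_expr(node):
    """Turn an IR node (a parameter name or a nested call) into an ast expression."""
    if isinstance(node, str):
        return ast.Name(id=node, ctx=ast.Load())
    return ast.Call(
        func=ast.Name(id=node["call"], ctx=ast.Load()),
        args=[build_expr(arg) for arg in node["args"]],
        keywords=[],
    )


def emit_source(spec):
    """Project a spec onto Python source; the same spec always yields the same text."""
    import_stmts = [
        ast.ImportFrom(
            module=module,
            names=[ast.alias(name=n, asname=None) for n in names],
            level=0,
        )
        for module, names in sorted(spec["imports"].items())
    ]
    func = ast.FunctionDef(
        name=spec["name"],
        args=ast.arguments(
            posonlyargs=[],
            args=[ast.arg(arg=p, annotation=None, type_comment=None) for p in spec["params"]],
            vararg=None,
            kwonlyargs=[],
            kw_defaults=[],
            kwarg=None,
            defaults=[],
        ),
        body=[ast.Return(value=build_expr(spec["compose"]))],
        decorator_list=[],
        returns=None,
        type_comment=None,
        type_params=[],  # only meaningful on Python 3.12+
    )
    module = ast.Module(body=import_stmts + [func], type_ignores=[])
    source = ast.unparse(ast.fix_missing_locations(module))  # Python 3.9+
    compile(source, "<emitted>", "exec")  # stand-in gate: must compile before any write
    return source


print(emit_source(SPEC))
```

The only free-form input here is the spec; the source text is derived from it mechanically, which is the property the flow diagram describes.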

§2 SOTA comparative matrix

The vectors a procurement team will compare, side-by-side:

Architectural vector   | Current SOTA SWE-agents              | TOCC AST-compiler engine
Generation medium      | Heuristic text streams (O(N) LoC)    | Declarative AST JSON-IR mappings
Token cost             | 5M – 40M tokens for larger sweeps    | ~12,000 tokens / 200+ files
Generation speed       | Slow, bound by LLM output (hours)    | Compiled locally in seconds
Syntactic reliability  | 1.5–3% syntax / import drift         | 0% syntax drift; AST-verified before write
Namespace routing      | Vector-DB heuristics or grep         | Strict O(1) CNAE namespace addressing
Validation gate        | Heuristic test-output parsing        | Strict static name scans + import verification

§3 The five verification gates

Every candidate atom passes five gates before a single byte hits disk. Failure at any gate routes the candidate to the heal lane, not to the codebase.

Gate 1

CNAE Grammar & Verb Classifier · classify_name_static

The proposed name must validate against the frozen glossary: 54 frozen actions × 12 frozen scopes. Verbal invariant + scope invariant. Any ungrammatical name (e.g. organize_gmail_inbox) is immediately blocked at the boundary.
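
A minimal sketch of this kind of boundary check, under stated assumptions: the frozen glossary itself is not published here, so the tiny action and scope sets below are a hypothetical excerpt built from identifiers that appear on this page, and the `verb_..._scope` name shape is likewise an assumption.

```python
# Hypothetical excerpt only: the real glossary has 54 actions and 12 scopes.
FROZEN_ACTIONS = frozenset({"classify", "assess", "enforce", "analyze"})
FROZEN_SCOPES = frozenset({"static", "pure", "stateful"})


def classify_name(name: str) -> tuple[bool, str]:
    """Accept a name only if it starts with a frozen action and ends with a frozen scope."""
    parts = name.split("_")
    if len(parts) < 3:
        return False, "name needs a verb, a subject, and a scope"
    verb, scope = parts[0], parts[-1]
    if verb not in FROZEN_ACTIONS:
        return False, f"verb '{verb}' is not a frozen action"
    if scope not in FROZEN_SCOPES:
        return False, f"scope '{scope}' is not a frozen scope"
    return True, "ok"


print(classify_name("classify_name_static"))
print(classify_name("organize_gmail_inbox"))
```

Because both sets are frozen, the check is a pair of set lookups and needs no model call.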

Gate 2

Structural Composition Invariant Scan · assess_functional_py_pure

Pure compositional tiers (T0–T2) are prohibited from containing if, for, while, or try nodes. Complex logic resolves exclusively as flat compositions over the locked 9-Atom Combinator Seed. The check is stateless and deterministic, and it runs at compile time.
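
A scan like this can be sketched with the standard `ast` module. The function name below is hypothetical, and the sketch covers only the four statement types named above; related nodes such as conditional expressions or comprehensions are not checked.

```python
import ast

# The four statement node types named in the gate description.
PROHIBITED_NODES = (ast.If, ast.For, ast.While, ast.Try)


def find_prohibited_nodes(source: str) -> list[tuple[str, int]]:
    """Return (node type, line number) for every prohibited statement in the source."""
    tree = ast.parse(source)
    return [
        (type(node).__name__, node.lineno)
        for node in ast.walk(tree)
        if isinstance(node, PROHIBITED_NODES)
    ]


flat = "def compute_area_pure(w, h):\n    return mul(w, h)\n"
branchy = "def compute_area_pure(w, h):\n    if w < 0:\n        return 0\n    return w * h\n"

print(find_prohibited_nodes(flat))
print(find_prohibited_nodes(branchy))
```

The scan parses the candidate but never runs it, so the result depends only on the source text.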

Gate 3

Isolated Subprocess Import Gate · enforce_executability_stateful

Newly written code is imported inside an isolated subprocess with a strict execution timeout (5.0s). Catches syntax errors, unresolved sibling imports, and call-time NameErrors without executing the function body. Zero hang vulnerabilities.
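
The import step can be sketched with `subprocess.run` and its `timeout` parameter. The helper name and its arguments are hypothetical. This sketch covers only what an import can observe (syntax errors, unresolved imports, module-level failures, hangs); module-level code does run in the child process, and names referenced only inside function bodies would need a separate static scan.

```python
import subprocess
import sys

# Child-process program: put the candidate's directory on sys.path, then import it.
_IMPORT_SNIPPET = (
    "import importlib, sys; "
    "sys.path.insert(0, sys.argv[1]); "
    "importlib.import_module(sys.argv[2])"
)


def import_in_subprocess(module_name: str, search_path: str, timeout_s: float = 5.0):
    """Import a module in a fresh interpreter; return (ok, reason)."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", _IMPORT_SNIPPET, search_path, module_name],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed if it exceeds this
        )
    except subprocess.TimeoutExpired:
        return False, "import timed out"
    if result.returncode != 0:
        lines = result.stderr.strip().splitlines()
        return False, lines[-1] if lines else "import failed"
    return True, "ok"
```

Running the import in a separate interpreter keeps a broken or hanging candidate from affecting the compiler process itself.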

Gate 4

Dual-Hash Cryptographic Seal

SHA256(AST_IR_Structure) binds the physical bytes to the contract structure. SHA256(CNAE + Contract + Semantics + Composition) binds the function's identity to its semantics. Any unauthorized manual edit triggers parity failure and locks the gate. High-security promotion gates additionally use ML-DSA (FIPS 204) post-quantum signatures.
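
A minimal sketch of a two-hash seal with `hashlib`. The canonical-JSON serialization, the way the four identity inputs are combined, and all function and field names are assumptions for illustration; the ML-DSA signature step is not shown.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    """Serialize with sorted keys and fixed separators so equal data gives equal bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def compute_seal(ast_ir, cnae_name, contract, semantics, composition) -> dict:
    """Return the structure hash and the identity hash for one atom."""
    structure_hash = hashlib.sha256(canonical_bytes(ast_ir)).hexdigest()
    identity_hash = hashlib.sha256(
        canonical_bytes([cnae_name, contract, semantics, composition])
    ).hexdigest()
    return {"structure": structure_hash, "identity": identity_hash}


def seal_is_intact(stored_seal, ast_ir, cnae_name, contract, semantics, composition) -> bool:
    """Recompute both hashes and compare them with the stored seal."""
    return stored_seal == compute_seal(ast_ir, cnae_name, contract, semantics, composition)


ir = {"call": "mul", "args": ["w", "h"]}
seal = compute_seal(ir, "compute_area_pure", {"returns": "number"}, "area of a rectangle", ["mul"])

edited_ir = {"call": "add", "args": ["w", "h"]}  # simulated manual edit
print(seal_is_intact(seal, ir, "compute_area_pure", {"returns": "number"}, "area of a rectangle", ["mul"]))
print(seal_is_intact(seal, edited_ir, "compute_area_pure", {"returns": "number"}, "area of a rectangle", ["mul"]))
```

Keeping the two hashes separate lets a parity failure say whether the structure changed, the identity changed, or both.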

Gate 5

Leech Lattice Coordinate Seal · Λ24

All 2,350+ tools are placed as coordinates in the optimal 24-dimensional Leech lattice using Golay error-correction. Every computational asset has an exact address. The result: collision-free O(1) nearest-neighbor tool discovery without loading AST trees into runtime memory.

EMIT

Emitted production atom

Block written, lineage stamped, signed receipt appended to the hash-chained ledger. The audit isn't an afterthought; it's the protocol.
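
The hash-chained ledger can be sketched as an append-only list in which each entry commits to the hash of the one before it. Entry fields, the all-zero genesis value, and the function names are assumptions for this sketch, and receipt signing is left out.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder for the first entry's predecessor


def _entry_hash(prev_hash: str, payload: dict) -> str:
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_receipt(ledger: list, payload: dict) -> dict:
    """Append a receipt that commits to the hash of the previous entry."""
    prev_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    entry = {"prev": prev_hash, "payload": payload, "hash": _entry_hash(prev_hash, payload)}
    ledger.append(entry)
    return entry


def verify_chain(ledger: list) -> bool:
    """Recompute every hash and check that each entry points at its predecessor."""
    prev_hash = GENESIS_HASH
    for entry in ledger:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(entry["prev"], entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


ledger = []
append_receipt(ledger, {"atom": "compute_area_pure", "status": "emitted"})
append_receipt(ledger, {"atom": "compute_total_pure", "status": "emitted"})
print(verify_chain(ledger))

ledger[0]["payload"]["status"] = "edited after the fact"
print(verify_chain(ledger))
```

Editing any earlier receipt changes its hash, so the mismatch shows up during verification.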

§4 Lineage & antibody quarantine

Every emitted block carries an immutable lineage record. The previous_version_id field traces the direct evolutionary chain — auditors can replay it backward; engineers can prove provenance forward.
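
Replaying a chain backward amounts to following `previous_version_id` until it runs out. In the sketch below only that field name comes from this page; the record store, the version ids, and the function name are hypothetical.

```python
def trace_lineage(records: dict, version_id: str) -> list:
    """Walk previous_version_id links from a version back to its first ancestor."""
    chain = []
    seen = set()
    current = version_id
    while current is not None:
        if current in seen:
            raise ValueError(f"lineage cycle detected at {current}")
        seen.add(current)
        chain.append(current)
        current = records[current]["previous_version_id"]
    return chain


# Hypothetical lineage records keyed by version id.
records = {
    "v1": {"previous_version_id": None},
    "v2": {"previous_version_id": "v1"},
    "v3": {"previous_version_id": "v2"},
}

print(trace_lineage(records, "v3"))
```

Reversing the returned list gives the same history oldest-first, which is the forward provenance view.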

If an atom fails validation during an active flywheel run, the system does not crash. The record_synthesis_replaced ledger quarantines the block, logs the validation defect, and signals the compiler's heal lane to regenerate from contract. The antibody pattern: every failure makes the engine stronger, never weaker.

This shifts verification from runtime integration testing to compile-time AST invariants. It makes it structurally impossible for syntactically broken, unauthenticated, or drifting code to exist within the system.

§5 Current benchmark receipt

A sandbox-gated parallel-emit sweep, executed under the new isolated-subprocess discipline (Gate 3). Numbers below are the actual run, not a slide.

Sovereign flywheel hardening sweep

receipt:tocc:workspace_verification_v7 · 2026-06-12T07:25:00Z · generated_from_dataflow
total candidates scanned          200
skipped (pre-existing on disk)    171
compiled & emitted in parallel    29
wired successfully                27 pipelines
failed / compile-gapped           2 pipelines
total elapsed                     59.28 seconds
throughput                        29.4 files / min (2.5× SOTA)
total tokens consumed             ~12,000 (99.96% vs traditional swarm)
syntactic drift                   0%
hang vulnerabilities              0 (subprocess sandbox timeout 5.0s)
validation invariant              100% passed

Comparable traditional multi-agent swarm: ~35M tokens, ~18 hours, ~11.5 files/min, high syntax-drift risk.
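
The derived figures in the receipt can be re-checked from the raw numbers. The short script below uses only values quoted in this section; the variable names are just labels for them.

```python
# Raw figures quoted in the receipt and the comparison line.
files_emitted = 29
elapsed_seconds = 59.28
tocc_tokens = 12_000
swarm_tokens = 35_000_000
swarm_files_per_min = 11.5

files_per_min = files_emitted / elapsed_seconds * 60
speedup = files_per_min / swarm_files_per_min
token_reduction = 1 - tocc_tokens / swarm_tokens

print(f"throughput:      {files_per_min:.1f} files/min")
print(f"speedup:         {speedup:.1f}x")
print(f"token reduction: {token_reduction:.2%}")
```

With standard rounding these come out to 29.4 files/min, about 2.6× and 99.97%. The receipt's 2.5× and 99.96% are the same values truncated rather than rounded.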

§6 Hardening roadmap

The architectural evolution to scale the engine to multi-agent swarm orchestration and absolute execution safety. Each step compounds the dominance; none weakens it.

Step 1 · Active

Abstract Syntax Graph (ASG) transformation

Move from hierarchical AST to ASG — capture definition-use chains and control-flow edges natively in JSON-IR. New pass analyze_semantic_relations_pure builds the graph; gaps caught pre-render.
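
As a rough illustration of definition-use edges, the sketch below links each name read to the line where it was last assigned. It handles only straight-line, module-level assignments; the function name and return shape are hypothetical, and it says nothing about how analyze_semantic_relations_pure actually works.

```python
import ast


def build_def_use_edges(source: str):
    """Return (edges, gaps) for straight-line code.

    edges: (name, definition line, use line) for each read of a known name.
    gaps:  (name, use line) for each read of a name with no earlier assignment.
    """
    tree = ast.parse(source)
    last_definition = {}
    edges, gaps = [], []
    for stmt in tree.body:
        names = [n for n in ast.walk(stmt) if isinstance(n, ast.Name)]
        # Reads in a statement are evaluated before its targets are bound.
        for node in names:
            if isinstance(node.ctx, ast.Load):
                if node.id in last_definition:
                    edges.append((node.id, last_definition[node.id], node.lineno))
                else:
                    gaps.append((node.id, node.lineno))
        for node in names:
            if isinstance(node.ctx, ast.Store):
                last_definition[node.id] = node.lineno
    return edges, gaps


edges, gaps = build_def_use_edges("a = 1\nb = a + 1\nc = a + b + d\n")
print(sorted(edges))
print(gaps)
```

In this toy version, builtins and imported names would also be reported as gaps; a fuller pass would seed the definition table with them.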

Step 2 · In flight

DAG-based orchestrator (LLMCompiler pattern)

Transition pipeline synthesis to a Directed Acyclic Graph manager. Parallel task dispatching for independent nodes; map-reduce join layer for aggregating sub-component outputs before piping downstream.
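
The dispatch pattern can be sketched with the standard library's `graphlib.TopologicalSorter` and a thread pool. Task names and the `run_dag` helper are hypothetical, and this simplified version runs ready nodes in waves rather than starting each node the moment its own inputs finish.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9+


def run_dag(tasks: dict, deps: dict) -> dict:
    """Run tasks in dependency order, dispatching independent nodes in parallel.

    tasks: name -> callable taking a dict of its prerequisites' results
    deps:  name -> set of prerequisite names
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    results = {}
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            ready = sorter.get_ready()
            futures = {
                name: pool.submit(tasks[name], {d: results[d] for d in deps.get(name, ())})
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results


tasks = {
    "parse_left": lambda inputs: 2,
    "parse_right": lambda inputs: 3,
    "join_outputs": lambda inputs: inputs["parse_left"] + inputs["parse_right"],
}
deps = {"join_outputs": {"parse_left", "parse_right"}}

print(run_dag(tasks, deps)["join_outputs"])
```

Here the two parse tasks have no dependencies and run together, and the join task plays the role of the reduce step that aggregates their outputs.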

Step 3 · Active

Strict sandbox execution gating (zero-trust)

Decouple compiler verification from the host machine. Containerized imports in ephemeral Docker sandboxes; strict RAM and CPU quotas per verify run. Already live for Gate 3 via subprocess timeout.

Step 4 · Active

Context engineering & codemapping

Active in-memory dependency graph of all 27,000+ files for instant circular-import resolution. Change-propagation auditing proactively flags downstream modules requiring re-verification when a lower-tier primitive changes.
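
Both ideas reduce to graph queries over an import map. The sketch below assumes a plain dict from module name to the set of modules it imports; the module names and helper functions are hypothetical.

```python
from collections import deque
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def find_import_cycle(imports: dict):
    """Return one import cycle as a list of module names, or None if there is none."""
    try:
        TopologicalSorter(imports).prepare()
    except CycleError as err:
        return err.args[1]  # graphlib reports the detected cycle here
    return None


def modules_to_reverify(imports: dict, changed: str) -> set:
    """Return every module that directly or transitively imports the changed module."""
    dependents = {}
    for module, imported in imports.items():
        for dep in imported:
            dependents.setdefault(dep, set()).add(module)
    affected = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


# Hypothetical import map: module -> modules it imports.
imports = {
    "t0_primitives": set(),
    "t1_combinators": {"t0_primitives"},
    "t2_pipelines": {"t1_combinators"},
    "reporting": {"t2_pipelines"},
}

print(find_import_cycle(imports))
print(sorted(modules_to_reverify(imports, "t0_primitives")))
```

A change to the lowest-tier module flags everything above it, while a change to `reporting` flags nothing.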


Pair this page with the Trust Center for the live engine receipt bus, the Compliance Center for the regulatory mapping (EU AI Act / NIST AI RMF / EO 14028), and the Proof page for the Lean theorem ledger. Verify it yourself with the omega-verification-kit.