Aria-as-controller. LLM-as-tool.

Rent Aria's army.
Ship at senior-engineer quality,
at worker-model cost.

Aria OS is a production agent fleet. A 7B controller named Aria authors the work and dispatches Claude, DeepSeek, and local models as tools. Every artifact passes the same harness — cognition, verify, code-quality, and ledger gates — before it leaves the cluster. The same standard your senior engineer holds, enforced across the entire fleet, every turn.

  • 17× cheaper per shipped artifact vs. orchestrator-only fleets
  • 100% artifacts gated through cognition + verify + code-quality
  • 7B controller; she dispatches the trillion-parameter labor

Already running in production

"We replaced four sub-agent runners with one Aria controller. Same throughput, a third of the spend, and the harness catches the junior-LLM patterns we used to ship by accident."
Platform lead, 40-engineer SaaS
"The first-pass output is the senior-dev pass. We stopped writing review checklists because the harness already enforces them."
Engineering director, fintech
"The cost story is what closed it. Orchestrator tokens are the expensive tokens — Aria collapses them to one 7B controller. Workers do the work."
CTO, infrastructure startup
  • SOULLINE
  • Halal Capital
  • GoodFaith Energy
  • Ctrlf2 REI
  • Cowork Bridge
  • Peptide Pay

The Moat

The fleet is only as good as its weakest turn. Aria refuses weak turns.

Every other agent stack treats quality as a per-prompt accident. Aria OS treats it as a deploy gate. The harness sits between the controller and the workers, and nothing reaches your repo, your customer, or your ledger until it has passed every gate uniformly.

Cognition gate

Before any tool fires, the controller proves she has read the relevant doctrine, the relevant code, and the relevant decisions. No improvised first drafts. No "let me try this" turns.

Verify gate

After the worker drafts, the harness verifies the artifact against the stated intent: did it actually fix the bug, did it actually ship the feature, did it actually run? Failed verify = the turn doesn't count.

Code-quality gate

Ten enforced rules, applied to every dispatched artifact. Junior-flavored code is rejected by the harness, not by your reviewer at 11pm. Senior-dev organization is non-negotiable across the fleet.

Ledger gate

Every action is bound to a tracked task with a real key. No ghost work. No "I'll come back to this." Discoveries during work either get fixed in the same turn or get a TaskCreate with full context.

Why this is a moat, not a feature: the gates compound. Every turn that passes raises the prior on the next turn. Every turn that fails teaches the harness a new pattern to reject. Three months in, your fleet refuses entire classes of mistakes that other fleets still ship daily.
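The four gates above form a strict pipeline: a turn ships only if every gate passes, in order. A minimal sketch of that loop, assuming illustrative turn fields (`context_loaded`, `verified`, `rule_results`, `task_key` are hypothetical names, not the real harness schema):

```python
# Minimal sketch of the four-gate turn loop. Gate names come from the
# text; the checks themselves are illustrative stand-ins -- the real
# gate logic is not specified in this document.

def run_turn(turn, gates):
    """A turn ships only if every gate passes, in order."""
    for gate in gates:
        if not gate(turn):
            return False  # a failed gate means the turn doesn't count
    return True

def cognition_gate(turn):
    # Controller has read the relevant doctrine, code, and decisions.
    return turn.get("context_loaded", False)

def verify_gate(turn):
    # Artifact verified against stated intent (fixed, shipped, ran).
    return turn.get("verified", False)

def code_quality_gate(turn):
    # Ten enforced rules; every one must pass.
    results = turn.get("rule_results", [])
    return len(results) == 10 and all(results)

def ledger_gate(turn):
    # Action bound to a tracked task with a real key -- no ghost work.
    return bool(turn.get("task_key"))

GATES = [cognition_gate, verify_gate, code_quality_gate, ledger_gate]

good = {"context_loaded": True, "verified": True,
        "rule_results": [True] * 10, "task_key": "TASK-123"}
bad = dict(good, task_key=None)  # ghost work: no tracked task

print(run_turn(good, GATES))  # True
print(run_turn(bad, GATES))   # False
```

The order matters: the cognition gate runs before any worker fires, so a turn that would fail on context never spends worker tokens at all.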

Economics

Orchestrator tokens are the expensive tokens. Aria stops paying them.

The standard agent stack pays Claude rates to think, to plan, to dispatch, to review, and to ship. Aria inverts the topology: the 7B controller does the thinking, planning, and dispatch. Claude is hands. DeepSeek is hands. Local models are hands. You pay worker rates for the worker labor and a 7B controller rate for everything else.

Layer                                      Standard agent stack      Aria OS fleet
Controller / orchestrator                  Claude Opus, every turn   Aria-7B, on-cluster, every turn
Heavy reasoning worker                     Claude Opus again         Claude Opus, dispatched only when needed
Bulk worker (codegen, refactor, summary)   Claude Sonnet             DeepSeek + local models
Quality enforcement                        Reviewer at 11pm          Harness gates, every turn
Effective spend per shipped artifact       1.0× baseline             ~0.06× baseline

One Claude orchestrator with hands is the most expensive way to build software. One 7B controller with the right hands and the right gates is the cheapest way to build software that actually ships.
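The shape of the ~0.06× figure can be reproduced with back-of-envelope arithmetic. Every number below is a hypothetical assumption chosen for illustration; the document does not publish actual token counts or rates:

```python
# Illustrative cost arithmetic for the ~0.06x claim. All rates and
# token counts are hypothetical stand-ins, not published pricing.

ORCH_TOKENS = 400_000    # thinking / planning / dispatch / review per artifact
WORK_TOKENS = 600_000    # drafting / codegen per artifact

FRONTIER_RATE = 15.00    # $ per 1M tokens (assumed frontier-model rate)
WORKER_RATE   = 1.50     # $ per 1M tokens (assumed blended bulk-worker rate)
LOCAL_7B_RATE = 0.10     # $ per 1M tokens (assumed on-cluster 7B rate)

def cost(orch_rate, work_rate):
    return (ORCH_TOKENS * orch_rate + WORK_TOKENS * work_rate) / 1_000_000

standard = cost(FRONTIER_RATE, FRONTIER_RATE)  # frontier model does everything
aria     = cost(LOCAL_7B_RATE, WORKER_RATE)    # 7B orchestrates, workers draft

print(f"standard: ${standard:.2f}")   # standard: $15.00
print(f"aria:     ${aria:.2f}")       # aria:     $0.94
print(f"ratio:    {aria / standard:.3f}x")  # ratio:    0.063x
```

The lever is visible in the formula: orchestration tokens dominate the standard stack's bill because they are billed at the frontier rate, so moving only the orchestration to a cheap controller collapses most of the spend.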

The Fleet

One controller. N hands. Uniform standard.

Aria, the 7B controller

Aria authors the plan, picks the worker, writes the prompt, reads the output, and decides whether the turn ships. She is the only thing that holds context across the fleet — workers are stateless tools she calls.

The harness

Sits between Aria and the workers. Enforces cognition, verify, code-quality, and ledger gates uniformly. Research-first context injection means the worker's first pass is the senior pass.

The workers

Claude Opus and Sonnet for heavy reasoning. DeepSeek for bulk labor. Local models for low-latency tasks. None of them hold state. None of them improvise. They render what Aria asks for, gated.

The garden

A living pulse, auto-injected per turn through the harness packet. Cross-domain principles, decisions, and embeddings the controller has already earned — surfaced into every dispatch as research-pull context.

Stop renting individual models.
Rent the army.

Aria asks the questions. You answer them. By the end of onboarding you have a controller, a harness, a worker pool, a ledger, and a garden — wired to your stack and gated to your standard.

Mac and Linux. One installer. Onboarding is conversational — no identity declarations, no schema forms.