Every LLM orchestration framework I tried had the same silent assumption baked in: agents call LLMs at runtime. Forever.

You build a pipeline. Users hit it. Your token bill compounds with every request. You optimize — caching here, shorter prompts there — but you're fighting gravity. The architecture itself assumes infinite AI budget.

After hitting this wall on three different projects, I stopped trying to make the existing frameworks cheaper and started asking a different question:

What if the agents' job wasn't to perform tasks at runtime — but to improve the code that performs tasks, once, at build time?

That's the insight behind GraphBus. This post walks through the architecture, why it works, and exactly how it differs from LangGraph, CrewAI, and AutoGen.

The core idea: build once, run forever

GraphBus has two strictly separated execution modes:

Build Mode — your Python classes become LLM-powered agents. Each one reads its own source code, proposes improvements, and negotiates with other agents via a typed message bus. An arbiter resolves conflicts. When consensus is reached, the agreed changes are committed back to source files. The build emits JSON artifacts describing the final agent graph.

Runtime Mode — the JSON artifacts load and your agents run on the GraphBus message bus. Call LLMs inside agent methods when your logic needs it, or skip them entirely. You have full control.

Build once  (agents active, code mutable, LLMs negotiating)
     ↓
  .graphbus/  (graph.json, agents.json, topics.json)
     ↓
Deploy and run  (agents on the bus, LLMs on your terms)

The key inversion: instead of the LLM being a runtime dependency, it's a build-time tool. The intelligence is spent once improving the code, then the improved code runs cheaply at scale.

What it looks like in code

A GraphBus agent is a Python class:

from graphbus_core import GraphBusNode, schema_method, subscribe

class OrderProcessor(GraphBusNode):
    SYSTEM_PROMPT = """
    I process e-commerce orders. I validate inputs, check inventory,
    and emit order events. During build cycles, I negotiate with other
    agents to ensure our shared schemas are consistent and well-typed.
    """

    @schema_method(
        input_schema={"order_id": str, "items": list},
        output_schema={"status": str, "total": float, "event_id": str}
    )
    def process_order(self, order_id: str, items: list) -> dict:
        total = sum(item["price"] * item["qty"] for item in items)
        return {
            "status": "confirmed",
            "total": total,
            "event_id": f"ord_{order_id}"
        }

    @subscribe("/Inventory/Updated")
    def on_inventory_update(self, event):
        self.log(f"Inventory changed: {event.payload}")

Three things to notice: (1) the SYSTEM_PROMPT describes the agent's role and negotiation strategy — this is only used during build; (2) @schema_method declares a typed contract between agents; (3) @subscribe registers a pub/sub handler on the message bus.
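Because the decorated methods are ordinary Python, you can unit-test the business logic with no bus, no build, and no LLM. Here's a minimal sketch — the stub base class below is mine, standing in for `graphbus_core` purely for illustration:

```python
# Illustration only: a stub standing in for graphbus_core, to show that
# agent methods are plain Python and testable outside the framework.
class GraphBusNode:              # stub, NOT the real base class
    def log(self, msg: str) -> None:
        print(msg)

class OrderProcessor(GraphBusNode):
    def process_order(self, order_id: str, items: list) -> dict:
        total = sum(item["price"] * item["qty"] for item in items)
        return {
            "status": "confirmed",
            "total": total,
            "event_id": f"ord_{order_id}",
        }

proc = OrderProcessor()
result = proc.process_order("1042", [{"price": 9.5, "qty": 2},
                                     {"price": 4.0, "qty": 1}])
print(result)
# total = 9.5*2 + 4.0*1 = 23.0
```

Nothing about the class requires a running bus — the decorators add metadata for the build, they don't wrap the call path.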

Now build it:

pip install graphbus
graphbus init my-shop && cd my-shop

# Static build — no LLM, just graph analysis (~100ms)
graphbus build agents/

# LLM build — agents negotiate and improve the code
export ANTHROPIC_API_KEY=sk-ant-...
graphbus build agents/ --enable-agents

During an LLM build, you see the negotiation live:

[AGENT] OrderProcessor: "I propose adding input validation for negative prices"
         rationale: Prevents downstream NaN propagation in total calculation
         affects: InventoryService, NotificationService

[AGENT] InventoryService:    "Accepted — reduces error handling in my handler"
[AGENT] NotificationService: "Accepted — no schema impact on my side"

[ARBITER] Consensus (2/2). Committing.
  ✓ Modified: agents/order_processor.py

✓ Build complete — 1 file improved by agents
✓ Artifacts written to: .graphbus/

Then runtime:

graphbus run .graphbus/
# [RUNTIME] Loaded 3 agents, 4 topics, 3 subscriptions
# [RUNTIME] Ready. Zero LLM calls will be made during execution.
The improved validation code is now compiled into the static artifacts. Every future call to process_order runs the validated, agent-improved version — no LLM involvement.


The architecture in depth

Layer 1: The Graph

Every agent class becomes a node in a networkx directed acyclic graph. Dependencies between agents (@depends_on) become edges. The build uses topological sort to determine which agents should be evaluated first — ensuring that an agent proposing changes to a shared schema is evaluated before the agents that depend on it.
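The ordering idea is easy to see with the standard library. The real build uses a networkx DAG, but stdlib `graphlib` computes the same topological order; the agent names and dependency edges below are hypothetical:

```python
# Sketch of the evaluation-order computation, using stdlib graphlib
# instead of networkx. The @depends_on edges here are made up.
from graphlib import TopologicalSorter

# Maps each agent to the agents it depends on.
deps = {
    "OrderProcessor":      {"SchemaRegistry"},
    "InventoryService":    {"SchemaRegistry"},
    "NotificationService": {"OrderProcessor", "InventoryService"},
}

order = list(TopologicalSorter(deps).static_order())
print(order)
# SchemaRegistry (the shared-schema owner) is evaluated first,
# NotificationService (furthest downstream) last.
```

An agent that owns a shared schema is therefore always evaluated before any agent whose proposals could be invalidated by a schema change.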

This graph structure is what makes the negotiation coherent. Agents don't just propose random changes; they reason about their position in the dependency graph and how their proposals affect downstream agents.

Layer 2: The Message Bus

The bus is a typed pub/sub system. Topics are path-like strings: /Order/Created, /Inventory/Updated, /System/Ready. Any agent can publish; any agent can subscribe. The bus handles routing; your business logic handles the rest.

Critically, the bus serves both modes. In Build Mode, it carries proposals, evaluations, and arbiter decisions. In Runtime Mode, the same bus carries your application events — but no agent intelligence is involved. It's just fast in-process pub/sub.
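To make "just fast in-process pub/sub" concrete, here is a toy bus that assumes nothing about GraphBus internals beyond path-like topics, publish, and subscribe — every name in it is mine:

```python
# A minimal in-process pub/sub sketch. This is NOT the GraphBus bus,
# just the shape of the contract: topics route payloads to handlers.
from collections import defaultdict
from typing import Any, Callable

class TinyBus:
    def __init__(self) -> None:
        self._subs: dict[str, list[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[Any], None]) -> None:
        self._subs[topic].append(handler)

    def publish(self, topic: str, payload: Any) -> None:
        # Synchronous fan-out to every handler on the topic.
        for handler in self._subs[topic]:
            handler(payload)

bus = TinyBus()
seen: list = []
bus.subscribe("/Inventory/Updated", seen.append)
bus.publish("/Inventory/Updated", {"sku": "A1", "qty": 3})
print(seen)
```

The point is that routing is dumb and cheap; whether a handler then calls an LLM is entirely up to your method body.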

Layer 3: The Negotiation Protocol

The negotiation cycle has five steps (for a complete technical deep-dive, see The Negotiation Protocol: How GraphBus Agents Reach Consensus):

  1. Activate — one LLM instance per agent class, initialized with its source code and system prompt
  2. Propose — each agent reads its source and emits structured proposals (diff + rationale + affected agents)
  3. Evaluate — agents in the affected list evaluate each proposal (accept/reject + reasoning)
  4. Arbitrate — if a vote is split, the designated arbiter agent makes the final call
  5. Commit — accepted proposals are applied to the source files before artifacts are emitted

Proposals are strongly typed:

from dataclasses import dataclass

@dataclass
class Proposal:
    agent_id: str        # who's proposing
    target_file: str     # which file to change
    diff: str            # unified diff
    rationale: str       # LLM's stated reasoning
    affects: list[str]   # other agents whose schemas are impacted

@dataclass
class Evaluation:
    agent_id: str
    proposal_id: str
    decision: str        # "accept" | "reject"
    reasoning: str
    counter_proposal: str | None = None  # alternative diff if rejecting

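The voting rule the protocol implies can be sketched in a few lines. The function name and return strings below are illustrative, not GraphBus API:

```python
# Sketch of the consensus rule behind steps 3-4: unanimous accept
# commits, unanimous reject drops, anything split goes to the arbiter.
def resolve(decisions: list[str]) -> str:
    accepts = decisions.count("accept")
    rejects = decisions.count("reject")
    if rejects == 0:
        return "commit"
    if accepts == 0:
        return "drop"
    return "arbitrate"   # split vote: the arbiter makes the final call

print(resolve(["accept", "accept"]))   # unanimous -> commit
print(resolve(["accept", "reject"]))   # split -> arbitrate
print(resolve(["reject", "reject"]))   # unanimous -> drop
```

This is what the `[ARBITER] Consensus (2/2). Committing.` line in the build log corresponds to: two accepts, zero rejects, no arbitration needed.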
The whole history is persisted and inspectable:

graphbus inspect-negotiation
# Browse: every proposal, every evaluation, every arbitration, every commit

The Arbiter pattern

Split votes are inevitable when multiple agents have conflicting interests. GraphBus handles this with a designated arbiter agent — a special class whose job is to make final decisions.

class ArbiterService(GraphBusNode):
    IS_ARBITER = True

    SYSTEM_PROMPT = """
    You are an impartial arbiter. When agents disagree on a proposal:
    - Favor changes that improve correctness without breaking existing contracts
    - Reject changes that introduce risk without clear benefit
    - Be conservative: when in doubt, reject
    - Always explain your reasoning
    """

The arbiter can also be used proactively: run graphbus negotiate agents/ --rounds 3 to execute multiple negotiation rounds without a full build, exploring more of the improvement space.
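Multi-round negotiation is just the propose → evaluate → arbitrate cycle repeated. Here's a stand-in loop showing the control flow; none of these function names are GraphBus API, and the proposals and votes are made up:

```python
# Sketch of the repeatable negotiation loop. propose/evaluate/arbitrate
# are caller-supplied stand-ins for the LLM-backed steps.
def negotiate(rounds: int, propose, evaluate, arbitrate) -> list:
    committed = []
    for _ in range(rounds):
        for proposal in propose():
            votes = evaluate(proposal)
            if all(v == "accept" for v in votes):
                committed.append(proposal)            # consensus
            elif any(v == "accept" for v in votes) \
                    and arbitrate(proposal) == "accept":
                committed.append(proposal)            # arbiter tie-break
    return committed

result = negotiate(
    rounds=1,
    propose=lambda: ["add-validation", "rename-field"],
    evaluate=lambda p: ["accept", "accept"] if p == "add-validation"
                       else ["accept", "reject"],
    arbitrate=lambda p: "reject",   # conservative: when in doubt, reject
)
print(result)   # only the unanimous proposal is committed
```

With a conservative arbiter, extra rounds expand the search without loosening the bar for what gets committed.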

Why this is different from LangGraph, CrewAI, and AutoGen

I'll be direct about the comparison (for the full side-by-side breakdown with architecture diagrams and cost numbers, read GraphBus vs LangGraph vs CrewAI vs AutoGen: An Honest Comparison (2026)):

LangGraph is excellent for stateful agent workflows at runtime. Its graph routes between LLM calls, and nodes are typically functions that invoke an LLM. That's a great model if you need runtime adaptability — but you're paying per-call forever.

CrewAI focuses on role-based crews doing tasks. Agents collaborate at runtime with natural language. Clean developer experience, but the LLM meter is always running.

AutoGen is a conversational framework where agents chat to solve problems. Very flexible, also runtime-LLM-dependent.

GraphBus's graph orchestrates where code improvement proposals flow during build. At runtime, the graph routes typed events between agent methods — and those methods can call LLMs whenever your logic requires it.

The core difference: other frameworks run agents to perform tasks. GraphBus runs agents to improve the code that performs tasks. After a build cycle, the intelligence lives in static artifacts — not perpetually consumed at runtime.

GraphBus combines the two: agents that improve the code intelligently at build time, plus a structured message bus for runtime coordination. Your runtime agent methods can still call LLMs on-demand for context-aware decisions — the bus handles routing, you handle the logic. LangGraph remains a great choice if you need conversational, loop-based agent reasoning; GraphBus is the right architecture when you want structured inter-agent communication with graph-based orchestration.

The build pipeline, step by step

When you run graphbus build agents/ --enable-agents, this happens:

  1. SCAN — discover all GraphBusNode subclasses in the target path
  2. EXTRACT — parse methods, schemas, subscriptions, system prompts per class
  3. BUILD_GRAPH — construct the networkx DAG from @depends_on edges
  4. TOPOLOGICAL_SORT — determine evaluation order
  5. ACTIVATE_AGENTS — instantiate one LLM per node (in topological order)
  6. NEGOTIATION_LOOP — propose → evaluate → arbitrate (repeatable rounds)
  7. APPLY_COMMITS — write accepted diffs to source files
  8. EMIT_ARTIFACTS — write graph.json, agents.json, topics.json to .graphbus/

Steps 5–7 are skipped in a static build (graphbus build without --enable-agents). The static build runs in ~100ms and gives you the artifact structure without any LLM spend — useful for development and CI pipelines.
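Step 8 is deliberately boring: serialize the graph to JSON. A sketch of what that step can look like — the artifact schemas here are assumed, only the three file names come from the build output above:

```python
# Sketch of EMIT_ARTIFACTS (step 8). The payload shapes are invented;
# only the file names (graph.json, agents.json, topics.json) are real.
import json
from pathlib import Path

def emit_artifacts(out_dir: str, graph: dict, agents: dict, topics: dict) -> None:
    out = Path(out_dir)
    out.mkdir(exist_ok=True)
    for name, payload in [("graph.json", graph),
                          ("agents.json", agents),
                          ("topics.json", topics)]:
        (out / name).write_text(json.dumps(payload, indent=2))

emit_artifacts(
    ".graphbus",
    graph={"nodes": ["OrderProcessor"], "edges": []},
    agents={"OrderProcessor": {"methods": ["process_order"]}},
    topics={"/Inventory/Updated": ["OrderProcessor"]},
)
print(sorted(p.name for p in Path(".graphbus").iterdir()))
```

Because the output is plain JSON, the artifacts diff cleanly in version control — you can review exactly what a build changed before deploying it.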

Practical deployment

Once you have artifacts in .graphbus/, deployment is straightforward. GraphBus ships CLI tools for the whole stack (see also: Build a Multi-Agent News Pipeline with GraphBus for a complete end-to-end example):

# Generate a Dockerfile
graphbus docker build
# → writes Dockerfile that copies .graphbus/ and starts the runtime

# Generate Kubernetes manifests
graphbus k8s generate --namespace production
# → writes deployment.yaml, service.yaml, configmap.yaml

# Generate GitHub Actions workflow
graphbus ci github
# → writes .github/workflows/graphbus.yml (build on PR, deploy on main)

# Check your schema contracts before deploying
graphbus contract agents/
# → validates every @schema_method edge in the graph

By default the runtime pod has no AI dependencies at all. It's a plain Python process loading JSON files — no Anthropic SDK, no OpenAI client, no network calls unless your agent methods opt into them. Your AI budget is spent once, during the build step in CI.

Current status and what's next

GraphBus is in alpha. The build pipeline, runtime engine, negotiation protocol, and CLI are working and tested (100% passing on build + runtime test suites). We're using it on real projects.

What's not done yet: the protocol specification — a formal, language-agnostic description of the proposal/evaluation/arbitration format that would allow non-Python implementations — is still on the roadmap.

If you're building multi-agent systems in production and you've hit the token cost wall, I'd genuinely love to hear from you. The tradeoffs I made might be wrong for your use case — feedback from people with real production constraints is how we'll make GraphBus actually useful.

The code is MIT licensed. github.com/graphbus/graphbus-core

Try GraphBus in your next project

Alpha access is open. Join the waitlist and we'll reach out when we're ready to onboard you.
