One brain-trust session, distilled to the parts that travel: the app is becoming a surface an agent paints on; the words "agent," "harness," and "loop" finally have edges; the 2026 toolchain is a pipeline, not a chat box; and the model race is bending away from raw coding toward intelligence and cost.
A through-line ran under the whole conversation: stop shipping apps, start shipping agents — and the craft is in the loop, the harness, and the surfaces, not the chat window.
The cleanest reframe of the night: the agent lives behind the app, and the app is just a surface it projects onto — web, mobile, WhatsApp, chat, a browser extension. The agent paints the surface and refreshes it on a schedule or on demand. It's the early-app-store moment again, but for agentic software.
Because the agent owns the output, it can adapt to who's reading: dense signal for an expert, an auto-added explainer layer for a novice. The open-source fire-tracking agent is the worked example — it polls its sources every two hours (or on a manual refresh) and renders a live widget: mile-radius rings, fire locations, wind direction and strength. One agent, one canvas it keeps repainting.
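A minimal sketch of that repaint cycle, under stated assumptions: the two-hour cadence and the expert/novice split come from the description above, but the `Fire` fields, the function names, and the text output are hypothetical and are not taken from the actual fire-tracking project.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

REFRESH_INTERVAL = timedelta(hours=2)  # polling cadence taken from the text


@dataclass
class Fire:
    # Hypothetical fields for illustration, not the real project's schema.
    name: str
    distance_miles: float
    wind_dir: str
    wind_mph: float


def needs_refresh(last_poll: Optional[datetime], now: datetime, manual: bool = False) -> bool:
    """Repaint on a manual request, on first run, or once the interval has elapsed."""
    return manual or last_poll is None or now - last_poll >= REFRESH_INTERVAL


def render_surface(fires: list[Fire], audience: str) -> str:
    """Paint one text surface; a novice audience gets an added explainer line."""
    lines = [
        f"{f.name}: {f.distance_miles:.1f} mi, wind {f.wind_dir} {f.wind_mph:.0f} mph"
        for f in sorted(fires, key=lambda f: f.distance_miles)
    ]
    if audience == "novice":
        lines.append("Reading this: the nearest fire is listed first, with wind direction and speed.")
    return "\n".join(lines)
```

The same `render_surface` call could feed a web widget, a chat message, or a browser extension; only the painting step changes, which is the point of treating the app as a surface.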
The words get used interchangeably; they shouldn't. The stack, bottom to top:
Claude Code, Codex, OpenClaw, and PI are harnesses. OpenClaw on a Mac Mini is an agent; OpenClaw alone is just a harness. Black-box harnesses hide their system prompts, which is both a source of power and a sharp edge.
Worth separating: a cadence loop runs continuously against a metric, improving by increments — Kaizen for software (a check-in agent every few hours). A build loop aims each pass at ~80% of the target and iterates to done without micromanaging. Different tools for different jobs.
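The two loop shapes can be sketched side by side. This is an illustration under assumptions, not anyone's actual harness: `measure`, `propose`, and `eighty_percent_pass` are hypothetical stand-ins, the cadence loop is bounded by `check_ins` only so the example terminates, and "~80% of the target" is read here as "each pass closes about 80% of the remaining gap".

```python
from typing import Callable, Tuple


def cadence_loop(measure: Callable[[float], float],
                 propose: Callable[[float], float],
                 state: float,
                 check_ins: int) -> Tuple[float, float]:
    """Kaizen-style loop: at each check-in, keep a change only if the metric improves."""
    best = measure(state)
    for _ in range(check_ins):
        candidate = propose(state)
        score = measure(candidate)
        if score > best:
            state, best = candidate, score
    return state, best


def eighty_percent_pass(progress: float, target: float) -> float:
    """One build pass, assumed to close ~80% of the remaining gap."""
    return progress + 0.8 * (target - progress)


def build_loop(target: float, tolerance: float = 0.01, max_passes: int = 20) -> Tuple[float, int]:
    """Iterate coarse passes until the result is within tolerance of the target."""
    progress, passes = 0.0, 0
    while target - progress > tolerance and passes < max_passes:
        progress = eighty_percent_pass(progress, target)
        passes += 1
    return progress, passes
```

Under this reading, a build loop converges in a handful of passes, while a cadence loop has no "done" state and simply keeps whatever last improved the metric.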
Two more ideas with legs. Agent "telepathy": agents share state by reading each other's HTML wiki / filesystem directly — no second model call to ask the other agent. Faster, cheaper. And auto-dream: a nightly pass that reviews the day's session logs and indexes them into a memory tree — agents that "dream" into HTML wikis navigable by both machines and humans. (This very page is one of those.)
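Both ideas reduce to plain file operations, which a short sketch makes concrete. Everything specific here is an assumption for illustration: the `state.json` file per agent, the `*.jsonl` session-log format with `topic` and `note` keys, and the `memory.html` output name are hypothetical, not the format used by the page described above.

```python
import html
import json
from pathlib import Path


def publish_state(wiki_dir: Path, agent: str, state: dict) -> Path:
    """An agent writes its current state where peers can read it."""
    path = wiki_dir / agent / "state.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(state), encoding="utf-8")
    return path


def read_peer_state(wiki_dir: Path, agent: str) -> dict:
    """'Telepathy': read the peer's published state from disk, with no model call."""
    return json.loads((wiki_dir / agent / "state.json").read_text(encoding="utf-8"))


def dream(log_dir: Path, wiki_dir: Path) -> dict:
    """Nightly pass: index session-log entries by topic and write an HTML page."""
    tree: dict[str, list[str]] = {}
    for log in sorted(log_dir.glob("*.jsonl")):
        for line in log.read_text(encoding="utf-8").splitlines():
            if line.strip():
                entry = json.loads(line)
                tree.setdefault(entry["topic"], []).append(entry["note"])
    sections = []
    for topic in sorted(tree):
        notes = "".join(f"<li>{html.escape(n)}</li>" for n in tree[topic])
        sections.append(f"<h2>{html.escape(topic)}</h2><ul>{notes}</ul>")
    wiki_dir.mkdir(parents=True, exist_ok=True)
    (wiki_dir / "memory.html").write_text("\n".join(sections), encoding="utf-8")
    return tree
```

The returned `tree` is the machine-readable side and `memory.html` is the human-readable side of the same memory, which is what makes the wiki navigable by both.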
The harness stopped being a chat box and became a pipeline. Claude Code acts as the conductor, running Opus 4 (strategy, design, orchestration) and GPT-5.5 (terse, strict coding) in parallel. Around it:
- An abstraction over git worktrees: no manual git, ever.
- Design → PR → deploy, end to end. Shows output as HTML to you-as-PM, not raw code; opens and merges PRs itself.
- A design-system generator that breaks the AI-slop look and pushes toward Apple-tier output.
- The "senior engineer with a ponytail": picks the right libraries, refuses to over-build, cuts token use.
- At session end the agent reviews its own process and PRs a workflow improvement back for approval.
- Shows the current pipeline stage; a hand icon signals exactly when you're needed.
The read on the field, and the bet underneath it:
| Model | Shape | Use |
|---|---|---|
| GPT-5.5 | strong, terse coder | pure coding tasks |
| Opus 4 | conversational, strategic | design, orchestration |
| GPT-5.6 (upcoming) | three sizes: Soul / Tara / Luna; ~⅓ the tokens | Tara ≈ Opus 4 at ~half the price |
| GLM 5.2 | ≈ Opus 4 on benchmarks, 6–10× cheaper | not yet a daily coding driver |
| Fusion (untested) | reportedly Fable-level, on OpenRouter | tbd |
The substrate underneath all of it: the whole stack runs on Cloudflare, described in the session as instantly scalable with the best pricing, plus $10k in startup credits available through their program. Flu, the open-source agent framework, is built on Cloudflare's abstractions (agents as Durable Objects, close to the user). The Astro team's new framework maps cleanly onto the same edge (a recent switch).
One project carried the thesis into product. Home Zero repositioned away from "the next Notion for real estate" toward AI-native home search, aimed at "Claude Dads (and moms)" — technical users already living in Claude Code, who'll operate it from their own session and bring their own API keys. Three agents do the work:
The advisor queries the market and home agents on the buyer's behalf, then paints the buyer's surfaces. Mission: a public-benefit company driving housing costs toward zero.
Nick's build, the most autonomous in the room: an AI-native firm trading real capital on Coinbase, by itself. The architecture centers on a single approved actor — "The Cleaner" — the only agent allowed to execute live trades. It catches signals 24/7, which is the entire point: it solves the human sleep-and-distraction problem. A supporting cast feeds and fences it:
- The approved systems agent, the only one that touches live money. Watches every signal, around the clock.
- Scrapes forums for trading systems and back-tests them against historical data.
- Parses each system into JSON and runs it through an analysis pipeline.
- Governs the live-money gates before The Cleaner is allowed to act.
- Allocates a daily token budget per agent; an emergency kill switch sits on the command center.
- A GUI of toggle switches, agent status, and report generation.
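The fencing described above (one approved trader, live-money gates, per-agent daily token budgets, a kill switch) can be sketched as a small guard object. This is an illustration under assumptions, not Nick's implementation: the `CommandCenter` class, its method names, and the `"researcher"` agent name in the usage example are hypothetical; only "The Cleaner" as the single approved actor comes from the description.

```python
APPROVED_TRADER = "The Cleaner"  # the only agent allowed to execute live trades


class CommandCenter:
    """Hypothetical guard: daily token budgets per agent plus an emergency kill switch."""

    def __init__(self, daily_budgets: dict[str, int]) -> None:
        self.budgets = dict(daily_budgets)
        self.spent = {name: 0 for name in daily_budgets}
        self.killed = False

    def kill(self) -> None:
        """Emergency stop: no further spending or trading is allowed."""
        self.killed = True

    def charge(self, agent: str, tokens: int) -> bool:
        """Record token use; refuse if killed or if the daily budget would be exceeded."""
        if self.killed or self.spent[agent] + tokens > self.budgets[agent]:
            return False
        self.spent[agent] += tokens
        return True

    def may_trade(self, agent: str, gates_passed: bool) -> bool:
        """Live trades need the approved actor, passed gates, and no kill switch."""
        return not self.killed and agent == APPROVED_TRADER and gates_passed
```

A usage sketch: `CommandCenter({"The Cleaner": 1000, "researcher": 500})` lets the researcher spend up to 500 tokens a day, never lets it trade, and stops everything once `kill()` is called. Resetting `spent` at the start of each day is left out for brevity.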
It's designed to self-improve: trade three times a day live, drop into research mode overnight, review its own trades, and update its approach. The current human workflow is itself a two-model pipeline — ChatGPT Pro as project manager generating every Codex prompt (Nick never writes one directly), a fresh thread each day to fight context drift. Next phase: once the v1 package is complete, drop the entire codebase into GPT-5.6 for a rapid v2 refactor.
A quieter, human-in-the-loop cousin: an observational agent that watches price and the order book in real time, plus historical analysis. A working methodology under test is the "double-zero theory" — price held at an even dollar (e.g. $59,820.00) reads as a market-maker signaling continuation of downward momentum; the agent's job is to help develop and validate methods like it.
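One way an observational agent might flag candidate "double-zero" holds for a human to review is sketched below. The definition used here is an assumption: a hold is a run of at least `min_ticks` consecutive ticks at the same whole-dollar price. The function names and the threshold are hypothetical, and the sketch only detects the pattern; it says nothing about whether the theory itself holds up.

```python
from decimal import Decimal
from typing import List, Sequence, Tuple


def is_even_dollar(price: Decimal) -> bool:
    """True when the price has zero cents, e.g. 59820.00."""
    return price == price.to_integral_value()


def double_zero_holds(prices: Sequence[Decimal], min_ticks: int = 3) -> List[Tuple[int, int]]:
    """Return (start_index, length) for each run held at one even-dollar level."""
    ticks = list(prices)
    holds: List[Tuple[int, int]] = []
    start = None  # index where the current even-dollar run began, if any
    for i, price in enumerate(ticks + [None]):  # trailing None flushes the last run
        on_level = price is not None and is_even_dollar(price)
        if start is not None and on_level and price == ticks[start]:
            continue  # still holding at the same level
        if start is not None and i - start >= min_ticks:
            holds.append((start, i - start))
        start = i if on_level else None
    return holds
```

Prices are kept as `Decimal` rather than `float` so that the zero-cents check is exact.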
The real moat is data: order-book history that brokers don't collect or surface — no broker with hundreds of thousands of accounts provides it, an intentional omission. A secondary, much-later educational layer targets users who already have financial literacy (candles, averages, basic instruments); the recommended onboarding is babypips.com — preschool through middle school only — deliberately avoiding the indicator-stacking that doesn't hold up over time. Philosophy throughout: augment the trader's judgment, don't replace it. Infra note: currently on AWS, where it hit billing overages (thresholds and alerts now in place) — Pete's suggestion was to migrate to Cloudflare to kill the surprise costs.
The same person, a very different arena: Pete's civic work in the Town of Paonia — a short-term-rental ordinance, a string of open-records disputes, and a mayoral recall. It opens on a number nobody agrees on. Former administrator Stefan Wynn's public claim of 72 active STRs anchors the ordinance — and three independent counts contradict it:
The load-bearing escalation. In March 2026, Wynn — then administrator — filed police reports characterizing residents' protected speech as threats, naming Pete (passing out flyers) and Kaja (a community Facebook post).
The records, obtained under CORA, show Wynn forwarded the emails to his attorney and then to police; the released PDFs carried improperly applied redactions (black boxes that lifted off to reveal the text underneath). The attorney correspondence suggests he sought a cease-and-desist against the flyers and named Kaja an "accomplice" — though she distributed none. Clerk Samira added two back-dated personnel incidents to Kaja's record the same day Kaja went to the mayor. Pete's framing is structural: a board-oversight failure, not just a Wynn problem — and his ask is narrow: one special meeting, on the record, with Kaja and Zane cleared by name.
With the administrator seat vacant, Pete argues it's the moment to restructure: split the administrator role (only 25% town-funded; 75% rides on utilities) and separate the treasurer; replace the ~$1,000/meeting town attorney with a trained $40/hr staffer; CORA-request and AI-diff the $10k Indiana HR-manual contract (bid at $7,500 to duck the discretionary cap) against the free one he already published; drop parking minimums as Mancos did. Running underneath: a decade-long cycle of administrator trouble, including a prior $400k embezzlement.