AI Should Build Your Processes. Not Be Them

Your AI bill isn't a usage problem. It's an architecture problem. And the fix is the oldest idea in software.

A pattern I keep seeing: AI bills creeping toward $400 a month on workflows that could cost $20.

The instinct makes sense. AI feels magical, so we reach for it everywhere.

Multi-agent swarms. Seventeen-MCP orchestrations. An LLM that calls another LLM that calls a third one just to hang out.

The result: systems that are expensive to run and hard to trust.

But there's a simpler move: just write better software.

Pipes vs brains

Write a giant prompt and plop it into a model, and you can get different answers on every run: same input, same model, same temperature, two different outputs.

The real skill in 2026 isn't prompting better. It's knowing what not to prompt.

The fix is deceptively simple in concept, and surprisingly hard in practice. Stop collapsing everything into one giant LLM call. Split the work into two layers.

  • Pipes are plumbing (use code). Structured, predictable, repeatable. Data in, data out, same shape every time. APIs, queries, scripts, scheduled jobs. Boring stuff that runs forever and costs basically nothing.

  • Brains are judgment (use a model). Synthesis, drafting, pattern spotting, tradeoff calls. This is where an LLM actually earns its keep. And when you're here, pick the right-sized model for the job. Haiku for cheap, Opus for heavy lifting, Sonnet for the middle.
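The split above can be sketched in a few lines of Python. This is a minimal illustration, not any particular stack: the function names and the sales-report example are mine, and the brain call is stubbed so the sketch runs offline.

```python
import csv
import io

# PIPE: deterministic plumbing. Same input, same output, every time.
# Runs forever and costs basically nothing -- no model call needed.
def summarize_sales(csv_text: str) -> dict:
    """Parse a sales CSV and compute totals. Pure code, no LLM."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    total = sum(float(r["amount"]) for r in rows)
    by_region: dict[str, float] = {}
    for r in rows:
        by_region[r["region"]] = by_region.get(r["region"], 0.0) + float(r["amount"])
    return {"total": total, "by_region": by_region, "row_count": len(rows)}

# BRAIN: one narrow judgment call, fed the pipe's structured output.
def draft_commentary(summary: dict) -> str:
    prompt = (
        "You are a sales analyst. Given these totals, write two sentences "
        f"on what stands out:\n{summary}"
    )
    # return llm_client.complete(model="small-cheap-model", prompt=prompt)
    return prompt  # stubbed: swap in your actual SDK call here

data = "region,amount\nwest,100\neast,250\nwest,50\n"
summary = summarize_sales(data)
print(summary["total"])      # 400.0
print(summary["by_region"])  # {'west': 150.0, 'east': 250.0}
```

The pipe is testable and repeatable; the brain gets one clean, structured input and one narrow question.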

Most "AI workflows" are 80% pipe work pretending to be brain work.

It's like asking a Michelin chef (the LLM) to also run the dishwasher, stock the pantry, and take reservations. They can. But that's not what you're paying for, and they're going to burn the entrée.

The boring rules that didn't go away

Most AI bills aren't exploding because AI is expensive. They grow because AI makes it easy to skip rules engineers had to learn the hard way.

You can prompt past most of these for a while. Then the seams start to show: the data diverges, the pipeline quietly fails, the same call triggers six times and nobody can figure out why.

Here are some software principles to use when building things with AI.

  • Single Responsibility

    • What it means: One job per step. If a thing is doing four, you can't tell which one broke.

    • Where AI breaks it: One giant prompt that fetches, filters, analyzes, and drafts. When the output is wrong, you can't tell which step failed.

  • YAGNI ("You aren't gonna need it")

    • What it means: Build for the problem you have today. Not the one you might have someday.

    • Where AI breaks it: Asking an LLM to "handle any report type" when you actually need one. You've built a fragile everything-machine instead of a working something.

  • Idempotency

    • What it means: Running it twice shouldn't break anything. Sending the same email twice should be safe.

    • Where AI breaks it: Triggering the pipeline twice and getting two invoices sent, two Slack pings, two different answers.

  • Single Source of Truth

    • What it means: Every important rule lives in exactly one place. If it lives in three, one is already wrong.

    • Where AI breaks it: "Qualified lead" defined in the CRM, in the qualifier prompt, and in the sales manager's head. They disagree. Nobody knows which is right.

  • DRY ("Don't Repeat Yourself")

    • What it means: Solve it once, reuse it. Copies drift.

    • Where AI breaks it: The same extraction logic copy-pasted into five different prompts. You fix one. The other four keep breaking.

  • Observability

    • What it means: You should find out when it breaks. Silent failures are how dashboards go empty.

    • Where AI breaks it: An LLM returning nonsense for two weeks and everyone assuming it's fine because no error got thrown.
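Idempotency in particular is a few lines of code, not a prompt instruction. A sketch, with an illustrative key scheme (in production the seen-keys set would live in a database, not memory):

```python
import hashlib

_sent: set[str] = set()  # stand-in for a durable store (e.g. a DB table)

def send_invoice_email(customer_id: str, invoice_id: str) -> bool:
    """Send at most once per (customer, invoice), no matter how many
    times the pipeline retries or double-fires."""
    key = hashlib.sha256(f"{customer_id}:{invoice_id}".encode()).hexdigest()
    if key in _sent:
        return False  # already sent; safely do nothing
    # ... actually send the email here ...
    _sent.add(key)
    return True

print(send_invoice_email("cust_42", "inv_7"))  # True: first send goes out
print(send_invoice_email("cust_42", "inv_7"))  # False: duplicate, skipped
```

The second trigger is a no-op by construction. No prompt can promise that.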

These aren't programming rules so much as reliability rules. They apply whether the thing doing the work is code, a contractor, or an LLM.

Follow the rules and AI stays useful. Skip them and you've built a fragile, expensive system that works most of the time, which is the worst kind of system to have.

How I cut my AI bill in half

I had a $200 Claude subscription for months. I was, to put it kindly, abusing it with my OpenClaw.

Every task went through the same workflow: stuff a huge prompt into a recurring process that lived in my HEARTBEAT.md file, pray it worked reliably, then spend longer troubleshooting it than doing the work myself would have taken.

Then Anthropic changed how subscriptions worked with OpenClaw (blessing in disguise), and I rebuilt my most-used workflows around the pipes-feed-the-brain pattern.

Now my key workflows run more reliably, produce better output, and cost $15/month.

Take my marathon bot. Old version: one giant prompt asking Claude to "help me plan my daily workouts based on my marathon goal time and schedule my daily workouts." Output: inconsistent, sometimes great, sometimes confusing.

New version:

  • Pipes pull structured data. 

    • Input: Query 8sleep for recovery data, Strava for workout data, meal log, local weather and calendar for availability. Normalize.

    • Output: Performance and fatigue signals, plus available windows for meals, PT, and workouts.

  • Brain does the reasoning.

    • Input: Data from pipes, coaching philosophy doc.

    • Questions it answers: “Should we adjust today’s workout based on previous performance and fatigue?”, “Do we need to introduce more strength training/PT?”, “Do we need to change diet based on workout performance?”, “When should we schedule these workouts?”

My training is now more holistic, and the bot is cheaper and more reliable. All I had to ask was: “Is this step a lookup or a judgment call?”
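The rebuilt flow looks roughly like this. The fetchers below are hypothetical stand-ins for the real 8sleep and Strava calls, and the field names are placeholders; the point is the shape, not the APIs.

```python
# PIPES: each data source gets its own small, deterministic fetcher.
def fetch_recovery() -> dict:  # would hit the 8sleep API in practice
    return {"sleep_score": 82, "hrv": 61}

def fetch_training_load() -> dict:  # would hit the Strava API in practice
    return {"last_run_km": 18, "weekly_km": 52, "fatigue": "moderate"}

def normalize(recovery: dict, load: dict) -> dict:
    """Merge all sources into one structured snapshot for the brain."""
    return {
        "sleep_score": recovery["sleep_score"],
        "hrv": recovery["hrv"],
        "weekly_km": load["weekly_km"],
        "fatigue": load["fatigue"],
    }

# BRAIN: one narrow judgment call over clean, pre-fetched data.
def build_coaching_prompt(snapshot: dict, philosophy: str = "coaching_philosophy.md") -> str:
    return (
        f"Using the plan in {philosophy} and today's data {snapshot}, "
        "should we adjust today's workout, and when should it be scheduled?"
    )

snapshot = normalize(fetch_recovery(), fetch_training_load())
prompt = build_coaching_prompt(snapshot)
```

All the fetching and normalizing is free, repeatable code; the model only sees one tidy snapshot and one question.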

The audit

Run this on one workflow this week.

  • Step 1: Pick your most-used AI workflow (or the one that keeps breaking).

  • Step 2: Write down every step the LLM is doing. All of it. Fetching data, formatting, looking things up, making judgments, drafting replies.

  • Step 3: Circle the steps where the same input would always produce the same output. Fetching a calendar event. Pulling a Stripe invoice. Parsing a CSV. Formatting a table. These are pipes pretending to be brains. You're paying reasoning prices for work that a 20-line script would do for free, forever.

  • Step 4: For what's left, right-size the model. Does this step actually need Opus, or would Haiku crush it at a fraction of the cost?
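For step 3, "pipes pretending to be brains" usually looks like this: a short script doing work that was being billed at reasoning prices. A sketch (the invoice data is made up; formatting a table needs zero judgment):

```python
def format_table(rows: list[dict]) -> str:
    """Render rows as a fixed-width text table. Same input, same output,
    forever, for free -- no model required."""
    if not rows:
        return "(empty)"
    headers = list(rows[0])
    widths = {h: max(len(h), *(len(str(r[h])) for r in rows)) for h in headers}
    head = " | ".join(h.ljust(widths[h]) for h in headers)
    sep = "-+-".join("-" * widths[h] for h in headers)
    body = "\n".join(
        " | ".join(str(r[h]).ljust(widths[h]) for h in headers) for r in rows
    )
    return "\n".join([head, sep, body])

invoices = [
    {"invoice": "inv_7", "amount": "120.00"},
    {"invoice": "inv_8", "amount": "75.50"},
]
print(format_table(invoices))
```

If a step can be written like this, it's a pipe. Circle it and take it out of the prompt.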

Run this on one workflow. Then the next. It compounds fast.

Build resilient, not complex

The flex in 2026 isn't having the most complex AI stack. It's having the one that actually works when you're not watching it.

Pipes give you predictability. Brains give you leverage. It's easy to get this backwards. I did, for months, which is how bills end up at $200+ and workflows still break.

The same pattern shows up beyond your AI stack: in teams, calendars, hiring processes, weekly closes. Asking for judgment where a system would do, running systems where judgment is needed.

Figuring out which is which is most of the work.

Interested in some hands-on help? I’m taking on a small number of consulting, advising, and executive coaching clients each quarter. Reply to this email or hit me up.