Why I Built My AI to Push Back Against Me

17 May 2026 · 5 min read · By Calum O'Gorman

[Figure: Side-by-side comparison of agreeable vs guardrail-driven AI responses to the same prompt]

Most AIs are designed to agree with you.

That's commercially correct. Agreeable models keep users engaged. The product metric goes up. The model wins the comparison.

But agreeable isn't optimal for output. After 2 years of building AI workflows — for sourcing prospects, for content, for client decisions, for my own consultancy stack — I'm convinced the single most underutilised thing in AI use is pushback. Not better prompts. Not bigger models. Pushback.

The flaw I had to design around

My biggest flaw as a builder is that I over-engineer.

It's the kind of perfectionism that looks like productivity until you measure the time spent. The first 80% of any task takes about 30% of my time. The remaining 20% takes 70%. And the last 1% of perfection is asymptotic — you can spend infinite time and never reach it. (Perfection is a concept, not a state.)
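
A quick sketch of the arithmetic behind that split, taking the 80/30 and 20/70 figures at face value. They are rough estimates rather than measurements, and the function name is made up for illustration.

```python
def cost_per_point(share_of_time: float, share_of_work: float) -> float:
    """Percent of total time spent per percentage point of work completed."""
    return share_of_time / share_of_work


first_stretch = cost_per_point(30, 80)  # the first 80% of the task
last_stretch = cost_per_point(70, 20)   # the remaining 20% of the task
ratio = last_stretch / first_stretch

print(f"first 80%: {first_stretch:.3f} time-points per work-point")
print(f"last 20%:  {last_stretch:.3f} time-points per work-point")
print(f"ratio:     {ratio:.1f}x")
```

On those estimates, each percentage point in the final stretch costs roughly nine times what a point in the first stretch costs.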

Left alone, I will rebuild a working thing because it could be slightly more elegant. I will add a third niche when the second one isn't paying yet. I will refactor a plan that's working into something more abstract. I will write five files when one would do. None of this ships anything.

Self-awareness doesn't fix this. Self-awareness never fixes anything — knowing you over-engineer doesn't stop you over-engineering any more than knowing you procrastinate makes you finish the email. What fixes it is rules I can't bend.

Why agreeable AI makes this worse, not better

Here's the trap. Most people use AI through a chatbot. One question, one answer. The interaction is conversational. Both sides try to be agreeable — because that's what conversations reward, and the model has been trained on the conversational pay-off.

Stretched across a real workflow — repeatable, multi-step, high-stakes — agreeableness becomes a failure mode.

Specifically: agreeable AI silently amplifies your weaknesses instead of mitigating them.

You ask the AI to review your draft. It says "this is great, here are a few minor suggestions to consider." The draft was not great. The AI is now in the way of you finding out.
You propose adding a third tool to the stack. It says "sounds reasonable — here's how to integrate it." It didn't ask whether tools one and two are paying for themselves yet.
You drift into a fourth refinement of a working feature. It helps you refine. It doesn't ask whether the refinement would change anyone's behaviour. It doesn't surface the work you should be doing instead.
You start re-litigating a decision you locked last week. It re-litigates with you. It doesn't tell you the decision is locked because re-litigating it costs more than living with it.

In every case the model is being helpful, in the agreeable sense. In every case it is making your week worse, in the output sense.

The fix is architectural. It's in the rules you set BEFORE you start talking to the model — not in the prompts you write once you've already started.

What I write into every CLAUDE.md

I have about thirty rules in my main consultancy CLAUDE.md. Most of them defend against my own known failure modes — scope creep, over-planning, perfectionism, decision drift. The rest are about how I want the AI to interact: opinionated, no menus, one-line runners-up, brief. The combined effect is that working with the AI feels more like working with a competent collaborator who's read my performance review than with an assistant trying to please me.

Five rules that do the most work:

Identify the single most important thing right now. Do that. Don't multi-thread. — Triggers when I propose two things at once. The AI names the higher-ROI one and tells me to close it before opening the other.

Build the bare-minimum operational version. Ship it. Refine when there's evidence the refinement matters. — Stops me from polishing a v1 into a v2 before the v1 has hit reality. Reality is the only honest source of refinement direction.

Don't add a third [niche / tool / channel / feature] when the second isn't paying yet. — The anti-scope-creep rule. Pattern-matches across any expansion proposal. Saves me from the version of myself that mistakes adding things for making progress.

Surgical edits over rewrites. Reason from data, not assumptions. — Stops me deleting working code to rewrite "more cleanly." Stops the AI doing it too. A working thing is a piece of evidence; a cleaner-looking rewrite is a hypothesis.

Push back when I drift toward perfection-chasing. Cite the rule. — The catch-all. The AI is supposed to disagree with me when I'm slipping into the failure mode I'm worst at. Citing the rule means I can't argue with the AI without arguing with the rule I wrote sober.

Each one names a specific failure mode I have. Each one has a reason in the rule itself. Each one is short. None of them defines what a "useful rule" is in the abstract — they just are the rules.
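
Those three properties (a named failure mode, a reason, brevity) can be checked mechanically. The following is a minimal sketch, not the setup described in this post: `Rule`, `lint` and `MAX_WORDS` are hypothetical names, the 25-word limit is an arbitrary assumption, and "Be more focused." is an invented counterexample.

```python
from dataclasses import dataclass

MAX_WORDS = 25  # assumed threshold for "short"; adjust to taste


@dataclass
class Rule:
    text: str          # the instruction as it would appear in the rules file
    failure_mode: str  # the habit the rule defends against
    reason: str        # why the rule exists, stated alongside it


def lint(rule: Rule) -> list[str]:
    """Return a list of problems; an empty list means the rule passes."""
    problems = []
    if not rule.failure_mode.strip():
        problems.append("names no failure mode")
    if not rule.reason.strip():
        problems.append("gives no reason")
    if len(rule.text.split()) > MAX_WORDS:
        problems.append(f"longer than {MAX_WORDS} words")
    return problems


rules = [
    Rule(
        text="Don't add a third niche, tool, channel or feature "
             "when the second isn't paying yet.",
        failure_mode="scope creep",
        reason="Adding things is easy to mistake for making progress.",
    ),
    Rule(text="Be more focused.", failure_mode="", reason=""),
]

for rule in rules:
    issues = lint(rule)
    status = "ok" if not issues else "; ".join(issues)
    print(f"{rule.text!r}: {status}")
```

The vague rule fails on two counts: it names no habit to watch for and gives the model nothing to cite when it pushes back.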

The output difference

I'd love to give you a number. "30% faster", "doubled my throughput", "shipped 4x more posts." I don't have that number. I'd be making it up.

What I can tell you is the shape of the difference. Before guardrails, my sessions ended with five threads half-finished and a vague sense of having been productive. After guardrails, my sessions end with one thing fully shipped and a clear answer about what's next.

The cost is that the AI tells me to stop doing things I want to keep doing. Annoying in the moment. Worth it across a week.

What you can do with this

If you're using AI for anything beyond casual chat — content, code, research, decision-making — write down your three biggest failure modes. Then write rules that defend against them. Put those rules into a CLAUDE.md (or equivalent for your tooling) that loads on every session.

Don't make it polite. Don't soften it for the AI. Tell it explicitly: "When I do X, push back. Cite the rule."
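
As a starting point, here is a minimal sketch that turns a list of failure modes into a rules file. The layout it prints is an assumption, not a required format, and `FAILURE_MODES` and `render_rules` are hypothetical names. The three entries reuse rules quoted earlier in this post; swap in your own.

```python
# Each entry pairs a trigger (what I catch myself doing) with the rule to cite.
FAILURE_MODES = [
    ("propose two things at once",
     "Identify the single most important thing right now. Do that. "
     "Don't multi-thread."),
    ("polish a v1 before it has shipped",
     "Build the bare-minimum operational version. Ship it. "
     "Refine when there's evidence the refinement matters."),
    ("propose a third tool while the second isn't paying yet",
     "Don't add a third tool when the second isn't paying yet."),
]


def render_rules(failure_modes: list[tuple[str, str]]) -> str:
    """Render the failure modes as a plain-text rules file."""
    lines = ["# Rules", ""]
    for number, (trigger, rule) in enumerate(failure_modes, start=1):
        lines.append(f"{number}. {rule}")
        lines.append(f"   When I {trigger}, push back. Cite this rule.")
        lines.append("")
    return "\n".join(lines)


rules_text = render_rules(FAILURE_MODES)
print(rules_text)
```

Save the printed text as CLAUDE.md (or the equivalent file your tooling loads at the start of a session) and edit the wording by hand from there.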

Then notice how much better the output gets when the model stops trying to please you.

About the author: Calum O'Gorman builds AI workflows for solo operators and small teams. Over the last 2+ years he's built 30+ private AI tools and shipped AI content driving 16 million views and 700 leads in a single vertical. Currently launching a productised Blog Automation service.