How to Turn Your SOPs Into AI Agents with Claude Code

A CEO's playbook: build the first agent yourself, then hand the library to your team.

Everything your company already knows how to do is software waiting to be switched on. A detailed SOP is an agent you have not turned on yet. But here is what nobody tells you: those SOPs are also aging faster than any documentation you have ever owned, and most were written for a world that just ended.

Short answer: Yes, your SOPs can become AI agents, because an SOP already holds what an agent needs: a trigger, ordered steps, an owner, and a defined output. But you do not convert them as written. You vet them first, line by line, with three verdicts. A click ("go to the CRM and check the field") is obsolete; the agent absorbs it. A form ("fill out the status report") is obsolete in a worse way; it gets replaced with richer capture, usually voice. A judgment stays human. The rebuilt document is a Human-Agent SOP (HASOP): the agent's lane carries the motion, the human's lane keeps the decisions. And the rebuild never finishes. The half-life of a keystroke instruction used to be years. It is now weeks. The end state is a procedure that re-vets itself, what I call sentient ops, where the HASOP dance runs every few weeks instead of once a revision cycle.

Here is how this played out in a real session, this week, with a real CEO.

The session: proving it knows the business

I was helping the CEO of a process-improvement consulting firm set up Claude Code for the first time. We connected the systems his business runs on: CRM, email, calendar, meeting transcripts, e-signature, his files. Twenty minutes of logins. The boring part everyone wants to skip, and the part that decides everything after it.

Then we ran strategic prompts, the kind designed to prove whether it actually knows the business. We had it pull a proposal he wrote years ago for a manufacturing client without telling it where the file lived. It found it, named the framing line worth keeping verbatim, and caught two problems: a numbering mismatch that had sat in a client-facing document for a decade, and a proposal that listed every cost but never once projected the client's savings. His reaction, word for word: "Who wrote this thing? It's all me."

Then I told him to push it: what am I missing right now?

It found a revenue leak. A client had said yes. The deal was blocked on one revised document because the client's entity name changed. The revision had been written internally, reviewed internally, and never reached the client. Weeks of a closed deal sitting still, over admin. He had been angry about that deal the day before without knowing exactly where it was stuck. The agent found it in minutes, with the uncomfortable part included: who had it, and for how long.

Then it proposed the fix, as a list of agents. A contract watchdog scanning every unsigned envelope daily. A deal-hygiene check flagging any deal in a buying stage with no price or no recent activity. An engagement-burn trigger that creates the extension deal before the work goes cold. A post-meeting agent that pulls commitments and dates from every transcript. Each one mapped to a leak it had just seen in his own data.

And then it said the thing this article is about. It flagged his SOP library, years of refined, detailed standard operating procedures, as the most automatable asset in the company. I told him on the call: if these are really detailed, they can become agents. He builds operating discipline for a living, and his recurring frustration with his own team had always been, in his words, "why are we not following the new SOPs?" The answer was sitting in the audit. Humans skip steps when they get busy. An agent does not get busy. It runs step four every time, at the same standard, and logs what it did.

The document his team ignores becomes the employee that never ignores it. But only after the vetting, because most of those documents are partly obsolete, and converting them as written would automate a world that no longer exists.

The vetting: your SOPs are gold, and they are rotting

An SOP is an agent specification written by someone who never heard the word agent: trigger, steps, owner, output. The thinking inside it is the asset. The instructions inside it are the problem, because most were written for human hands, and the shelf life of a human-hands instruction just collapsed.

Take two real types of steps you will find in any SOP library.

The click. "Log into the CRM. Open the deal. Check whether the amount field is filled." That instruction is not wrong; it is obsolete, the way "rewind the tape" is obsolete. No human needs to click through the CRM anymore. The agent reads it directly, every morning, across every deal at once. Every log-into, open, check, scroll, and copy-paste in your library is a step that stops existing the day the agent connects. Even the final "send the status to so-and-so" step dies, because that person's agent reads the same activity log and briefs them itself. The FYI does not get automated. It dissolves. (The step-by-step autopsy of a real five-step SOP is in the HASOP manifesto.)

The form. "After each client session, the consultant fills out the weekly status form: percent complete, blockers, next steps." This one is obsolete in a more expensive way. The form captures the skeleton and loses everything that matters: the client's CFO asked about phase-two pricing twice, the sponsor sounded tired, the plant manager mentioned a second line they are thinking about. Those are continuation signals, upsell signals, risk signals, and no form field holds them. Forms were designed for humans who could not process nuance at scale. Agents can. The replacement is not a better form. It is voice: a two-minute memo in the car after the session, in natural language, with the nuance intact. The agent mines it for commitments, buying signals, and risk, and updates the client playbook the same hour. Richer input in, real-time insight out.

So the line-by-line test for every step in every SOP is three verdicts:

  1. Is this step a click? Delete it. The agent's lane absorbs it through the connection.
  2. Is this step a form? Replace the capture. Voice and transcripts in, structured insight out.
  3. Is this step a judgment? Keep it. It goes in the human lane: approve the escalation, handle the exception, make the call when the relationship needs a human voice.

Figure. The vetting: click, form, or judgment. Motion moves to the agent, decisions stay human.
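
The three verdicts can be sketched as a first-pass sorter. This is a minimal Python illustration, not how Claude Code does the vetting: the keyword lists and the names Verdict and vet_step are hypothetical, and a real pass would have the agent read each line in context with a human reviewing the result.

```python
from enum import Enum


class Verdict(Enum):
    CLICK = "delete: the agent lane absorbs it"
    FORM = "replace: voice or transcript capture"
    JUDGMENT = "keep: human lane"


# Hypothetical keyword lists, for illustration only.
FORM_WORDS = ("fill out", "fills out", "status form", "status report")
CLICK_WORDS = ("log in", "open the", "check whether", "check the", "scroll", "copy")


def vet_step(step: str) -> Verdict:
    """Give one SOP line a first-pass verdict; a human still reviews it."""
    text = step.lower()
    # Forms are checked first, because a form step often mentions opening something too.
    if any(word in text for word in FORM_WORDS):
        return Verdict.FORM
    if any(word in text for word in CLICK_WORDS):
        return Verdict.CLICK
    # Anything unmatched stays in the human lane, the conservative default.
    return Verdict.JUDGMENT
```

Unmatched lines default to the human lane on purpose: when the sorter is unsure, the step stays a judgment until someone says otherwise.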

A converted SOP also needs three things intact: an explicit trigger ("any contract unsigned seven days," not "when things feel off"), inputs that live in systems the agent can reach, and a defined output with a named human approval owner for anything that touches a client or moves money. If the trigger or output is fuzzy, sharpen the document. If the inputs are unreachable, the fix is a connection, not a rewrite.
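
That checklist can be written down as a small readiness check. A sketch in Python, assuming a hypothetical SopCandidate record; code like this can only catch a missing trigger, output, or owner, so judging whether a trigger is fuzzy ("when things feel off") still takes a reader.

```python
from dataclasses import dataclass, field


@dataclass
class SopCandidate:
    name: str
    trigger: str = ""          # e.g. "any contract unsigned seven days"
    output: str = ""           # e.g. "one morning summary"
    approval_owner: str = ""   # named human for client-facing or money actions
    inputs: dict[str, bool] = field(default_factory=dict)  # system -> can the agent reach it?


def readiness_gaps(sop: SopCandidate) -> list[str]:
    """List what blocks conversion, paired with the fix for each gap."""
    gaps = []
    if not sop.trigger.strip():
        gaps.append("sharpen the document: trigger is missing")
    if not sop.output.strip():
        gaps.append("sharpen the document: output is missing")
    if not sop.approval_owner.strip():
        gaps.append("sharpen the document: no named approval owner")
    # Unreachable inputs are a connection problem, not a writing problem.
    for system, reachable in sop.inputs.items():
        if not reachable:
            gaps.append(f"add a connection: {system}")
    return gaps
```

An empty list means the SOP is ready for the rebuild; anything else tells you whether the next move is editing the document or connecting a system.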

The rebuild: the Human-Agent SOP (HASOP)

What survives the vetting is the standard. What gets rebuilt is the procedure. I call the rebuilt document a HASOP, a Human-Agent SOP: one procedure, two lanes. The agent's lane carries the motion, the watching, checking, chasing, drafting, and logging. The human's lane carries the decisions. The old SOP described keystrokes. The HASOP describes responsibilities. (The full argument for why the SOP era is over is its own post, "Your SOPs are useless now. It's not SOP, it's HASOP.")

The old SOP
  1. Log into the e-signature platform
  2. Check outstanding envelope statuses
  3. Note anything older than a week
  4. Email the client a reminder

The HASOP rebuild
  Agent lane: watch envelopes daily, escalate anything unsigned 7 days with the deal value and signer contact attached, draft the reminder, log every action
  Human lane: approve or adjust the reminder, decide when a phone call beats an email

Here is the full rebuilt document for the contract procedure, the watchdog the agent proposed in that session. Notice what is missing: not a single log-in, click, or screen anywhere in it. This is the document your team hands to Claude Code as a reusable skill.

Contract follow-up, HASOP v2 (replaces the old SOP)

Standard: no contract sits unsigned past 7 days without action. (Unchanged. The standard never changes, only who carries it.)

Trigger: every business day, 7:00 am.

Agent lane
  1. Pull every outstanding envelope from the e-signature platform.
  2. Cross-reference each against its deal in the CRM: value, owner, kickoff dates.
  3. Flag anything unsigned 7 or more days.
  4. Draft the reminder in the owner's voice, referencing the engagement's own context.
  5. Deliver one morning summary, and log every check, draft, and send.

Human lane
  1. Approve, adjust, or hold each reminder. One reply.
  2. Anything past 14 days or above $25,000: decide whether this one deserves a phone call instead.

Escalation: 21 or more days, or silence after two reminders, goes to the weekly leadership huddle with the full history attached.

Audit: every action is in the log. Anyone can answer "what happened with this contract" in one question.
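
The thresholds in this HASOP are explicit enough to write down as rules. Here is a minimal Python sketch, assuming a hypothetical Envelope record and reading "silence after two reminders" as two reminders sent with the contract still unsigned. The real procedure lives in the instructions you hand Claude Code, not in a script like this.

```python
from dataclasses import dataclass

FLAG_DAYS = 7            # "unsigned 7 or more days"
CALL_DAYS = 14           # "past 14 days"
CALL_VALUE = 25_000      # "above $25,000"
ESCALATE_DAYS = 21       # "21 or more days"
ESCALATE_REMINDERS = 2   # "silence after two reminders" (assumed reading)


@dataclass
class Envelope:
    name: str
    value: float
    days_unsigned: int
    reminders_sent: int = 0


def triage(env: Envelope) -> list[str]:
    """Return the actions the HASOP calls for on one unsigned envelope."""
    if env.days_unsigned < FLAG_DAYS:
        return []  # still moving: nothing to report
    actions = ["agent: draft reminder", "human: approve, adjust, or hold"]
    if env.days_unsigned > CALL_DAYS or env.value > CALL_VALUE:
        actions.append("human: decide whether to call instead")
    if env.days_unsigned >= ESCALATE_DAYS or env.reminders_sent >= ESCALATE_REMINDERS:
        actions.append("escalate: weekly leadership huddle, full history attached")
    return actions
```

A $42,500 agreement unsigned for 9 days gets a drafted reminder plus the phone-call decision, because of its value. An $18,000 addendum at 7 days gets the reminder only.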

And here is what it means in your week. Nobody logs into the e-signature platform anymore. At 7:00 every business morning, Claude Code pulls every outstanding envelope, cross-references each against the deal in your CRM, and if everything is moving, you hear nothing. The morning something has sat too long:

What lands in your inbox, Monday 7:02 am

2 contracts need attention.

Services agreement, $42,500, unsigned 9 days. Signer: the COO, direct line on file. A reminder is drafted in your voice, referencing the kickoff date you both agreed to.

Renewal addendum, $18,000, unsigned 7 days. Signer went quiet after opening it twice. Reminder drafted.

Reply SEND to release both, HOLD to pause, or tell me what to change. Everything above is logged.

You read it with your coffee, reply "send," and the chasing is done. The same pattern runs the deal-hygiene HASOP: one line in the same morning message reads "Deal in closing stage, client said yes 11 days ago, the revised document has been in internal review for 9 of them, owner flagged." The leak from the audit, structurally unable to happen again.

The numbers to watch: for contracts, median days-to-signature and the share signed inside seven days. For deal hygiene, the count of deals in a buying stage missing a price or stalled past five days. You probably cannot quote any of these today. Within a month you will know all of them, and you will watch them move.
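
None of these numbers is hard to compute once the data is reachable. A sketch in Python: the function names and the deal fields (buying_stage, amount, days_since_activity) are hypothetical, and I am assuming "inside seven days" means seven days or fewer.

```python
from statistics import median


def contract_metrics(days_to_signature: list[int]) -> dict[str, float]:
    """Summarize signed contracts; expects at least one entry."""
    within_seven = sum(1 for days in days_to_signature if days <= 7)
    return {
        "median_days_to_signature": median(days_to_signature),
        "share_signed_within_7_days": within_seven / len(days_to_signature),
    }


def hygiene_count(deals: list[dict]) -> int:
    """Count buying-stage deals missing a price or idle more than five days."""
    return sum(
        1
        for deal in deals
        if deal["buying_stage"]
        and (deal["amount"] is None or deal["days_since_activity"] > 5)
    )
```

The arithmetic is trivial. What was missing before the agent connected was anyone pulling the inputs every day.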

Read the two lanes in the HASOP again and notice the trade. The agent got five steps. The human got two, and the two that stayed human are the only two that ever required a brain: the approval and the relationship call. That lane is where your team's new value lives: supervising the standard instead of performing it.

The Test, Refine method: how the agent earns its lane

These procedures touch contracts and money, so autonomy is granted on evidence, never on faith. The method I use with every client has a name now, because it works the same way every time: Test, Refine. Baseline it, refine the misses into principles, re-test that the principles stick, then release one tier at a time. (The method has its own full article, with the complete client story and the paste-able baseline test.)

Here is the method run on a real agent, with a client of mine: a consultant with enterprise clients, drowning in email, and a self-described control freak about her own voice.

  1. Test the baseline on real history. 25 real emails. The agent decides: respond, delete, or escalate, and grades each email's priority level, 1 to 3. No drafting yet, just decisions. Score: 25 out of 25. Now we know the judgment layer works before a single word gets written.
  2. Let it produce. Another 25, this time drafting the replies it believed were needed, in her voice, from her own email history. She would have sent all but four as-is.
  3. Refine the misses into written principles. Not one-off edits, principles the agent carries forward. One of hers was as small as pronouns: where she says "we" versus "I." That distinction became a standing rule in the agent's instructions, not a correction she would have to make twice.
  4. Re-test to prove the principles stick. A fresh batch, specifically checking that the refinements hold without being reminded. This is the step everyone skips, and it is the difference between an agent that learned and an agent that got lucky.
  5. Release the lowest tier. Level 1 emails, the routine ones, ran without her for a few days while she watched the log.
  6. Expand on a schedule, not a feeling. The scores held, so that Saturday the agent started handling level 2 as well. Tier by tier, on evidence.

Figure. The Test, Refine method. Steps three and four are the ones everyone skips.
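
The scoring behind the method is simple enough to sketch. This Python illustration assumes hypothetical names (score, next_tier) and a pass bar you set yourself; the 100 percent default is my placeholder, not a threshold from the client story. Tier 0 means nothing is released yet, and tiers 1 to 3 mirror the priority levels.

```python
def score(agent_decisions: list[str], human_decisions: list[str]) -> tuple[int, list[int]]:
    """Return how many decisions matched, plus the positions of the misses."""
    if len(agent_decisions) != len(human_decisions):
        raise ValueError("each test case needs both an agent and a human decision")
    misses = [
        i
        for i, (agent, human) in enumerate(zip(agent_decisions, human_decisions))
        if agent != human
    ]
    return len(agent_decisions) - len(misses), misses


def next_tier(current_tier: int, retest_correct: int, retest_total: int,
              pass_rate: float = 1.0, max_tier: int = 3) -> int:
    """Expand by one tier only when the re-test meets the bar; otherwise hold."""
    passed = retest_total > 0 and retest_correct / retest_total >= pass_rate
    if passed and current_tier < max_tier:
        return current_tier + 1
    return current_tier
```

The misses that score returns are the raw material for step three: each one becomes a written principle, and the re-test batch feeds next_tier.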

And I will tell you what the release step actually felt like, because pretending it is easy helps nobody. When she typed the instruction that let the level-one emails run without her, she stopped at the enter key and said, "I can't." Her stomach turned. She pressed it anyway, and when we scheduled the next tier for the weekend, she said she would need a drink in her hand. That feeling is not a problem with the method. That is the method working: release is supposed to be earned, and it is supposed to cost something.

A control freak handed an agent its first autonomy in seven days, not because she relaxed, but because the agent posted scores at every step. That is the model for every HASOP. Three rules hold from day one: every action is logged, anything client-facing or financial waits for a human reply, and you can switch the whole thing off in one step. For the longer version of what this does to the admin layer, read the honest answer on agents and your admin team.

And where does this agent live? Every CEO asks. Claude Code is Anthropic's agentic AI tool. It is not new software you buy and roll out company-wide. You give it a goal, it reaches your systems through the connectors you authorized, it runs on a schedule, and by default it asks before it acts.

Figure. Where the agent lives: your computer, your connections, one morning summary.

The real premise: your SOPs now expire in weeks, not years

Step back, because the vetting points at something bigger than a one-time conversion.

Standard operating procedures were built for a world where the tools changed slowly. You wrote the procedure, it stayed current for years, and you revised it at the annual ops review. That cadence is dead. A procedure written six months ago says "log into the CRM," which is already a dead step. A capture form designed last year is already losing signal a voice memo would catch. The tools, the connectors, and the capabilities now change monthly, which means the half-life of a keystroke instruction collapsed from years to weeks. Your SOP library is simultaneously your most valuable asset (the thinking) and your fastest-rotting artifact (the instructions).

That is why "standard" is no longer the ambition. The ambition climbs a ladder:

  1. Standard ops. Where almost every company sits today. Procedures written down, executed by humans when they remember, revised every few years, decaying quietly in a folder.
  2. Live ops. The HASOP stage. The procedure is no longer a document someone is supposed to follow. It is a system that runs, every business day at 7:00 am, whether anyone remembers or not.
  3. Self-improving ops. Every run is logged, and every correction teaches the procedure. Fix a few drafts out of twenty-five, and the refined instruction becomes the new standard automatically, inherited by every agent and every person who touches that work after you.
  4. Sentient ops. Not conscious: aware of state, including its own. The procedure notices when it has gone stale and asks to be revised. A new connector comes online, and steps that sat in the human lane because the agent could not see a system move over. An approval step runs forty straight times with no edits, and the agent flags it: this one is ready for my lane. A richer capture method appears, and a form dies. The HASOP dance (vet, reassign, rebuild) stops being a project you run once and becomes a rhythm the operation runs on itself, every few weeks, with every change logged.
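
One signal from the sentient rung is easy to make concrete: the approval step that runs forty straight times with no edits. A minimal Python sketch, assuming a hypothetical log where each entry records whether the human approved without changing anything. The function only flags the step; a human still approves the move.

```python
def ready_for_agent_lane(approvals: list[bool], streak_needed: int = 40) -> bool:
    """approvals is ordered oldest first; True means approved with no edits."""
    if len(approvals) < streak_needed:
        return False
    # Only the most recent run counts: a single edit resets the streak.
    return all(approvals[-streak_needed:])
```

Forty is the number from the example above, not a rule. Set the streak to match the stakes of the step.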

That last rung is the real shift. The old SOP library was rewritten once every few years by whoever got assigned to it. The sentient version re-vets itself on the cadence the tools actually change, and your job moves from rewriting procedures to approving their evolution.

Where this gets hard

I will not pretend this is smooth, because the smooth version of this story is the one that fails in week three. Four things are worth knowing before you start, not after.

The connection is the hard part, not the agent. "Connect your CRM and email" is one sentence and often several days of work: authentication, permissions, rate limits, and a stack messier than anyone admits. And a connector can say "connected" while only passing summaries instead of full records; meeting tools are notorious for this. When an answer comes back thin, the cause is usually shallow access, not a dumb agent. Fixing the access is the real project. And expect friction that feels dumb: a login that expires every few weeks and needs you at your desk to renew it, a setting that resets after an update, an afternoon lost to one connection. That is normal, not failure, and it is an afternoon, not a project plan.

Your data debt surfaces fast. The morning the agent starts reading your CRM, it exposes every blank field, stale record, and half-entered deal. For the first few weeks it creates cleanup before it creates calm. That is not the agent failing. That is the agent showing you the truth your old SOP let you ignore.

The agent only knows what gets captured. Unrecorded calls, hallway decisions, and deals that live in someone's head are invisible to it. This is exactly why the form-to-voice upgrade matters: record the meetings, and after the ones you cannot record, the owner speaks a two-minute memo. The capture habit is operational discipline, and it is non-negotiable. Nuance in, insight out.

Someone has to own the dance. Even at the sentient rung, the agent proposes its own revisions; a human approves them. Without a named owner and a regular look at the log, attention fades. In my experience most teams stop reading the 7:00 am summary by week four unless one named person owns it, and an unwatched system drifts. You are not removing the discipline. You are moving it from "follow the steps" to "supervise the evolution."

Who this is not for yet

I would rather lose you here than waste your quarter. This is a poor fit if any of these describe you right now.

You have no documented process and no one who can clearly walk through how the work happens. There is nothing to vet until that exists, though a voice memo of the person who does the work is often enough to draft the first SOP and start.

You are a solo founder or a very small team with no one to own the agents after the first build. You will become the bottleneck you were trying to remove.

You operate in heavily regulated or high-liability work (certain legal, healthcare, financial, or audited government contracting) where one wrong client-facing action carries real exposure. The method still applies, but you need governance and review well beyond what a blog post can give you.

Your systems have no integration surface: email used as a database, spreadsheets as the system of record, on-prem tools with no API. You need a data and systems step before an agent step.

You want autopilot with zero oversight. This pays off in the supervising. If you will not run the scored tests and approve the revisions, the method will disappoint you, and the fault will not be the method's.

How to run this, starting this week

You go first, and not only for the mindset shift. Your account is the only one that sees the whole business: every inbox, every deal, every file. The full audit is only possible from your seat. An employee's agent sees their slice. After you have felt it work once, your team scales it.

  1. Connect the systems your business runs on. CRM, email, calendar, meeting transcripts, e-signature, files. Budget real time. This step decides everything that follows.
  2. Run the strategic prompts. Have it pull your best old proposal and tell you what is weak. Ask what you are missing in the pipeline right now, and push it. Ask it to audit how your company's knowledge is organized. You are proving access and intelligence before you build anything, and the answers will reset your standard for what your systems can do. The 12 revenue prompts are a ready-made battery for this step.
  3. Have it audit your SOP library. Ask it to vet every procedure line by line: which steps are clicks it can absorb, which are forms that should become voice capture, which are judgments that stay human. Ask which SOPs convert first. The revenue ones (contract follow-up, deal hygiene, renewals) almost always lead.
  4. Convert one. Have Claude Code propose the HASOP version: trigger, agent lane, human lane, escalation path, audit log. Test it against your last five real cases before it touches anything live. Refine about three times, then run it with a human approving anything client-facing. Write down the one number you will watch.
  5. Run the Test, Refine method: baseline on real history, refine misses into written principles, re-test that they stick, release the lowest tier, expand on a schedule.
  6. Hand the library to your team, and set the cadence. The vetting is not a one-time project. Put the re-vet on a rhythm now, because the agent will start proposing its own revisions, and someone has to approve the evolution.

If an SOP only lives in someone's head, record a three-to-five-minute voice memo of the person walking through the real work, and let Claude Code draft the SOP from it. Out-of-date documentation is not a blocker. It is the first task you hand the agent.

When your team is ready (most are not yet)

Here is the honest calibration from inside these implementations: for the first months, you will be the most advanced person in your company at this. One owner I work with rebuilt her own website infrastructure with her agent, and her longtime ops team told her, "we can't help you anymore, we don't know what you're talking about." Your team is not behind because they are slow. They are behind because you are the one getting the reps.

So do not forward a framework. Forward a working agent. The handoff that actually works:

  1. Run the first HASOP yourself until it is boring. If approving the morning summary still feels novel to you, it is too early to hand it to anyone.
  2. Give one person one working agent, not the method. Their entire job fits in a sentence: watch the morning summary, reply SEND or HOLD, flag anything that looks wrong.
  3. Let them watch you correct it once. One working session where they see you turn a miss into a written principle teaches more than any document you could send.
  4. Then raise the bar. When the first agent is boring to them too, hand them the second one, and the standard travels with it.

And when someone on your team is genuinely ready to take one over, this is the message that works at the level they are actually at:

The handoff, ready to send

"I set up an assistant that watches our unsigned contracts every morning and drafts the reminder emails. Nothing sends unless we reply SEND. Your job this week: read the 7:00 am summary with me, approve or hold each reminder, and tell me anything that looks off. Once this feels routine, we will pick the next process and build it the same way: test it on past examples, fix what it gets wrong, test that the fixes stick, then let it run a little more on its own. The goal is not speed. The goal is that you trust it because you graded it."

Notice what is not in that message: no vetting, no lanes, no cadence, none of the vocabulary in this post. The model stays in your head until their reps put it in theirs. That is not dumbing it down. That is how you went first.

Common questions

Can SOPs really become AI agents?

Yes, if they are detailed, but not as written. An SOP already holds the structure an agent needs: a trigger, steps, an owner, and an output. First vet it line by line: steps that are clicks get absorbed by the agent, steps that are forms get replaced with richer capture like voice, and steps that are judgments stay human. Then Claude Code proposes the rebuilt two-lane version.

How do I know if my SOPs are outdated?

Read any procedure and look for two patterns. Click instructions (log into, open, check the field) are obsolete the day the agent connects to that system. Form instructions (fill out the status report) are obsolete because forms capture the skeleton and lose the nuance agents now mine for upsell, continuation, and risk signals. If your SOPs are full of clicks and forms, they are outdated. The thinking in them is still the asset.

What replaces forms and status reports?

Voice, mostly. A two-minute memo after a client session, in natural language, carries the buying signals, hesitations, and side comments that no form field holds. The agent mines it for commitments, risk, and opportunity, and updates the client record the same hour. Forms were designed for humans who could not process nuance at scale. Agents can.

Which SOPs should I convert first?

The ones closest to revenue with a clear trigger and a deadline: contract follow-up, then proposal and deal hygiene, then renewals and extensions. They convert cleanly because the rule is already explicit, and the payoff shows up in weeks, not quarters.

How does the agent earn the right to act on its own?

Through the Test, Refine method, not time served. Test the baseline on 25 real cases and grade its decisions. Let it draft 25 more. Refine the misses into written principles, then re-test a fresh batch to prove the principles stick without reminders. Release the lowest-stakes tier first and expand on a schedule as the scores hold. One of my clients, a self-described control freak, released her lowest-priority email replies within a week: 25 for 25 on triage, drafts where she refined only four, and refinements that held on the re-test.

What is a Human-Agent SOP (HASOP)?

A HASOP is the rebuilt version of a standard operating procedure for the agent era: one procedure, two lanes. The agent's lane carries the motion: watching, checking, chasing, drafting, logging. The human's lane carries the decisions: approvals, exceptions, relationships. Classic SOPs described keystrokes. A HASOP describes responsibilities, because the keystrokes are gone.

How often does a HASOP need to be revised?

Every few weeks, not every few years. That is the real shift. Connectors, capture methods, and agent capabilities now change monthly, so procedures go stale on that cadence. At the mature stage, the agent flags its own revisions: an approval step that has run forty times with no edits, a newly connected system, a form that should become voice. A human approves each change, and every change is logged.

What is sentient ops?

The top rung of a four-rung ladder: standard ops, live ops, self-improving ops, sentient ops. Not conscious: aware of state, including its own. The operation watches the business and raises problems before anyone asks, and it notices when its own procedures have gone stale and proposes the revision. The HASOP dance becomes a rhythm the operation runs on itself, with human approval on every change.

What is Claude Code, and where does the agent run?

Claude Code is Anthropic's agentic AI tool. You give it a goal, connect it to the systems you already run, and it does the work and reports back. It is not software you buy and roll out company-wide. It runs on a schedule from a computer: yours first, then a machine your team manages. It works through the connectors you give it to your CRM, email, calendar, and e-signature tool.

Do I need to be technical, and does my team?

No. You go first to reset your standard for what running an SOP now means, not to become technical. There is also a structural reason: only your account sees the whole business, so the full audit is only possible from your seat. Expect to be the most advanced person in your company at this for the first months. That is normal, and it is the point of going first. Your team inherits working agents, not frameworks.

Is our data safe when Claude Code connects to our systems?

Claude Code acts only through connectors you authorize, and by default it asks for approval before any action that changes something or contacts a client. Every check, draft, and send is logged, so you can audit exactly what it did. You expand its autonomy deliberately, one tier at a time, and you can switch it off in one step.

Do agents replace the people who ran these SOPs?

They replace the checking, chasing, and drafting inside the SOP, not the judgment around it. A person still approves what goes to a client. The honest shift is that your team stops executing procedures and starts supervising them and approving their evolution. Some roles change because of it.

You spent years writing down how your company should run. The thinking is still gold. The instructions are expiring. Vet the library, wake it up, and put it on the cadence the world actually moves at now.

