This article is drawn from Shanee Moret's Day 2 live training on Codex, websites, agent-ready infrastructure, and real business-owner implementation.
Heather Mackay sells about 35 dahlia tubers a week. Some Facebook posts, a modest email list, a side business she runs alongside her HR consulting firm. Nothing aggressive. No paid ads.
Over Mother's Day weekend, Codex sold 490.
She was not at her computer for most of it. She had set a goal of 200, which she already considered a stretch, and the agent, which started around 1:00 PM on Saturday, hit it that same day. By Sunday midnight the number was 490. Heather had reset the goal mid-weekend because she ran out of ceiling.
I want to be precise about what made this work, because the instinct is to attribute it to the technology. The technology is the same tool most business owners have access to right now. What was different was the setup — specifically, access given before the goal was activated, a defined and measurable target, and a deliberate decision not to intervene.
That combination is the step. This post breaks it down.
For the complete framework — including the mental model shift, platform decisions, and what happens when agents outpace your infrastructure — read the full guide. You can also watch me explain this live, including the real-time walkthrough of how this weekend unfolded.
The Structure Behind the Result
The Dahlia Weekend is a Provenance Argument. The outcome (490 tubers in one weekend, 14 times her normal weekly volume of about 35) was not determined when the first Facebook post went live. It was determined the day before, when Heather connected her second Gmail account, her Shopify store, and her Facebook page to Codex.
Access plus a defined goal, given to an agent that already knows your business, determines outcome before a single action is taken.
Once Codex was connected, it asked clarifying questions: Are these the only Facebook pages for dahlias? Can I look at your current Shopify customers? Those questions are the tell. They mean the agent is oriented. It is not executing randomly — it is mapping what it has access to against what it needs to reach the goal. If those questions never come, the environment is not complete.
Codex then built an outreach plan, confirmed access, and began posting to Facebook and emailing past and potential customers. Every five minutes it reported back with an updated order count. Heather stayed out.
That last sentence is the one business owners skip over. It sounds passive. It is not. It was a prior decision, built into the test design before anything went live. And it was harder than it sounds.
Why Staying Out Is the Work
Heather is, by her own description, controlling and impatient. Both traits are common in business owners who have built something real. They are also the two most reliable ways to prevent an agent from performing.
The impatient business owner checks in after 20 minutes, sees an imperfect output, and overrides it. The controlling one edits every message before it goes out. Both behaviors interrupt the agent's trajectory and reset the campaign. The result looks like the agent not working. The actual problem is the intervention.
Here is what the failure pattern looks like compared to what a clean test looks like:
| Behavior | What It Looks Like | What It Costs |
|---|---|---|
| Intervening at hour 1 of a 24-hour test | "I just want to check the first post" | Agent trajectory reset; test data destroyed |
| Editing outreach before it sends | "I'll let it draft, but I'll refine each one" | Agent volume eliminated; no test of autonomous reach |
| Checking results every 15 minutes | "I just want to see how it's going" | Interruption loop; agent flagged for re-instruction |
| Staying out for the defined window | Heather on Mother's Day, away from her computer | 490 tubers; 14x her normal weekly pace |
The non-intervention has to be pre-committed. Not willpower in the moment. A prior agreement, made before the first post goes live, about when evaluation happens and what the evaluation criteria are. If you do not make that agreement with yourself before the test begins, your operational instincts will win every time — because they have been trained for years to kick in exactly when something important is happening.
For more on how to structure this, see the posts on the two business owner failure modes and on connecting environments before activating a goal, the setup that precedes any goal activation.
What the Goal Feature Actually Does
Setting a goal in Codex is not the same as writing a task list. A task list tells the agent what steps to take. A goal tells the agent what outcome to reach and leaves the path to it as the agent's problem to solve.
The goal Heather set was specific and measurable: sell 200 dahlia tubers over Mother's Day weekend. Not "post on Facebook" or "email customers." A number, attached to a time window. Codex built the strategy to reach it — which outreach channels to use, in what order, with what messaging — based on the context it already had about Heather's business.
This is the distinction that determines what you get back. If you give Codex a vague instruction, you get a generic output. If you give it a measurable outcome and the access to reach it, it generates a plan and executes against that plan — then reports on progress in real time.
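One way to see the difference is to write both down as data. The sketch below is illustrative Python, not Codex's actual goal format: the `Goal` class, its field names, and the calendar dates are assumptions made for the example (the year in particular is a placeholder).

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Goal:
    """An outcome to reach: a number attached to a time window (hypothetical shape)."""
    description: str
    target: int
    window_start: datetime
    window_end: datetime

    def is_measurable(self) -> bool:
        # A goal needs a positive target and a window with real duration.
        return self.target > 0 and self.window_end > self.window_start


# A task list names steps but never says what "done" means.
task_list = ["post on Facebook", "email customers"]

# The goal from this post: 200 tubers over Mother's Day weekend.
# The dates are placeholders chosen for illustration.
goal = Goal(
    description="Sell dahlia tubers over Mother's Day weekend",
    target=200,
    window_start=datetime(2025, 5, 10, 0, 0),
    window_end=datetime(2025, 5, 11, 23, 59),
)

print(goal.is_measurable())
```

Nothing in `task_list` can be checked against a result. The goal can: it carries the number and the deadline, and the path to both is left open.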
None of the five-minute updates Heather received during the test was a bare notification. Each was a status report: current order count, what actions had been taken, what was queued next. She could see the campaign running without touching it.
The Mid-Weekend Reset
Saturday night, before she went to bed, Heather had 204 tubers sold. Goal hit. She reset the target to 400 for the full weekend.
Sunday morning came. Sales continued. By Sunday midnight the number was 490 — across two days, with Heather away from her computer all of Mother's Day.
The reset is worth noting because it reveals something about how the goal feature works. The agent was not running a fixed campaign that happened to succeed. It was running toward a target. When the target changed, the agent updated its trajectory. Heather did not have to rebuild anything, restart anything, or re-explain the business. She changed the number. The agent reoriented.
For business owners with a proven offer and an existing customer list, this is the most direct illustration of what the goal feature makes possible. The 35-tubers-a-week baseline was not a limit imposed by demand or market size. It was a limit imposed by the hours available in a week to post, email, and follow up. Remove the hours constraint and replace it with an agent that has access to every channel — and the baseline becomes a floor, not a ceiling.
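The numbers from the weekend can be replayed in a few lines. This is a minimal sketch of the idea that a reset changes only the target, assuming a hypothetical `Campaign` record; it does not describe how Codex stores a goal internally.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Campaign:
    """Hypothetical record of a running goal; names are illustrative."""
    target: int      # tubers to sell in the window
    sold: int = 0    # running order count

    def remaining(self) -> int:
        # Never negative: once the target is passed, nothing remains.
        return max(self.target - self.sold, 0)

    def goal_hit(self) -> bool:
        return self.sold >= self.target


# Saturday night: 204 sold against a target of 200.
saturday_night = Campaign(target=200, sold=204)
print(saturday_night.goal_hit())    # True

# The reset changes one number; the order count carries over.
after_reset = replace(saturday_night, target=400)
print(after_reset.remaining())      # 196

# Sunday midnight: 490 sold, compared with a 35-per-week baseline.
sunday_midnight = replace(after_reset, sold=490)
weekly_baseline = 35
print(sunday_midnight.sold / weekly_baseline)   # 14.0
```

The last line is where the 14x figure comes from: 490 tubers in one weekend against roughly 35 in a normal week.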
What This Requires Before You Start
The weekend worked because of what happened before it. Specifically:
- Two weeks of setup friction absorbed in advance. Heather's gaming computer had security features that kept blocking Codex. That problem was resolved before the test began — not during it.
- Three environments connected and confirmed. Second Gmail, Shopify, Facebook. Not assumed — connected, tested, and confirmed accessible.
- A measurable goal stated clearly. 200 tubers. Mother's Day weekend. Not "sell more" or "run a campaign."
- A deliberate non-intervention agreement. Made before the first post went live, not improvised once the test was running.
- Business context already in the agent. Codex knew the business. It did not need to be briefed on the offer, the customer base, or what a dahlia tuber is. That context was already there.
None of these are optional. Remove any one of them and the result changes. The access is what gives the goal somewhere to go. The goal is what gives the access a direction. The non-intervention is what lets both of them do their job.
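The five prerequisites above behave like a checklist where every item must pass. Here is a rough sketch of that check in Python; the `Preflight` class and its field names are hypothetical, written for this post rather than taken from any Codex feature.

```python
from dataclasses import dataclass


@dataclass
class Preflight:
    """Hypothetical pre-test checklist; field names are illustrative."""
    setup_friction_resolved: bool
    environments_confirmed: dict[str, bool]  # name -> connected, tested, confirmed
    goal_is_measurable: bool
    non_intervention_agreed: bool
    business_context_loaded: bool

    def blockers(self) -> list[str]:
        """Return every unmet prerequisite; an empty list means ready."""
        problems = []
        if not self.setup_friction_resolved:
            problems.append("setup friction unresolved")
        if not self.environments_confirmed:
            problems.append("no environments connected")
        for name, confirmed in self.environments_confirmed.items():
            if not confirmed:
                problems.append(f"environment not confirmed: {name}")
        if not self.goal_is_measurable:
            problems.append("goal is not measurable")
        if not self.non_intervention_agreed:
            problems.append("no non-intervention agreement")
        if not self.business_context_loaded:
            problems.append("agent lacks business context")
        return problems


heather = Preflight(
    setup_friction_resolved=True,
    environments_confirmed={"second Gmail": True, "Shopify": True, "Facebook": True},
    goal_is_measurable=True,
    non_intervention_agreed=True,
    business_context_loaded=True,
)
print(heather.blockers())      # []

# Same setup, but Facebook was assumed rather than confirmed.
incomplete = Preflight(
    True, {"second Gmail": True, "Shopify": True, "Facebook": False}, True, True, True
)
print(incomplete.blockers())   # ['environment not confirmed: Facebook']
```

A single unconfirmed environment is enough to block the test, which matches the point that none of the five items is optional.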
The One Thing Business Owners Get Wrong Here
The mistake is treating the goal feature as a one-click shortcut. You write a goal, hit send, and expect the agent to run. When the result is mediocre, the conclusion is that the feature does not work.
What actually happened is that the environments were not fully connected, or the goal was vague, or the business owner intervened before the test window closed. The agent did not fail. The setup was incomplete.
Codex asking clarifying questions — "Are these the only Facebook pages for dahlias? Can I look at your current Shopify customers?" — is the sign that setup is complete and the agent is oriented. If you activate a goal and receive no questions, pause and audit the connections before assuming the agent is running.
The questions are the signal. No questions means no orientation. No orientation means no campaign.
Start Here
If you want to run a test like this, the sequence is:
- Choose one offer and one measurable outcome — a number, a time window, a completion state.
- Connect every environment the agent needs to reach that outcome (email, storefront, social, whatever applies).
- Wait for Codex to ask clarifying questions. Answer them fully.
- Write the goal as an outcome, not a task list.
- Set a non-intervention window — minimum 12 hours — in writing, before you activate anything.
- Stay out of it until the window closes.
The result will tell you what the agent can do with proper access. That is the only way to know.
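Putting the non-intervention window in writing can be as simple as fixing the time before which you are not allowed to evaluate. The sketch below assumes a 24-hour window and an activation time of about 1:00 PM on Saturday; the function names and the calendar date are placeholders for illustration.

```python
from datetime import datetime, timedelta

MIN_WINDOW = timedelta(hours=12)  # the minimum named in the steps above


def window_closes_at(activated_at: datetime, window: timedelta = MIN_WINDOW) -> datetime:
    """Return the earliest moment evaluation is allowed."""
    if window < MIN_WINDOW:
        raise ValueError("non-intervention window must be at least 12 hours")
    return activated_at + window


def may_evaluate(now: datetime, activated_at: datetime, window: timedelta = MIN_WINDOW) -> bool:
    """True only once the pre-committed window has fully elapsed."""
    return now >= window_closes_at(activated_at, window)


# Placeholder date; only "Saturday, around 1:00 PM" comes from this post.
activated = datetime(2025, 5, 10, 13, 0)
one_day = timedelta(hours=24)

print(may_evaluate(datetime(2025, 5, 10, 14, 0), activated, one_day))  # False
print(may_evaluate(datetime(2025, 5, 11, 13, 0), activated, one_day))  # True
```

The check at hour one returns `False` by design. The decision was made when the window was set, so there is nothing left to decide in the moment.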
For everything that comes before this step, including the mental model shift required to deploy agents rather than just ask them questions, read the full guide.
This is Part 11 of a 43-part series. Start from the beginning.