Claude Goals vs Workflows: When to Set a Goal and When to Run a Workflow


TL;DR

  • A Goal (/goal) points one agent at a finish line and lets it keep going, turn after turn, until a separate model confirms the condition is met. This is depth: one agent, working a problem until it is done.
  • A Workflow has Claude write a JavaScript script that orchestrates dozens to hundreds of subagents in parallel, runs it in the background, and hands you one synthesized result. This is width: many agents fanning out across a problem too big for a single conversation.
  • Pick the goal when “done” is a verifiable end state and the work is sequential.
  • Pick the workflow when the work splits into independent pieces, or when you want the orchestration written down as something you can read, edit, and rerun.

This is the practical guide we wish we had when we started leaning on both for real marketing operations work at Elevarus. We will cover what each one actually is, how to use it, when to reach for one over the other, where each falls down, and how the same depth-versus-width split shows up in the AI systems we run every day.


The one-sentence version

The cleanest way to hold these two apart is to ask a single question: who holds the plan?

  • With a Goal, Claude holds the plan in its head. You give it a condition, and it decides turn by turn what to do next to satisfy that condition. The plan lives in the conversation.
  • With a Workflow, a script holds the plan. Claude writes that script up front, a runtime executes it, and the orchestration logic, the branching, and the intermediate results all live in code rather than in a context window.

Everything else (the cost profile, the speed, what you can reuse, the failure modes) falls out of that one difference. Hold onto it.

Key Concept. A Goal (/goal) runs one agent in a loop until a separate model confirms a condition you define. That is depth. A Workflow has Claude write a script that orchestrates many parallel subagents and returns one synthesized result. That is width.

What a Goal is

/goal is a Claude Code command (it needs version 2.1.139 or later). You give it a completion condition, and Claude keeps working across turns, on its own, until that condition holds. It does not stop and hand control back to you after each turn.

Instead, a small fast model (Haiku by default) reads the condition and the conversation so far and answers one question: is this done yet? If the answer is no, that model returns a short reason, and Claude starts another turn using the reason as guidance. When the answer is finally yes, the goal clears itself and the work stops.

Under the hood it is a wrapper around a session-scoped Stop hook, but you do not need to think about hooks to use it. You think about the finish line.
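
To make the loop concrete, here is a minimal simulation of it in TypeScript. Treat it as a sketch under assumptions, not Claude Code's implementation: runGoal, the Worker and Evaluator types, and both stubs are hypothetical names we made up for illustration, and maxTurns exists only to keep the simulation bounded.

```typescript
// Hypothetical simulation of the goal loop; none of these names are Claude Code APIs.
type Verdict = { done: boolean; reason: string };
type Worker = (guidance: string) => string; // returns what the turn surfaced in the transcript
type Evaluator = (condition: string, transcript: string[]) => Verdict;

interface GoalResult {
  met: boolean;
  turns: number;
  transcript: string[];
}

function runGoal(condition: string, worker: Worker, evaluate: Evaluator, maxTurns: number): GoalResult {
  const transcript: string[] = [];
  let guidance = condition; // the condition doubles as the first directive
  for (let turn = 1; turn <= maxTurns; turn++) {
    transcript.push(worker(guidance));
    // The evaluator sees only the condition and the transcript; it runs no tools.
    const verdict = evaluate(condition, transcript);
    if (verdict.done) return { met: true, turns: turn, transcript };
    guidance = verdict.reason; // a "no" feeds its reason into the next turn
  }
  return { met: false, turns: maxTurns, transcript };
}

// Stub worker: pretends to fix one failing test per turn and reports the count.
let failing = 3;
const stubWorker: Worker = () => {
  failing = Math.max(0, failing - 1);
  return `npm test: ${failing} failing`;
};

// Stub evaluator: answers "done" only when the latest transcript line proves it.
const stubEvaluator: Evaluator = (_condition, transcript) => {
  const last = transcript.length > 0 ? transcript[transcript.length - 1] : "";
  return last === "npm test: 0 failing"
    ? { done: true, reason: "" }
    : { done: false, reason: `tests still failing (${last})` };
};

const goalResult = runGoal("all tests in test/auth pass", stubWorker, stubEvaluator, 20);
```

With these stubs the loop ends on the third turn, because that is the first turn whose output lets the evaluator say yes. The worker never grades itself; the only thing that stops the loop is the evaluator reading the transcript.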

How to use it

Setting a goal is one line. The condition is the directive, so you do not send a separate prompt afterward:

/goal all tests in test/auth pass and the lint step is clean

A few mechanics worth knowing:

  • One goal per session. Running /goal again with a new condition replaces the old one.
  • Check status by running /goal with no argument. You see the condition, how long it has run, how many turns have been evaluated, the token spend, and the evaluator’s most recent reason.
  • Stop early with /goal clear. The words stop, off, reset, none, and cancel all work as aliases, and starting a fresh conversation clears it too.
  • Bound the run by writing the limit into the condition itself, for example ... or stop after 20 turns. Without a bound, a vague condition can loop longer than you want.
  • Run it unattended in headless mode. claude -p "/goal CHANGELOG.md has an entry for every PR merged this week" runs the whole loop to completion in a single invocation, which is how you wire a goal into a script or a scheduled job.

The condition can be up to 4,000 characters, so you have room to be specific.
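
If you wire a goal into a script, a small helper keeps the bound and the length limit honest. The sketch below only builds the argument list for the claude -p "/goal ..." invocation described above; headlessGoalArgs and MAX_CONDITION_CHARS are our own hypothetical names, and actually launching the process (for example with Node's child_process) is left out.

```typescript
const MAX_CONDITION_CHARS = 4000; // the limit stated in this article

// Hypothetical helper: builds the arguments for `claude -p "/goal ..."`.
function headlessGoalArgs(condition: string, maxTurns?: number): string[] {
  const trimmed = condition.trim();
  if (trimmed.length === 0) {
    throw new Error("condition must not be empty");
  }
  // The bound is written into the condition itself, as the article recommends.
  const bounded =
    maxTurns === undefined ? trimmed : `${trimmed}, or stop after ${maxTurns} turns`;
  if (bounded.length > MAX_CONDITION_CHARS) {
    throw new Error(
      `condition is ${bounded.length} characters; the limit is ${MAX_CONDITION_CHARS}`
    );
  }
  return ["-p", `/goal ${bounded}`];
}

const goalArgs = headlessGoalArgs(
  "CHANGELOG.md has an entry for every PR merged this week",
  20
);
```

Here goalArgs holds "-p" followed by the full /goal line with the turn bound appended, ready to hand to whatever launches the CLI in your scheduler.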

Writing a condition that actually works

This is the part people get wrong, and it is worth slowing down on. The evaluator judges your condition against what Claude has surfaced in the conversation. It does not run your tests, open your files, or check your database on its own. So a condition only works if Claude’s own output can prove it.

“All tests in test/auth pass” is a good condition because Claude runs the tests and the result lands in the transcript where the evaluator can read it. “The feature works correctly” is a bad condition, because nothing in the transcript can demonstrate it and the loop will either stop too early on a confident-sounding turn or never stop at all.

A condition that holds up across many turns usually has three things:

  1. One measurable end state. A test result, a build exit code, a file count, an empty queue.
  2. A stated check. How Claude should prove it, like “npm test exits 0” or “git status is clean.”
  3. The constraints that matter. Anything that must not change on the way there, like “and no other test file is modified.”
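
One way to keep yourself honest about those three parts is to assemble the condition from them. This is a sketch of a habit, not a Claude Code feature: GoalSpec, buildCondition, and the "proven by:" wording are all our own assumptions.

```typescript
// Hypothetical structure mirroring the three parts listed above.
interface GoalSpec {
  endState: string; // one measurable end state
  check: string; // how Claude should prove it
  constraints: string[]; // what must not change on the way there
}

function buildCondition(spec: GoalSpec): string {
  const endState = spec.endState.trim();
  const check = spec.check.trim();
  if (endState === "" || check === "") {
    // Without a stated check the evaluator has nothing in the transcript to judge.
    throw new Error("a condition needs both an end state and a stated check");
  }
  const parts = [endState, `proven by: ${check}`];
  for (const constraint of spec.constraints) {
    parts.push(`and ${constraint.trim()}`);
  }
  return parts.join("; ");
}

const condition = buildCondition({
  endState: "all tests in test/auth pass",
  check: "npm test exits 0",
  constraints: ["no other test file is modified"],
});
```

The helper refuses to build a condition that has no stated check, which is exactly the failure mode behind “the feature works correctly.”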

When to reach for a Goal

Goals are for substantial, sequential work with a finish line you can describe:

  • Migrating a module to a new API until every call site compiles and the tests pass.
  • Implementing a spec until all of its acceptance criteria hold.
  • Splitting a sprawling file into focused modules until each is under a size budget.
  • Grinding through a labeled backlog until the queue is empty.

The common thread is that each step depends on the last, and you can name the condition that means “stop.” If you cannot write that condition cleanly, a goal is the wrong tool.

Where Goals fall down

  • Cost scales with depth. A goal that takes twenty turns costs roughly twenty turns’ worth of tokens. The evaluator itself is cheap because it runs on the small fast model, but the worker turns are not.
  • It is single-agent and sequential. There is no parallelism. A goal works one problem deeply; it does not spread out across many.
  • Vague conditions misbehave. Too loose and it loops or converges on the wrong reading of “done.” The fix is always a sharper condition, not a smarter prompt.

One more nuance, because it matters in practice: /goal is not the same as “auto mode.” Auto mode approves tool calls within a single turn so the agent runs unattended. A goal decides whether a new turn should start at all. They are complementary. Auto mode removes the per-tool prompts; the goal removes the per-turn prompts; together you get a hands-off run that stops on a real condition instead of on the worker’s own say-so. That separation, a fresh model deciding “done” rather than the agent grading its own homework, is the quiet strength of the whole feature.

Operator Note. Bottom line on Goals: one agent, going deep, until a separate model confirms a finish line you can name. Reach for it when you can write down what “done” looks like in a way the work itself proves.

What a Workflow is

A dynamic workflow is a JavaScript script that orchestrates subagents at scale. You describe the task, Claude writes the script, and a runtime executes it in the background while your session stays responsive. Workflows are in research preview and need Claude Code 2.1.154 or later; they are available on all paid plans and across the major cloud providers.

The mental model is the inverse of a goal. Instead of one agent going deep and holding the plan in its head, a script holds the plan and spawns many agents to go wide. Each agent’s result lands in a script variable rather than in Claude’s context, so the conversation only ever sees the final answer, not the hundred intermediate ones.
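
The shape of that idea fits in a few lines. To be clear, this is not the workflow runtime's API, which we do not reproduce here: Subagent and auditAll are hypothetical stand-ins, and the stub simply pretends to inspect a file. What the sketch shows is where the results live.

```typescript
// Hypothetical stand-in for "spawn a subagent on one target and await its result".
type Finding = { target: string; issues: string[] };
type Subagent = (target: string) => Promise<Finding>;

async function auditAll(targets: string[], subagent: Subagent): Promise<string> {
  // Every intermediate result stays in this local variable, not in a conversation.
  const findings: Finding[] = await Promise.all(targets.map((t) => subagent(t)));
  const flagged = findings.filter((f) => f.issues.length > 0);
  const names = flagged.map((f) => f.target).join(", ");
  // Only this one synthesized line leaves the script.
  return `${flagged.length} of ${findings.length} targets flagged: ${names}`;
}

// Stub subagent: flags any file whose name starts with "admin".
const stubSubagent: Subagent = async (target) => ({
  target,
  issues: target.startsWith("admin") ? ["missing auth check"] : [],
});

const summaryPromise = auditAll(["users.ts", "admin.ts", "health.ts"], stubSubagent);
```

Three agents ran, three findings were produced, and the caller sees one sentence. Scale the list to a few hundred targets and the caller still sees one sentence.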

That single design choice, the plan lives in code, is what makes workflows different from the other ways to run multi-step work:

| | Subagents | Skills | Agent teams | Workflows |
| --- | --- | --- | --- | --- |
| What it is | A worker Claude spawns | Instructions Claude follows | A lead supervising peers | A script the runtime executes |
| Who decides what runs next | Claude, turn by turn | Claude, per the prompt | The lead agent | The script |
| Where results live | Claude’s context | Claude’s context | A shared task list | Script variables |
| What is repeatable | The worker definition | The instructions | The team definition | The orchestration itself |
| Scale | A few tasks per turn | Same | A handful of peers | Dozens to hundreds of agents |

Because the plan is code, a workflow can do something the others cannot: apply a repeatable quality pattern, not just run more agents. It can have independent agents adversarially review each other’s findings before anything is reported. It can draft a hard plan from several angles and weigh them against each other. You get a more trustworthy result than a single pass, every time you run it.

How to use it

The fastest way to get a feel for workflows is the bundled /deep-research command:

/deep-research What changed in the Node.js permission model between v20 and v22?

That fans web searches across several angles, fetches and cross-checks the sources against each other, votes on each claim, and returns a cited report with the claims that did not survive cross-checking already filtered out. It runs in the background, where you can watch its progress while your session stays free.
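
We do not know the exact voting rule /deep-research applies, so take the following as an illustration of the general cross-check-and-filter idea rather than a description of the command. SourceVerdict, survivingClaims, and the "at least minSupport supporters and more supporters than dissenters" rule are all assumptions of ours.

```typescript
// Hypothetical record: one source's verdict on one claim.
interface SourceVerdict {
  claim: string;
  supports: boolean;
}

// Keeps a claim only if enough sources support it and supporters outnumber dissenters.
function survivingClaims(verdicts: SourceVerdict[], minSupport: number): string[] {
  const tally: Record<string, { yes: number; no: number }> = {};
  for (const v of verdicts) {
    const counts = tally[v.claim] ?? { yes: 0, no: 0 };
    if (v.supports) {
      counts.yes += 1;
    } else {
      counts.no += 1;
    }
    tally[v.claim] = counts;
  }
  return Object.keys(tally).filter((claim) => {
    const counts = tally[claim];
    return counts !== undefined && counts.yes >= minSupport && counts.yes > counts.no;
  });
}

const kept = survivingClaims(
  [
    { claim: "claim A", supports: true },
    { claim: "claim A", supports: true },
    { claim: "claim A", supports: false },
    { claim: "claim B", supports: true },
    { claim: "claim B", supports: false },
  ],
  2
);
```

In this toy input claim A survives with two supporters against one dissenter, and claim B is dropped because a single supporter is contradicted by another source.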

To run a workflow for your own task, you have a few options:

  • Ask for one in your prompt. Include the keyword ultracode, or just say “use a workflow,” and Claude writes a script for the task instead of working it turn by turn. For example: ultracode: audit every API endpoint under src/routes/ for missing auth checks.
  • Let Claude decide. Set /effort ultracode and Claude plans a workflow for every substantial task in the session. A single request can become several workflows in a row: one to understand the code, one to make the change, one to verify it. It uses more tokens, so drop back to /effort high for routine work.
  • Run one you saved. More on that below.

While it runs, /workflows opens a progress view: each phase with its agent count, token total, and elapsed time, and you can drill into any single agent to see what it found. You can pause, resume, or stop the run, or stop one misbehaving agent, all from that view.

Save it and run it again

This is the part that turns a one-off into infrastructure. When a workflow does what you wanted, open /workflows, select the run, and press s to save its script as a command. Save it to .claude/workflows/ to share it with everyone who clones the repo, or ~/.claude/workflows/ to keep it personal across all your projects. From then on it runs as its own / command, right alongside /deep-research in autocomplete.

Saved workflows take input through a global named args:

Run /triage-issues on issues 1024, 1025, and 1030

Claude passes that list as structured data, so the script can call array and object methods on args directly. A review you run on every branch, a triage you run on every batch of issues, a research format you reuse for every topic, each becomes a single command that runs the same orchestration each time.
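
As a sketch of what “structured data” buys you, here is how a script might turn that input into per-issue tasks. The exact shape of args is an assumption on our part (we model it as an object with an issues array), and TriageArgs and triageTasks are hypothetical names, not part of any saved workflow.

```typescript
// Assumed shape of the input; the real structure Claude passes may differ.
interface TriageArgs {
  issues: number[];
}

// Turns the structured input into one task description per distinct issue.
function triageTasks(args: TriageArgs): string[] {
  const unique = args.issues.filter(
    (n, index, all) => Number.isInteger(n) && n > 0 && all.indexOf(n) === index
  );
  return unique.sort((a, b) => a - b).map((n) => `triage issue #${n}`);
}

const tasks = triageTasks({ issues: [1030, 1024, 1025, 1024] });
```

Because the input arrives as data rather than prose, the script can filter, de-duplicate, and sort it with ordinary array methods before any agent is spawned.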

When to reach for a Workflow

Workflows are for work that is too big for one conversation to coordinate, or work whose orchestration is worth writing down:

  • A codebase-wide sweep: audit all 150 endpoints, or every published article, in parallel.
  • A large migration: 300 or 500 files moved to a new pattern at once.
  • A research question that needs sources cross-checked against one another rather than taken at face value.
  • A hard decision worth drafting from several independent angles before you commit to one.
  • Any repeatable process where you want the same quality pattern applied every time.

Where Workflows fall down

  • Token cost is real. Many agents means meaningfully more tokens than doing the same task in conversation. The standard move is to run it on a small slice first (one directory, one narrow question), watch the per-agent spend in /workflows, then decide whether to run it wide.
  • There is no mid-run input. Only an agent’s own permission prompt can pause a run. If you need a human sign-off between stages, run each stage as its own workflow.
  • Resume is session-bound. You can pause and resume within the same session, and completed agents return cached results, but if you quit Claude Code the next session starts the workflow fresh.
  • There are hard caps. Up to 16 agents run at once (fewer on a machine with limited cores), and 1,000 agents total per run. Those caps are a feature, because they bound the blast radius of a runaway script, but they are real limits to plan around.
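
The caps and the small-slice habit combine into simple arithmetic you can do before going wide. The sketch below assumes one agent per piece of work and treats tokensPerAgent as a number you measured yourself on a small slice; estimateRun and the 20,000 figure in the example are hypothetical placeholders, not benchmarks.

```typescript
const MAX_CONCURRENT_AGENTS = 16; // caps as stated in this article
const MAX_AGENTS_PER_RUN = 1000;

interface RunEstimate {
  withinCap: boolean; // does the run fit under the per-run agent cap?
  minWaves: number; // lower bound on sequential "waves" of agents
  estimatedTokens: number; // agents times measured per-agent spend
}

// Assumes one agent per piece; tokensPerAgent comes from your own small-slice test.
function estimateRun(
  pieces: number,
  tokensPerAgent: number,
  concurrency: number = MAX_CONCURRENT_AGENTS
): RunEstimate {
  const slots = Math.min(Math.max(1, Math.floor(concurrency)), MAX_CONCURRENT_AGENTS);
  return {
    withinCap: pieces <= MAX_AGENTS_PER_RUN,
    minWaves: Math.ceil(pieces / slots),
    estimatedTokens: pieces * tokensPerAgent,
  };
}

const wideRun = estimateRun(150, 20000);
```

For 150 pieces at the placeholder spend, that is at least 10 waves of agents and 3,000,000 tokens in total. Parallelism shrinks the number of waves, never the token total.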

Operator Note. Bottom line on Workflows: many agents, going wide, driven by a script you can read and rerun. Reach for it when the work splits into independent pieces, or when the orchestration is worth saving and running again.

Goals vs Workflows: the decision

Strip away the detail and it comes down to the shape of the work.

| Factor | Goal (/goal) | Workflow |
| --- | --- | --- |
| Shape | One agent, deep, sequential | Many agents, wide, parallel |
| Who holds the plan | Claude, in context | A script, in code |
| Best for | A verifiable finish line | Independent pieces, or a repeatable process |
| Repeatable | The instruction, not the plan | The entire orchestration |
| Cost driver | Number of turns (depth) | Number of agents (width) |
| Speed | Sequential, slower on broad work | Parallel, fast on independent work |
| Stops when | A fresh model confirms the condition | The script finishes |

A working decision sequence:

  1. Is the work sequential, each step depending on the last, with a finish line you can name? Set a goal.
  2. Does the work split into independent pieces you could run side by side? Run a workflow.
  3. Is the condition vague or open-ended? Do not set a goal. Sharpen it first, or do the work by hand.
  4. Will you run this again? Lean toward a workflow and save it, so the orchestration becomes a command.
  5. Do you need the answer fast and the pieces are independent? Workflow, for the parallelism.
  6. Do you need the lowest token cost on a bounded problem? Goal, for the smaller footprint.
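
If it helps to see the sequence as logic, here is one way to encode it. This is a sketch of our rule of thumb, not a tool that ships anywhere: WorkShape and recommend are hypothetical, and the order in which the questions are checked is our own tie-break.

```typescript
// Hypothetical description of a piece of work, answering the questions above.
interface WorkShape {
  sequential: boolean; // each step depends on the last
  checkableFinishLine: boolean; // "done" can be proven from the work's own output
  independentPieces: boolean; // the work splits into parts that can run side by side
  willRunAgain: boolean; // the process will be repeated
}

type Recommendation = "goal" | "workflow" | "sharpen the condition first";

function recommend(work: WorkShape): Recommendation {
  if (work.sequential && work.checkableFinishLine) return "goal";
  if (work.independentPieces) return "workflow";
  if (!work.checkableFinishLine) return "sharpen the condition first";
  if (work.willRunAgain) return "workflow";
  return "goal"; // bounded problem, smallest token footprint
}

const migration = recommend({
  sequential: true,
  checkableFinishLine: true,
  independentPieces: false,
  willRunAgain: false,
});
```

A module migration with passing tests as its finish line comes out as "goal". Flip independentPieces on and the finish line off, as in a library-wide audit, and the same function returns "workflow".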

And yes, they compose. The advanced pattern is a workflow that manages width, spawning the workers, where a worker uses a goal internally to go deep on its slice. You rarely need that on day one, but it is there when a problem is both wide and deep.

Quick Win. Want to feel the difference in five minutes? Run /deep-research on a real question you have right now. It is the lowest-friction way to watch a workflow fan out, cross-check its sources, and hand back one cited answer, with zero setup. Then set one tightly scoped /goal (a “fix this page until it clears the bar” task) to feel the depth loop.

The same split, lived: what we learned building ElevarusOS

We run an internal AI system called ElevarusOS that researches, writes, fact-checks, and publishes our marketing content and renders our short-form video. Building it taught us the depth-versus-width split the hard way, long before these two features had names, and seeing our own stack through this lens is the clearest way we know to explain when to use which.

Our content engine is a hand-built workflow, and that is the point. To be precise about it, because the distinction matters: our blog pipeline is not running on Claude Code’s dynamic workflows. It is our own deterministic orchestrator, written in TypeScript, with the model called through the API inside the individual stages. It is a fixed sequence: scan the news, research the keyword, outline, draft, fact-check, edit, fact-remediate, generate the image, publish, sync the SEO fields, hand off for review.

We wrote that orchestration ourselves, before dynamic workflows existed as a built-in, because the pattern is too useful to do without. And that is the strongest case we can make for the feature: the shape is so valuable that teams hand-roll it. The plan is not held in a model’s head; it lives in code, so we can read it, test it, version it, and trust that stage nine always runs after stage eight.
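
A stripped-down sketch shows what “the plan lives in code” means in practice. This is not ElevarusOS source: Stage and runPipeline are illustrative names, only three of the stages are shown, and each stub stands in for what would really be a model call through the API.

```typescript
// Illustrative only: a fixed sequence of named stages, run strictly in order.
interface Stage {
  name: string;
  run: (doc: string) => string; // stub for what would be an API call in a real pipeline
}

function runPipeline(stages: Stage[], input: string): { output: string; completed: string[] } {
  const completed: string[] = [];
  let doc = input;
  for (const stage of stages) {
    doc = stage.run(doc); // the order is decided by the array, not by a model
    completed.push(stage.name);
  }
  return { output: doc, completed };
}

const stages: Stage[] = [
  { name: "outline", run: (doc) => `${doc} > outlined` },
  { name: "draft", run: (doc) => `${doc} > drafted` },
  { name: "fact-check", run: (doc) => `${doc} > fact-checked` },
];

const pipelineRun = runPipeline(stages, "keyword brief");
```

The sequence is an array you can read, diff, and test. No model decides whether fact-check runs after draft; the loop does.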

That is exactly the promise dynamic workflows now ship by default: the plan lives in a script, intermediate results live in variables, and the whole thing is repeatable. If we were standing up that orchestration today, a saved workflow is the first thing we would reach for. The lesson transfers either way: when you want a process to be reliable and auditable, move the plan out of the conversation and into code, whether you write that code yourself or save the script Claude writes.

Our editor is a goal-shaped loop. Separately, we run an autonomous “managing editor,” a long-running Claude Code session that reviews every published article, fixes what is safe to fix on the live site, and keeps the corpus moving toward better rankings. This one does run inside Claude Code, and it fans each cycle’s work out to a set of focused subagents. It does not hold a fixed script; it works toward a standing objective, one bounded cycle at a time, and stops when there is nothing left to safely do. That is the goal pattern in spirit: point an agent at a finish line and let it keep working.

But we pace it differently, and the reason is instructive. Our cycles run on a timer, the /loop pattern, not on a single completion condition. /goal stops when a fresh model confirms the condition is met; /loop starts the next pass when an interval elapses and runs until you stop it or the work is done. We chose interval pacing because “the whole site is perfect” is not a condition any model can honestly confirm, so there is no clean finish line to evaluate against.

If your objective has a real finish line, use a goal. If it is a standing duty with no end state, a loop or a schedule fits better. Picking the wrong one of those is a common and quiet mistake.

There is also a hard-won safety note buried in there. We recently let that editor publish a held draft on its own only after wrapping the authority in deterministic guardrails: it may publish only if the content clears a checkable floor (no unverifiable claims, real citations present, valid structured data), and it escalates to a human otherwise. That is the same instinct you want with both Goals and Workflows. The agent gets to be creative and autonomous inside hard, checkable boundaries. The autonomy is the point; the boundary is what makes the autonomy safe to ship.
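
In code, a guardrail like that is deliberately boring. The sketch below is not our production check: DraftReport, decidePublish, and the field names are hypothetical, and producing the report (verifying claims, validating structured data) is assumed to happen elsewhere.

```typescript
// Hypothetical report produced by upstream checks on a held draft.
interface DraftReport {
  unverifiableClaims: number;
  citationCount: number;
  structuredDataValid: boolean;
}

type PublishDecision =
  | { action: "publish" }
  | { action: "escalate"; reasons: string[] };

// Deterministic floor: publish only if every check passes, otherwise hand off to a human.
function decidePublish(report: DraftReport): PublishDecision {
  const reasons: string[] = [];
  if (report.unverifiableClaims > 0) {
    reasons.push(`${report.unverifiableClaims} unverifiable claim(s)`);
  }
  if (report.citationCount === 0) {
    reasons.push("no citations present");
  }
  if (!report.structuredDataValid) {
    reasons.push("structured data is invalid");
  }
  return reasons.length === 0 ? { action: "publish" } : { action: "escalate", reasons };
}

const cleanDraft = decidePublish({ unverifiableClaims: 0, citationCount: 3, structuredDataValid: true });
const riskyDraft = decidePublish({ unverifiableClaims: 2, citationCount: 0, structuredDataValid: true });
```

The agent can be as creative as it likes upstream. Whether the draft goes live is decided by three comparisons that no prompt can talk its way around.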


Marketing examples, concretely

Most of the published examples for these features are about code, because that is where Claude Code started. But the patterns map cleanly onto marketing and content operations. Here is how a team actually applies each. Treat these as templates, not benchmarks; the point is the shape of the work, not a promised number.

Reach for a Goal when there is a finish line

  • Fix until clean. “Repair the FAQ structured data on this post until it validates and the page has exactly one H1, without changing any claim.” One agent, working a single page until a checkable condition holds.
  • Rewrite to a bar. “Revise this landing page until it reads at an 8th-grade level, every statistic has a real cited source, and the primary CTA appears above the fold.” Each criterion is something the agent’s own output can demonstrate, which is exactly what a goal needs.
  • Drain a backlog. “Work through the list of outdated posts until each one has a refreshed publish date, a current statistic, and no broken links, or stop after 30 turns.” A bounded, sequential grind with a clear stop.

The tell, every time, is that you can write the condition. If you can name what “done” looks like in a way the work itself proves, a goal will carry it.

Reach for a Workflow when the work fans out

  • Audit the whole library at once. “Across all 200 published articles, find every post with thin internal linking, a missing meta description, or broken schema, and return a prioritized list.” This is hundreds of independent checks. A goal would crawl them one at a time; a workflow runs them side by side and hands you one report.
  • Research a topic you can trust. Point /deep-research at “the current state of consent and compliance rules for lead-generation calls” and let it fan out across sources, cross-check them against each other, and filter the claims that do not survive. For a regulated space, the cross-checking is the whole value.
  • Generate, then judge. “Draft five distinct ad angles for this offer, then have independent agents score each against our brand voice and our compliance rules, and return the two that survive.” This is the generate-and-filter pattern: the creativity fans out, then an adversarial pass culls it. You get a vetted shortlist instead of one confident guess.
  • Migrate at scale. “Move all 300 product descriptions to the new template and schema, preserving every existing claim.” A single large migration, codified once, runnable again the next time the template changes.
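
The generate-then-judge pattern is worth seeing in miniature, because the filtering step is plain code once the scores exist. In this sketch the scores are hard-coded stand-ins for what independent reviewing agents would return, and ScoredAngle and shortlist are hypothetical names.

```typescript
// Hypothetical result of independent agents scoring one drafted ad angle.
interface ScoredAngle {
  angle: string;
  brandVoice: number; // assumed 0 to 10 score from a reviewing agent
  compliant: boolean; // verdict from a compliance-checking agent
}

// Drops non-compliant angles, then keeps the best-scoring survivors.
function shortlist(angles: ScoredAngle[], keep: number): string[] {
  return angles
    .filter((a) => a.compliant)
    .sort((a, b) => b.brandVoice - a.brandVoice)
    .slice(0, keep)
    .map((a) => a.angle);
}

const finalists = shortlist(
  [
    { angle: "speed", brandVoice: 6, compliant: true },
    { angle: "guarantee", brandVoice: 9, compliant: false },
    { angle: "trust", brandVoice: 8, compliant: true },
    { angle: "price", brandVoice: 4, compliant: true },
    { angle: "expertise", brandVoice: 7, compliant: true },
  ],
  2
);
```

The highest-scoring angle here ("guarantee") never reaches the shortlist because it fails compliance. The two that survive are "trust" and "expertise", in that order.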

Save the ones you will repeat. A monthly content audit, a per-campaign angle generator, a compliance research format, each becomes a / command your whole team runs the same way every time.

A combined picture

A realistic content operation uses both. A workflow audits the entire blog library every month and produces a ranked list of pages that need work. Then, for each page that lands at the top of that list, a goal drives a focused fix: rewrite this one until it clears the bar. Width to find the work, depth to do it. That is the natural division of labor once you stop thinking of them as competitors and start thinking of them as a wide tool and a deep tool.
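
The hand-off between the two can be as simple as turning the audit's ranked output into bounded goal conditions. This is a sketch with made-up data: PageAudit, goalsForTopPages, the /blog/... paths, and the wording of the generated condition are all hypothetical.

```typescript
// Hypothetical audit output: one entry per page with the issues a workflow found.
interface PageAudit {
  url: string;
  issues: string[];
}

// Ranks pages by issue count and writes one bounded goal condition per top page.
function goalsForTopPages(audit: PageAudit[], topN: number, maxTurns: number): string[] {
  return audit
    .filter((page) => page.issues.length > 0)
    .sort((a, b) => b.issues.length - a.issues.length)
    .slice(0, topN)
    .map(
      (page) =>
        `/goal on ${page.url}, fix every listed issue (${page.issues.join("; ")}) ` +
        `and show the re-check output, or stop after ${maxTurns} turns`
    );
}

const goals = goalsForTopPages(
  [
    { url: "/blog/post-a", issues: ["missing meta description"] },
    { url: "/blog/post-b", issues: ["broken schema", "thin internal linking"] },
    { url: "/blog/post-c", issues: [] },
  ],
  1,
  10
);
```

Each generated line names an end state, asks for proof in the transcript, and carries a turn bound, so every goal the audit spawns stays checkable and finite.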


Common mistakes

  • Using a goal for a wide problem. If the task is “check 150 things,” a goal will plod through them in sequence while a workflow would have finished in parallel. Width wants a workflow.
  • Using a workflow for a one-line job. Spinning up a multi-agent run to fix a single file is more tokens and more ceremony than the job needs. If one agent can do it, let one agent do it.
  • Writing a condition the evaluator cannot see. “It works” is not provable from the transcript. Tie the condition to something Claude’s own output demonstrates.
  • Confusing “keep running” tools. /goal stops on a condition. /loop runs on an interval. A Stop hook runs your own check. Auto mode just removes per-tool prompts within a turn. They solve different problems, and reaching for the wrong one is the most common confusion of the bunch.
  • Running a workflow wide before testing it narrow. The cost is real. Run it on one directory or one narrow question first, read the per-agent spend, then decide.

FAQ

Is a Goal the same thing as a Workflow with one agent?

No. A goal is one agent working in your conversation, iterating until a separate model confirms a condition. A workflow is a script that runs agents in an isolated runtime and keeps their results out of your context. Different execution model, different cost profile, different strengths.

Do Goals and Workflows work in headless or scheduled runs?

Goals do, directly: claude -p "/goal CONDITION" runs the whole loop to completion in one invocation, which is how you put a goal in a script or a nightly job. Workflows also run under claude -p and the Agent SDK, where there is no one to prompt, so tool calls follow your configured permission rules without interactive confirmation.

How does a Goal decide it is finished?

After each turn, the condition and the conversation are sent to a small fast model (Haiku by default). It returns yes or no plus a short reason. A “no” feeds the reason back as guidance for the next turn; a “yes” clears the goal. The evaluator judges only what is already in the transcript and does not run tools itself, which is why your condition has to be provable from Claude’s own output.

What does a Workflow cost compared to just asking Claude?

More, often meaningfully more, because it spawns many agents and they all consume tokens. Parallelism compresses wall-clock time, not total token spend. Test on a small slice first, watch the per-agent usage in /workflows, and route stages that do not need the strongest model to a cheaper one.

Can I read and edit the script Claude writes for a Workflow?

Yes. Every run writes its script to a file under your session directory, and you can read it, diff it against a previous run, edit it, and ask Claude to relaunch from your edited version. The orchestration is not a black box; it is code you own.

Which should a marketing team learn first?

Start with /deep-research to feel what a workflow does with zero setup, and start with one tightly scoped /goal (a “fix this page until it clears the bar” task) to feel the depth loop. From those two, the decision of which to reach for next becomes obvious: name a finish line and you want a goal; fan out across many pieces and you want a workflow.


The takeaway

Goals and Workflows are not rivals; they are a depth tool and a width tool. A goal points one agent at a finish line and trusts a separate model to call it done. A workflow writes the plan into a script and runs an army of agents against it, then hands you one trustworthy answer. The skill is not picking a favorite. It is reading the shape of the work (sequential with a finish line, or wide with independent parts) and reaching for the tool that matches. Build that instinct, wrap both in checkable guardrails, and you get the thing every operator actually wants from AI: more leverage, without losing the plot.




SHANE MCINTYRE

Founder & Executive with a Background in Marketing and Technology | Director of Growth Marketing.