Guiding the agent
When a flow deviates, Reflow’s agent decides what to do from the context you give it. The more intent it has, the better its judgment. There are four layers, from most to least specific.
1. Steps
The numbered lines under ## Steps are ground truth for what the user
does. They’re plain language; Reflow compiles them to Playwright and caches
the compilation. When a compiled action stops matching the page, that’s a
deviation, not a verdict — the agent’s first question is “did the app
change or did the app break?”, answered against your step’s wording.
Write steps as observable actions and outcomes:
    ## Steps
    1. Add the first product on the page to the cart
    2. Check out with the standard test card
    3. The confirmation page shows an order number and the correct total

2. Flow prose
Everything before ## Steps is intent. Write it like you’d brief a QA
contractor:

    A signed-in user buys a single item. Discount codes are out of
    scope for this flow.

The agent uses prose to distinguish a redesigned checkout (adapt and carry on) from a checkout that no longer completes (fail: the product is broken).
3. Expectations — ## Always true
Assertions about important state, checked throughout the run, each verified semantically against the live page:

    ## Always true
    - the cart badge shows the running item count
    - prices never render as NaN or $0.00
    - no error banners appear at any point

Use them for what matters and what code asserts poorly: data correctness, layout sanity, “nothing looks broken”. When the app intentionally changes, expectation edits come back to your PR as a plain-language diff for approval.
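To make the split between the three per-flow layers concrete, here is a minimal TypeScript sketch that separates a flow file into prose, steps, and expectations. It is an illustration only, not Reflow's parser: `FlowLayers` and `splitFlow` are hypothetical names, and the sketch assumes that ## Always true lives in the same file as ## Steps and comes after it.

```typescript
// Hypothetical shape for the per-flow layers; not a Reflow type.
interface FlowLayers {
  prose: string;        // everything before "## Steps"
  steps: string[];      // numbered lines under "## Steps"
  alwaysTrue: string[]; // bullets under "## Always true"
}

type Section = "prose" | "steps" | "alwaysTrue" | "other";

// Split a flow file into its layers by scanning it line by line.
function splitFlow(markdown: string): FlowLayers {
  const proseLines: string[] = [];
  const steps: string[] = [];
  const alwaysTrue: string[] = [];
  let section: Section = "prose";

  for (const raw of markdown.split(/\r?\n/)) {
    const line = raw.trim();
    const title = line.startsWith("## ")
      ? line.slice(3).trim().toLowerCase()
      : null;

    if (title === "steps") { section = "steps"; continue; }
    if (title === "always true") { section = "alwaysTrue"; continue; }
    // Assumption: any other heading after "## Steps" ends the current section.
    if (title !== null && section !== "prose") { section = "other"; continue; }

    if (section === "prose") {
      proseLines.push(line);
    } else if (section === "steps" && /^\d+\.\s+/.test(line)) {
      steps.push(line.replace(/^\d+\.\s+/, ""));
    } else if (section === "alwaysTrue" && /^-\s+/.test(line)) {
      alwaysTrue.push(line.replace(/^-\s+/, ""));
    }
  }

  return { prose: proseLines.join("\n").trim(), steps, alwaysTrue };
}
```

Everything ahead of ## Steps, including any other headings, ends up in `prose`, which mirrors the rule that all text before the steps is treated as intent.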
4. Context packs
Repo-level context in .reflow/context/*.md is loaded into every run on that
repo. Good candidates:
- Design style guide — spacing rules, color palette, typography. Lets the agent flag “this modal ignores the design system” during visual checks.
- Domain glossary — what a “workspace”, “seat”, or “order” means in your product.
- Known quirks — “the staging payment provider takes ~10s on first call”.
    .reflow/
      context/
        design-system.md
        glossary.md
      flows/
        checkout-happy-path.md

Context packs are plain markdown. Budget note: they consume input tokens on every model call in the run; keep them tight, link out for detail the agent doesn’t need.
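The budget note comes down to simple multiplication, sketched below in TypeScript. This is a back-of-the-envelope estimate, not how Reflow meters usage: `estimateTokens` and `contextPackCost` are hypothetical helpers, and the figure of roughly 4 characters per token is a common rule of thumb for English prose that real tokenizers will not match exactly.

```typescript
// Assumption: about 4 characters per token for English prose.
// Real tokenizers differ, so treat results as order-of-magnitude only.
const CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Context packs are sent on every model call, so their total cost
// scales with the number of calls in the run.
function contextPackCost(packs: string[], modelCalls: number): number {
  const perCall = packs.reduce((sum, pack) => sum + estimateTokens(pack), 0);
  return perCall * modelCalls;
}
```

As an illustration with made-up numbers, two packs of 8,000 and 2,000 characters across 30 model calls come to about 75,000 input tokens under this heuristic, which is why trimming a pack pays off on every call.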
Precedence
Steps > flow prose > expectations > context packs. When sources conflict, the more specific one wins, and the run report says which context drove each decision.
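The precedence rule can be pictured as a sort over layers. The TypeScript below is a minimal sketch of that idea, not Reflow's implementation: `Guidance` and `resolve` are hypothetical names, and only the ordering of the four layers comes from the rule above.

```typescript
// Layers in precedence order, most specific first.
const PRECEDENCE = ["steps", "flow prose", "expectations", "context packs"] as const;
type Layer = (typeof PRECEDENCE)[number];

// Hypothetical record of what one layer says about a question.
interface Guidance {
  layer: Layer;
  says: string;
}

// Return the guidance from the most specific layer, or undefined if none apply.
function resolve(candidates: Guidance[]): Guidance | undefined {
  return [...candidates].sort(
    (a, b) => PRECEDENCE.indexOf(a.layer) - PRECEDENCE.indexOf(b.layer),
  )[0];
}
```

Returning the whole `Guidance` record rather than just its text keeps the winning layer visible, in the same spirit as the run report naming which context drove a decision.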