prompt engineering automation
i use this for consultancy — any client prompt engineering task runs through this pipeline before delivery.
the problem
prompt engineering automation turns a plain-english client brief into a finished prompt deliverable by running it through a chain of four ai stages, with a human review gate at each one. the problem it solves: doing prompt engineering well (analysing the task, writing a spec, drafting and validating the prompts) takes hours per client, and done by hand there is no way to apply that senior-level process consistently across many clients without quality drifting.
the approach
- 01 planner
the planner turns the client brief into a structured specification.
- it extracts audience, tone, constraints, success criteria and edge cases from a plain-english task.
- you review the spec on the dashboard and can add feedback before anything moves forward.
- 02 system prompt generator
the second stage builds the production system prompt.
- it writes a full instruction set for the ai model from the specification.
- your stage-one feedback is injected into its input, so the review actually changes the output.
- 03 task prompt generator
the third stage writes the end-user-facing prompt.
- it produces the task prompt or prompt chain, built from the spec and the system prompt.
- with this, the three core artefacts — spec, system prompt, task prompt — are complete.
- 04 validator
the validator is a formal quality gate.
- it checks every artefact against prompt-engineering principles and returns a verdict — ready to ship, or needs revision with specific notes.
- approved runs are saved to history; a run only leaves the pipeline once it passes.
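as a rough illustration of the verdict the validator returns, here is a minimal sketch. it assumes the verdict is a pass/fail flag plus a list of revision notes, and it uses simple rule-based checks as a stand-in for the model's judgement. the names `Verdict`, `non_empty`, `validate` and `CHECKS` are hypothetical, not the pipeline's real identifiers.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

Artefacts = dict[str, str]
# A check returns a revision note, or None when the artefacts satisfy it.
Check = Callable[[Artefacts], Optional[str]]


@dataclass
class Verdict:
    ready_to_ship: bool
    notes: list[str] = field(default_factory=list)


def non_empty(name: str) -> Check:
    """Hypothetical check: the named artefact must contain some text."""
    def check(artefacts: Artefacts) -> Optional[str]:
        if not artefacts.get(name, "").strip():
            return f"{name} is missing or empty"
        return None
    return check


def validate(artefacts: Artefacts, checks: list[Check]) -> Verdict:
    """Run every check; the run is ready only if no check leaves a note."""
    notes: list[str] = []
    for check in checks:
        note = check(artefacts)
        if note is not None:
            notes.append(note)
    return Verdict(ready_to_ship=not notes, notes=notes)


# Assumed rule set, standing in for the real prompt-engineering principles.
CHECKS = [non_empty("spec"), non_empty("system_prompt"), non_empty("task_prompt")]
```

a run with an empty task prompt would come back with `ready_to_ship=False` and one note naming the missing artefact, which is the "needs revision with specific notes" case.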
engineering challenges
- state across an async pipeline
the pipeline isn't one call — it's four ai stages with a human pause between each.
- a run can sit for hours while the user reviews a stage, then resume.
- nothing can be held in memory across that gap, yet each stage needs every prior artefact when it resumes.
every run is persisted, so the pipeline resumes from stored state instead of depending on a live process.
- each run is a row in supabase holding the brief, the spec, both prompts and the validation report.
- a stage resumes by reading the row — the human pause costs nothing, because there's no live process to keep alive.
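the resume-from-a-row idea can be sketched as follows. this is a sketch under assumptions: an in-memory dict stands in for the supabase table, and the column names, stage names and helper functions are all hypothetical. the point it shows is that the next stage is derived from what the row already holds, so nothing has to stay in memory during the pause.

```python
from typing import Optional

# Stage order and the column each stage fills in (illustrative names).
STAGES = [
    ("planner", "spec"),
    ("system_prompt_generator", "system_prompt"),
    ("task_prompt_generator", "task_prompt"),
    ("validator", "validation_report"),
]

# Stand-in for the persisted table: run id -> row.
RUNS: dict[str, dict] = {}


def create_run(run_id: str, brief: str) -> None:
    """Persist a new run with the brief and empty artefact columns."""
    row = {"brief": brief}
    for _, column in STAGES:
        row[column] = None
    RUNS[run_id] = row


def save_artefact(run_id: str, column: str, value: str) -> None:
    """Write one stage's output back to the run's row."""
    RUNS[run_id][column] = value


def next_stage(row: dict) -> Optional[str]:
    """The first stage whose column is still empty, or None when done."""
    for stage, column in STAGES:
        if row[column] is None:
            return stage
    return None


def resume(run_id: str) -> tuple[Optional[str], dict]:
    """Reload the row after a pause; it carries every prior artefact."""
    row = RUNS[run_id]
    return next_stage(row), row
```

for example, after `create_run("r1", "brief")` and `save_artefact("r1", "spec", "...")`, calling `resume("r1")` returns the system prompt stage together with a row that already contains the brief and the spec.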
- making review actually change the output
a review gate is theatre if the feedback never reaches the model.
- letting a user approve a stage but feeding the next stage the un-amended output wastes the review entirely.
feedback from each stage is injected into the next stage's input.
- stage-one notes shape the system prompt, stage-two notes shape the task prompt, and so on down the chain.
- the human isn't rubber-stamping — their correction becomes part of what the next model sees.
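one way the injection could look, as a minimal sketch: the next stage's input is assembled from the prior artefacts plus the reviewer's notes. the function name `build_stage_input` and the section layout are assumptions made for illustration, not the pipeline's actual prompt format.

```python
def build_stage_input(prior_artefacts: dict[str, str], reviewer_feedback: str = "") -> str:
    """Assemble the text the next stage's model sees.

    Prior artefacts come first; reviewer notes, if any, are appended as
    their own section so the model treats them as corrections to apply.
    """
    sections = [f"## {name}\n{text}" for name, text in prior_artefacts.items()]
    feedback = reviewer_feedback.strip()
    if feedback:
        sections.append(f"## reviewer feedback\n{feedback}")
    return "\n\n".join(sections)


stage_two_input = build_stage_input(
    {"spec": "audience: new users; tone: friendly"},
    reviewer_feedback="tone should be more formal for enterprise clients",
)
```

with no feedback the section is simply omitted, so an approval without notes passes the artefacts through unchanged.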
- consistency across clients
the whole point is senior-level prompt engineering applied the same way every time.
- done by hand, quality drifts with attention, mood and time pressure.
the process is fixed in the pipeline, not in a person's head.
- every client brief runs the same four stages, against the same principles.
- the validator enforces a floor — a run can't be delivered until it passes the same formal check.
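the floor described above could be enforced with a delivery gate along these lines. this is a sketch, assuming each run records which stages it completed and the validator's verdict string; `PIPELINE`, `DeliveryBlocked`, `deliver` and the field names are hypothetical.

```python
# The same four stages, in the same order, for every client brief.
PIPELINE = ("planner", "system_prompt_generator", "task_prompt_generator", "validator")


class DeliveryBlocked(Exception):
    """Raised when a run tries to leave the pipeline early."""


def deliver(run: dict) -> dict:
    """Return the three core artefacts, but only for a run that passed."""
    if tuple(run.get("completed_stages", ())) != PIPELINE:
        raise DeliveryBlocked("run skipped or reordered a stage")
    if run.get("verdict") != "ready to ship":
        raise DeliveryBlocked("validator has not passed this run")
    return {name: run[name] for name in ("spec", "system_prompt", "task_prompt")}
```

because the stage tuple is a module-level constant and not a per-client setting, there is no place for a one-off shortcut to creep in.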
the workflow