EMPIRICAL STUDY • 32,400 ARCHITECTURAL DECISIONS ANALYZED

Vibe Coding Kills Factory Software

AI builds factory software that looks production-ready. It isn't. Karpathy-style prompts make it safe to ship.

GUARDRAIL PROMPTS

Guardrail Prompt Catalog

Here's an explanation of each prompt.

Prompts Explanation Catalog v3

This catalog explains each of the 10 guardrail prompts in plain, jargon-free language.
Each explanation tells you:

  • What the prompt is for
  • Why it matters
  • How it works

These explanations are for humans. The actual prompt files are short and technical so they can be copied straight into an LLM.


Guardrail 01 — Never Accept a Proposed Software Architecture That Skips Formal Safety Analysis

This prompt checks whether the AI recipe includes real safety engineering.
Many AI recipes talk about safety but skip the actual work — the formal safety methods called HAZOP (Hazard and Operability study), LOPA (Layer of Protection Analysis), and SIL (Safety Integrity Level). This prompt forces the reviewer to do the full safety analysis, map every hazard to the requirements, and prove that safety features are actually coded in real files. It refuses the recipe if the safety work is missing or fake. Its job is to stop dangerous designs from reaching the factory floor.


Guardrail 02 — Require Heavy Five-Axis Adversarial Review for Every LLM-Generated Recipe

This prompt makes sure every recipe is carefully checked before anyone trusts it.
It looks for five specific problems (the five axes):

  1. Cybersecurity-floor downgrade (weakening required security levels)
  2. Hallucinated vendor/part/standard/clause (made-up components or rules)
  3. Inappropriate technology for operator skill or facility size (wrong tools for the workers or factory)
  4. Over-engineering vs. operator FTE budget (too complicated for the number of people available)
  5. Personality drift (the model showing its usual biases)

The reviewer must find evidence or clearly say “no defect found” for each problem. Small or open-source models get extra attention because they often make more mistakes. This is a heavy review that applies to every recipe.
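The "evidence or explicit no-defect verdict for every axis" rule can be checked mechanically. Here is a minimal sketch, assuming the review arrives as a dictionary keyed by axis. The axis keys, the `evidence` and `verdict` field names, and the sample finding are all hypothetical; the catalog names the axes but does not define a data format.

```python
# Hypothetical keys for the five axes; the catalog does not define these identifiers.
AXES = [
    "cybersecurity_floor_downgrade",
    "hallucinated_reference",
    "inappropriate_technology",
    "over_engineering",
    "personality_drift",
]

NO_DEFECT = "no defect found"


def unresolved_axes(review: dict) -> list:
    """Return axes with neither cited evidence nor an explicit 'no defect found'."""
    missing = []
    for axis in AXES:
        entry = review.get(axis, {})
        has_evidence = bool(entry.get("evidence"))
        cleared = entry.get("verdict", "").strip().lower() == NO_DEFECT
        if not (has_evidence or cleared):
            missing.append(axis)
    return missing


# Made-up review: two axes handled properly, one vague verdict, two axes skipped.
review = {
    "cybersecurity_floor_downgrade": {"evidence": ["TLS requirement relaxed in section 4"]},
    "hallucinated_reference": {"verdict": "No defect found"},
    "inappropriate_technology": {"verdict": "looks fine"},
}
print(unresolved_axes(review))
# ['inappropriate_technology', 'over_engineering', 'personality_drift']
```

A vague verdict such as "looks fine" counts as unresolved here, which is the point: the reviewer either cites evidence or commits to the exact no-defect statement.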


Guardrail 03 — Never Permit Cloud-Component Creep

This prompt checks that the recipe really works when the internet is down.
Many AI recipes say they are offline-capable but secretly depend on cloud services for login, updates, or data. This prompt forces the reviewer to find every hidden cloud connection and prove that every important function still works during an outage. It refuses any design that breaks when the factory loses internet.


Guardrail 04 — Treat PostgreSQL Consensus as a Hypothesis, Not Gospel

This prompt stops the team from blindly copying the most popular database choice.
Many AI recipes automatically pick PostgreSQL with TimescaleDB because it appears everywhere. This prompt treats that choice as a guess that still needs proof. The reviewer must compare it to at least two other databases and explain why it is the best fit for this specific factory use case and its Product Requirements Document (PRD).


Guardrail 05 — Mandate Edge-First, On-Premises Simplicity Unless Proven Otherwise

This prompt pushes for the simplest possible design that runs on the factory floor.
It requires the recipe to use local computers (edge devices) and on-premises servers first. Any cloud or complicated central system must be proven necessary. The goal is to keep the system simple, reliable, and independent of the internet.


Guardrail 06 — Standardize Identity, Encryption, and Network Segmentation

This prompt makes sure the recipe uses the same secure methods for who can log in, how data is protected, and how the network is divided.
It enforces standard rules so there are no weak spots or different security methods scattered around. The reviewer checks that identity, encryption, and network rules meet minimum industrial standards and are consistent everywhere.


Guardrail 07 — Require a Truly Offline-Capable Frontend

This prompt checks that the screens and user interface work completely without the internet.
Many AI recipes create nice-looking screens that secretly need cloud services for fonts, updates, or data. This prompt forces the reviewer to prove the entire user interface runs locally with no external dependencies.
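One way to start hunting for hidden dependencies is to scan the frontend's text assets for absolute URLs. The sketch below is illustrative only: the function name, the list of hosts treated as local, and the sample markup are assumptions. It only catches `http://` and `https://` references, so protocol-relative URLs and dependencies loaded at runtime still need a real outage test.

```python
import re
from urllib.parse import urlparse

# Assumption: these hosts count as local. Adjust for your plant network.
LOCAL_HOSTS = {"localhost", "127.0.0.1"}

# Matches absolute http(s) URLs up to the next quote, bracket, or whitespace.
URL_PATTERN = re.compile(r"""https?://[^\s"'<>()]+""")


def external_hosts(asset_text: str) -> list:
    """Return the distinct non-local hostnames referenced in an HTML/CSS/JS asset."""
    hosts = []
    for url in URL_PATTERN.findall(asset_text):
        host = urlparse(url).hostname
        if host and host not in LOCAL_HOSTS and host not in hosts:
            hosts.append(host)
    return hosts


# Made-up markup: one cloud-hosted font, one relative script, one local image.
sample = """
<link href="https://fonts.example.com/css?family=Inter" rel="stylesheet">
<script src="/static/app.js"></script>
<img src="http://localhost:8080/logo.png">
"""
print(external_hosts(sample))
# ['fonts.example.com']
```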


Guardrail 08 — Independent Red-Team Review

This prompt requires an independent reviewer from a different company or AI family to attack the recipe and show how its architectural weaknesses could be exploited to harm the use case.
It stops the same team or same AI family from reviewing their own work (they often miss problems). The independent reviewer must check nine different attack angles:

  1. safety
  2. cybersecurity
  3. offline capability
  4. complexity
  5. database
  6. frontend
  7. PRD traceability
  8. hallucination
  9. regulatory traceability

The reviewer produces clear findings for each angle.


Guardrail 09 — Account for Stable Model Personality

This prompt studies how a particular AI model behaves when it creates recipes.
Every AI has its own “personality” — some always push cloud services, some skip safety steps, some add too much complexity. This prompt runs the same requirements through the same model several times and creates a personality profile so future recipes can be adjusted for that model’s weaknesses.
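The profiling idea reduces to a tally: label what each run did, then keep the tendencies that recur. This is a rough sketch under stated assumptions. The tag names, the 0.6 threshold, and the function name are hypothetical, and the labelling of each run is presumed to come from a human or reviewer pass that this code does not perform.

```python
from collections import Counter


def personality_profile(runs: list, threshold: float = 0.6) -> dict:
    """Map each tendency tag to the share of runs showing it, keeping recurring ones.

    `runs` is a list of tag sets, one per generation of the same requirements.
    The 0.6 threshold is an arbitrary illustrative cut-off.
    """
    if not runs:
        return {}
    counts = Counter(tag for run in runs for tag in set(run))
    total = len(runs)
    return {
        tag: counts[tag] / total
        for tag in sorted(counts)
        if counts[tag] / total >= threshold
    }


# Made-up labels for three runs of one model on the same requirements.
runs = [
    {"cloud_push", "over_engineering"},
    {"cloud_push"},
    {"cloud_push", "skipped_safety"},
]
print(personality_profile(runs))
# {'cloud_push': 1.0}
```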


Guardrail 10 — Never Trust Self-Assessed Scores

This is the final Karpathy-style prompt to evaluate and grade the architectural recommendations.
It removes every confidence score, safety score, or practicality score the AI gave itself and replaces it with real evidence and human sign-off. It checks that all nine previous guardrails passed and that qualified people (not just the AI) approved each important gate. It stops the team from trusting the AI’s own “I did a great job” claims.
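Removing self-assigned scores from a structured recipe takes only a few lines. The sketch below assumes that score fields can be recognized by their key suffix. That naming convention and the sample recipe are hypothetical, so adapt the suffix list to whatever schema you actually use.

```python
# Assumption: self-assessed fields end with one of these suffixes.
SELF_SCORE_SUFFIXES = ("_score", "_confidence")


def strip_self_scores(node):
    """Return a copy of a parsed JSON value with self-assessed score keys removed."""
    if isinstance(node, dict):
        return {
            key: strip_self_scores(value)
            for key, value in node.items()
            if not key.endswith(SELF_SCORE_SUFFIXES)
        }
    if isinstance(node, list):
        return [strip_self_scores(item) for item in node]
    return node


# Made-up recipe fragment with two self-assigned scores.
recipe = {
    "safety": {"safety_score": 6, "hazard_analysis": None},
    "components": [{"name": "historian", "practicality_score": 9}],
}
print(strip_self_scores(recipe))
# {'safety': {'hazard_analysis': None}, 'components': [{'name': 'historian'}]}
```

What remains after stripping is only the material a human can verify, such as the empty `hazard_analysis` field in this example.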


THE RESEARCH

Using the Research

Here's how to put the research to work.

How to use the research

Most teams read a paper like this, nod, and change nothing. This one is built to be used. Everything below turns the 216-recipe study into things you can do on your next AI-assisted design. One idea runs through all of it: AI proposes, independent review attacks, formal gates verify, and accountable humans decide.

It works for any high-consequence use case

The evidence came from factories, but the method has nothing to do with conveyors or chemical tanks. The same protocols — force real trade-offs, demand evidence over confident prose, attack the design with an independent reviewer, and gate it behind a human sign-off — apply anywhere a plausible-looking mistake can harm people, money, or trust.

If your work carries real consequences, this research is for you:

  • Healthcare — clinical safety, patient privacy, audit trails, device validation.
  • Finance — fraud controls, model-risk governance, transaction integrity, reporting.
  • Aerospace — certification evidence, redundancy, traceability, failure-mode analysis.
  • Energy and utilities — grid reliability, safety interlocks, cyber-physical attack paths.
  • Defense — mission assurance, secure supply chain, classified-data boundaries.
  • Logistics and public sector — continuity, data integrity, records, accountability.

Build your own short list of non-negotiable controls before you prompt — then hold the AI to it.

Start where you sit

You don't need to read all 42 pages first.

Everyone starts here: clean your PRD. Before any design work, have an independent LLM review your Product Requirements Document (PRD) and strip out vendor names, leading language, and any architectural choices you've quietly baked in. A biased PRD pre-decides the answer before the AI ever runs; a neutral one forces the AI to solve the real problem.

Then take your role's first step:

  • AI builder or architect — run your next AI-generated design through the 10 guardrail prompts before you ship it.
  • Safety or cybersecurity engineer — use the 13 acceptance gates as your sign-off checklist.
  • Auditor or regulator — run the paper's Standards Traceability and Regulatory Citation prompts to force every compliance claim back to a cited standard, a mapped control, and real evidence.
  • Executive or sponsor — adopt the paper's 8-stage AI-assisted architecture workflow as your governance gate, backed by a decision ledger that ties every accepted recommendation to an accountable human owner.

The core move: make the AI prove its work

Don't use AI to skip architecture. Use it to produce a fast first draft — then attack that draft until it earns trust. The study's eight-stage workflow does exactly that:

  1. Clean PRD — describe the problem with no vendor names or solution hints.
  2. Forcing functions — make the model choose (edge vs. cloud, simple vs. complex) and defend it.
  3. Deterministic generation — temperature 0, an anti-pattern blocklist, a strict output schema.
  4. Structured output — machine-readable fields, so omissions become visible.
  5. Independent red-team — a different model family attacks the design.
  6. Formal gates — safety, cybersecurity, offline, and operations checks must pass.
  7. Human approval — named, accountable experts sign off.
  8. Governed deployment — testing, rollback plans, incident drills, and a decision ledger.

Forcing functions are what make this method work. A forcing function is a requirement the model can't satisfy with bland "secure, scalable, reliable" boilerplate — it has to choose between real alternatives (edge vs. cloud, modular monolith vs. microservices, formal safety analysis vs. a self-assigned safety score) and defend the trade-off, which is what turns a generic recipe into a thoughtful, committed architecture. To design forcing functions for your own use case, see Section 8, "Architectural Trade-Off Forcing Functions," for worked examples, then use the Trade-Off Forcing Prompt (Section 16.2) to generate your own.

Determinism comes from a hardened control prompt. Every generation in the study was prefixed with Block A v3, a Karpathy-hardened control prompt that pinned temperature to zero, banned fabricated values and buzzwords, and blocked known anti-patterns (Section 10). Removing randomness and "creativity" is what makes outputs comparable run-to-run — though the paper is candid that this reduced failures without eliminating them, which is exactly why the later review stages exist.
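The two mechanical parts of that control, a pinned temperature and a phrase blocklist, can be sketched as follows. This is not Block A v3 itself. The settings dictionary, the blocklist entries, and the function name are illustrative assumptions, and how the settings are passed to a model depends on your provider's API.

```python
# The study pinned temperature to zero; any other sampling settings are up to you.
GENERATION_SETTINGS = {"temperature": 0}

# Hypothetical blocklist; the paper's actual list is in its Section 10.
BLOCKLIST = ["secure by design", "infinitely scalable", "best-in-class"]


def blocklist_hits(text: str, blocklist=BLOCKLIST) -> list:
    """Return the blocklisted phrases that appear in the text, ignoring case."""
    lowered = text.lower()
    return [phrase for phrase in blocklist if phrase in lowered]


draft = "The platform is secure by design and Infinitely Scalable."
print(blocklist_hits(draft))
# ['secure by design', 'infinitely scalable']
```

A post-generation check like this is a backstop: the prompt bans the phrases, and the check confirms the ban held.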

Structured output turns prose into gradable data. The study required every recipe to be valid JavaScript Object Notation (JSON) matching a fixed schema — roughly 150 decision fields across 23 sections (Section 9). Machine-readable fields make grading systematic and omissions impossible to hide: a recipe could claim a safety score of 6 while the same JSON showed no formal hazard analysis was ever performed.
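The score-versus-evidence contradiction described above is easy to detect once the recipe is JSON. Here is a minimal sketch, assuming a `safety` section with a self-assessed score and one entry per formal method. These field names are hypothetical and are not the study's actual schema.

```python
import json


def contradictions(recipe_json: str) -> list:
    """Flag a claimed safety score that has no recorded formal hazard analysis."""
    recipe = json.loads(recipe_json)
    safety = recipe.get("safety", {})
    score = safety.get("self_assessed_safety_score")
    performed = [
        method
        for method in ("hazop", "lopa", "sil")
        if (safety.get(method) or {}).get("performed")
    ]
    findings = []
    if score is not None and not performed:
        findings.append(
            f"safety score {score} claimed but no HAZOP, LOPA, or SIL analysis recorded"
        )
    return findings


# Made-up recipe fragment mirroring the example in the text.
raw = '{"safety": {"self_assessed_safety_score": 6, "hazop": {"performed": false}}}'
print(contradictions(raw))
# ['safety score 6 claimed but no HAZOP, LOPA, or SIL analysis recorded']
```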

The red-team's job is to break the design, not bless it. A reviewer from a different model family runs the paper's Independent Red-Team Prompt (Section 16.5; a fuller audit version is in Appendix C) to exploit the recipe, hunt for failure points, and harden the architecture — because the model that wrote the design will never attack its own assumptions. Cross-model review caught defects the generators missed, from cloud creep to missing safety analysis.

Put it to work this week

Small, concrete steps beat good intentions:

  • Grab the 10 guardrail prompts (the other button on this page) and run them on your next design — each one is a countermeasure to a failure the study actually measured (Section 13).
  • Walk the 13 acceptance gates before anyone signs off — every gate names both its pass condition and the accountable human who owns it (Section 14).
  • Profile your model before you trust it. Each model showed a stable, repeatable bias — GPT was lock-in-paranoid, Qwen a baroque over-engineer with high hallucination risk (Section 12). Run the same PRD three to five times, then classify it with the four-question personality probe and the Guardrail 9 classification prompt.
  • Write one clean-room PRD. A biased PRD manufactures biased results, so the study ran a three-model clean-room process — one model drafts, a second optimizes, and an independent auditor strips every vendor name and leading hint (Section 7).
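For the acceptance-gate step, a small record per gate keeps the pass condition, the owner, and the evidence together. The sketch below is one possible shape, not the paper's ledger format. The gate names, field names, and file names are hypothetical, since the 13 gates themselves are defined in the paper's Section 14.

```python
from dataclasses import dataclass


@dataclass
class Gate:
    name: str
    pass_condition: str
    owner: str = ""      # accountable human, not a model
    passed: bool = False
    evidence: str = ""   # pointer to the artifact that shows the gate passed


def blocking_gates(gates: list) -> list:
    """Return names of gates that lack a pass result, a named owner, or evidence."""
    return [
        gate.name
        for gate in gates
        if not (gate.passed and gate.owner.strip() and gate.evidence.strip())
    ]


# Made-up gates: the second passed its test but nobody owns it.
gates = [
    Gate("formal_safety_analysis", "HAZOP, LOPA, and SIL records attached",
         owner="Safety engineer", passed=True, evidence="hazop-report.pdf"),
    Gate("offline_operation", "Critical workflows run during a simulated outage",
         owner="", passed=True, evidence="outage-test-log.txt"),
]
print(blocking_gates(gates))
# ['offline_operation']
```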

Ask sharper questions

The single most useful habit in the paper is adversarial specificity (Section 15): vague questions get plausible answers, while specific ones force the gaps into the open.

  • Instead of "Is this safe?" → "List every hazard, its required Safety Integrity Level (SIL) rating, and the compensating controls."
  • Instead of "Is this secure?" → "Map each International Electrotechnical Commission (IEC) 62443 control to evidence in the design."
  • Instead of "Will it work offline?" → "Simulate a complete network outage and show every critical workflow still working."

You don't have to write these from scratch — the paper's cross-industry prompt library (Section 16) hands you eleven ready-to-paste prompts, from worst-case-scenario simulation to standards traceability.

The one thing to remember

Plausibility is not readiness. Ignore the AI's self-assessed scores. No matter how expert the output sounds, a qualified human owns every high-consequence decision.

4 Troubling Results

0 / 216
Safety Omissions
Zero recipes included the formal safety engineering (HAZOP, LOPA, and SIL) that the specification explicitly required.

79%
Wrong Architecture
Recipes introduced cloud dependencies 79% of the time, despite explicit offline requirements.

89%
Cybersecurity Failure
One model weakened required cybersecurity standards in 89% of its outputs.

73%
Bad Hallucinations
Auditors found hallucinations in the output of every LLM; the worst-performing model produced 73% of them.

In a factory, steel mill, hospital, or any other high-consequence setting, each of these four gaps is a serious problem. The research shows where they appear and how to fix them.

12 PRDs × 6 LLMs × 3 runs • 32,400 architectural decisions • 213 red-team reviews
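The headline numbers follow from simple arithmetic. The quick check below uses the page's own figures. The 150 fields per recipe comes from the paper's "roughly 150 decision fields" description of its schema, so the decision count is approximate. The 171 is the count of recipes with cloud dependencies reported under Guardrail 3.

```python
prds, models, runs_per_pair = 12, 6, 3

recipes = prds * models * runs_per_pair
print(recipes)  # 216

# "Roughly 150 decision fields" per recipe, so this total is approximate.
fields_per_recipe = 150
decisions = recipes * fields_per_recipe
print(decisions)  # 32400

# 171 of 216 recipes added cloud dependencies.
cloud_share = round(100 * 171 / recipes)
print(cloud_share)  # 79
```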

Who this is for

The doctrine was built for software whose failures have real-world consequences. It will be most useful if you recognize yourself below.

FOR YOU IF
  • You ship code that ends up running factories, refineries, hospitals, plants, or power grids.
  • Your architecture has to pass a safety review — not just a code review.
  • You use AI assistants and worry, correctly, about what they're quietly skipping.
  • You're a cybersecurity lead, safety engineer, auditor, or regulator trying to make AI-assisted design accountable.

NOT FOR YOU IF
  • You build consumer apps where bugs are inconvenient, not dangerous.
  • You're already convinced AI doesn't need guardrails in safety-critical work.
  • You're looking for generic prompt-engineering tips — this is empirical doctrine, not advice.
  • You want a how-to-build-with-AI tutorial.

The Research — The Paper at a Glance

In a controlled study, six leading AI models (GPT-5.5, Claude Opus 4.7, Gemini 2.5 Pro, Grok 4.20, DeepSeek V4 Pro, and Qwen 3 30B-A3B) were given the same factory automation requirements that a real engineering team would use. They produced 216 detailed architecture recipes. The results were troubling:

  • Zero recipes included the formal safety engineering — HAZOP (Hazard and Operability study), LOPA (Layer of Protection Analysis), and SIL (Safety Integrity Level) — that was required in the written specification.
  • 79% added cloud dependencies despite clear requirements for fully offline systems.
  • One model weakened required cybersecurity standards in 89% of its outputs.

This research introduces ten practical guardrails and acceptance gates, along with ready-to-use copy-paste prompts, to make AI-assisted architecture reliable in high-consequence environments.

216 architecture recipes • 6 frontier LLMs • 3 runs each • 213 red-team reviews

HAZOP Hazard and Operability study · LOPA Layer of Protection Analysis · SIL Safety Integrity Level

Plausibility Is Not Readiness

The models spoke the language of industrial automation while quietly skipping the actual engineering work that keeps factories safe, cyber-secure, and reliable.

The 12 Factory PRDs

All six models received the same twelve product requirements documents. Each describes a real factory problem.

  1. Receiving Quality & Supplier Scorecards. Automatically inspect incoming materials, accept or reject batches, and track supplier reliability.
  2. Machine Downtime Tracking & Andon. Detect unexpected machine stops, alert the right people, and record causes for improvement.
  3. Energy Consumption Monitoring. Measure electricity and compressed-air usage so the factory can reduce waste and cost.
  4. Predictive Maintenance for Critical Equipment. Use sensor data to predict failures before machines break down during production.
  5. Part Genealogy & Traceability. Record each part’s manufacturing history so defects can be traced quickly.
  6. In-Process Quality Inspection. Check part quality during production instead of waiting until the end.
  7. Work-in-Process (WIP) Tracking. Know where every batch or part is located inside the factory.
  8. Overall Equipment Effectiveness (OEE) Dashboard. Show managers and operators how efficiently equipment is running.
  9. Brownfield Legacy PLC & SCADA Integration. Connect new software to old control systems without replacing everything.
  10. Electronic Batch Records for Compliance. Create digital records that satisfy auditors in regulated production environments.
  11. Chemical Blending Process Control. Control chemical mixing according to recipes while meeting safety and quality rules.
  12. Aerospace Precision Machining Data Collection. Capture precise machine data for quality reporting and regulatory compliance.

What AI delivered versus what industry actually requires

The Ten Industrial AI Guardrails

Each guardrail is a Karpathy-upgraded copy-paste prompt reverse-engineered from a repeatable failure pattern in the 216-recipe corpus.

  1. Never Accept a Proposed Software Architecture That Skips Formal Safety Analysis. Zero of 216 recipes performed HAZOP, LOPA, or SIL, even on safety-critical equipment.
  2. Never Use Small Open-Weights Models Without Heavy Review. Qwen downgraded security in 89% of outputs and produced 73% of all hallucinations.
  3. Never Permit Cloud-Component Creep. 171 of 216 recipes added cloud dependencies despite explicit offline requirements.
  4. Treat PostgreSQL Consensus as a Hypothesis, Not Gospel. PostgreSQL + TimescaleDB appeared in 215/216 recipes. Strong consensus ≠ correctness.
  5. Mandate Edge-First, On-Premises Simplicity Unless Proven Otherwise. Complex distributed systems were repeatedly proposed for small factories with one IT person.
  6. Standardize Identity, Encryption, and Network Segmentation. Models used vague “secure by design” language without enforceable controls.
  7. Require a Truly Offline-Capable Frontend. Modern PWAs often failed basic shop-floor outage and glove-use tests.
  8. Use an Independent Seventh LLM for Red-Team Review. The generator cannot reliably attack its own assumptions and blind spots.
  9. Account for Stable Model Personality. Each LLM showed consistent, reproducible biases across all 12 PRDs and 3 runs.
  10. Never Trust Self-Assessed Scores. Higher confidence scores often predicted more external criticism, not higher quality.

All prompts are also available in the repository under /prompts/