The A.I. Governance Gap Nobody Is Preparing For

An opinion piece on what happens the day after you replace a SaaS tool with an internal AI build, and the state machine framework I built to close the gap.

Gavin Fitzpatrick · April 2026

I built two application prototypes in a single sprint, one with Claude Code and one with Lovable: phase-gated risk lifecycle, control library, treatment workflows, policy governance, executive dashboards. Zero manual code. One of those prototypes became the governance platform I will describe in this article. That sprint was the tip of the iceberg.

The longer and far less visible work was everything that followed: documentation, security control builds, AppSec reviews, shared responsibility boundary definition, prompt observability pipeline design, RBAC validation against the specification, and lifecycle governance for the tools themselves. That governance work took significantly longer than the build, and it is the work that most teams skip.

This matters because every SaaS tool you have ever used came bundled with operational responsibilities you never had to think about: SDLC discipline, AppSec testing, a support model, a versioning strategy, monitoring infrastructure, and someone else's incident response plan. When you build internally, all of those transfer to you on the day the first user is onboarded.

The most important question before any internal AI tool ships is not whether it works. It is who will own it in six months, who will be accountable for its operation, its security posture, and its eventual retirement. That ownership has to be a structural gate condition enforced at the process level, because engineers naturally optimise for creation and organisations reward shipping. Nothing in the default incentive structure rewards the person who quietly maintains a production system and ensures the prompt logs are flowing into the SIEM. Relying on individual diligence to fill that gap does not scale.

Why This Is Coming At You Faster Than You Think

The market shift creating this pressure is established, even if its endpoints are still being argued. Retool's 2026 Build vs. Buy Shift Report, based on a survey of 817 builders, found that 35% of teams have already replaced at least one SaaS tool with a custom build, and 78% expect to build more custom internal tools in 2026. Public software valuations have repriced sharply in parallel. SaaStr's index work shows the forward P/E multiple for application software has compressed from 84.1x at the 2020 to 2022 peak to 22.7x by March 2026.

Regulated enterprises are not waiting for the thesis to be proven. Andrew Pade, General Manager of Cyber Defence Operations at Commonwealth Bank of Australia, told the Gartner Security and Risk Management Summit in Sydney in March 2026 that his team built their own agentic AI threat hunting tools because vendors could not move at the speed his bank needed against emerging threats. JPMorgan Chase built its LLM Suite entirely in-house for data privacy and governance reasons. Goldman Sachs embedded Anthropic engineers for six months to co-build AI agents for transaction reconciliation and client onboarding.

The category-level read is more nuanced than headlines suggest. SaaS platforms with deep compliance certification stacks, genuine network effects, and operational infrastructure built over years will adapt their pricing, embed agentic capabilities, and endure. The economics of building from scratch do not favour replacing a system of record with a weekend project. What is changing is the bar for purchased software and the share of work that internal builds can credibly absorb. The categories that are shifting are workflow tools, internal security portals, risk registers, and anything currently living in a ticket queue. If the value is configuration and logic rather than infrastructure, it can be built.

The distinction matters at the infrastructure layer, and it is not subtle. Nobody is replacing S3 or Snowflake with an agent-built application. The engineering depth, compliance attestation, and operational reliability behind those platforms took years and significant capital to build. What teams will do, and are already doing, is build middleware layers on top of them: a bespoke ingestion pipeline, a custom access governance layer, a domain-specific query interface. The backend stays. The application logic that sits above it is fair game.

The harder question, and the one most analyst coverage has not answered, is what happens the day after you deploy.

The Iceberg: What You Build in a Sprint vs What You Own Forever

Even before agentic AI accelerated this pattern, I watched the same failure mode play out repeatedly across enterprise environments. An internally built tool would quietly become business-critical. Six months later nobody could identify a named owner, there was no runbook or monitoring in place, and the person who built it had long since moved on. Agentic AI has not created this problem. It has dramatically compressed the timeline in which it surfaces.

AI tools also introduce a genuinely new operational surface: prompt observability. The inputs users send to the model and the outputs it returns are both attack vectors and detection opportunities. Prompt injection, data exfiltration via crafted outputs, and anomalous usage patterns will not appear in application error logs. They will appear, or critically fail to appear, in prompt telemetry. Treating that telemetry as optional is the same category of mistake as deploying a web application without access logging.
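As a rough illustration of what a prompt telemetry record might carry, here is a minimal sketch. The field names, the `build_prompt_event` function, and the choice to log hashes and lengths rather than raw text are all assumptions made for this example; what to retain, and how it reaches the SIEM, is a policy and architecture decision for each organisation.

```python
import hashlib
import json
from datetime import datetime, timezone


def build_prompt_event(tool_id: str, user_id: str, prompt: str, response: str) -> str:
    """Return one JSON line describing a single prompt/response exchange.

    Hypothetical schema: the field names are illustrative, not a standard.
    """
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool_id": tool_id,
        "user_id": user_id,
        # Hashes allow repeated payloads to be correlated without storing raw text here.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "response_sha256": hashlib.sha256(response.encode("utf-8")).hexdigest(),
        # Sizes give a cheap signal for anomalous usage, such as unusually large outputs.
        "prompt_chars": len(prompt),
        "response_chars": len(response),
    }
    return json.dumps(event, sort_keys=True)
```

Each call yields one JSON line per exchange, a shape that most log shippers can forward without custom parsing.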

The governance framework I built addresses these risks through a lifecycle model with enforced phase gates covering ideation, build, pre-production validation, production, maintenance, and deprecation. At ideation, the gate requires a named owner before a single prompt is written. At pre-production, the gate requires confirmed prompt telemetry in the SIEM before any user is onboarded. At maintenance, the gate requires quarterly kill-criteria assessment so that tools are retired deliberately rather than abandoned. The full lifecycle specification is in the repository linked below.
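The gate idea can be sketched in a few lines. This is an illustration under stated assumptions, not the repository's specification: the `Tool` record, the `advance` function, and the field names are hypothetical, and only the ideation and pre-production gates described above are modelled. The quarterly kill-criteria assessment is left out for brevity.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Phase(Enum):
    IDEATION = 1
    BUILD = 2
    PRE_PRODUCTION = 3
    PRODUCTION = 4
    MAINTENANCE = 5
    DEPRECATION = 6


@dataclass
class Tool:
    name: str
    phase: Phase = Phase.IDEATION
    owner: Optional[str] = None        # named owner, required to leave ideation
    telemetry_in_siem: bool = False    # confirmed prompt telemetry, required to leave pre-production


class GateError(Exception):
    """Raised when a phase gate condition is not met."""


def advance(tool: Tool) -> Phase:
    """Move the tool to the next phase, or raise GateError if a gate blocks it."""
    if tool.phase is Phase.DEPRECATION:
        raise GateError("tool is already deprecated")
    if tool.phase is Phase.IDEATION and not tool.owner:
        raise GateError("ideation gate: a named owner is required")
    if tool.phase is Phase.PRE_PRODUCTION and not tool.telemetry_in_siem:
        raise GateError("pre-production gate: prompt telemetry must be confirmed in the SIEM")
    tool.phase = Phase(tool.phase.value + 1)
    return tool.phase
```

The point of the design is that the gate lives in the only code path that can change the phase, so a tool without an owner cannot reach build by any route.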

Specification-Driven, Not Vibe-Coded

During one of my builds, the agent removed a governance gate to resolve a frontend rendering conflict. The component rendered correctly, the invariant was absent from the code, and nothing broke visibly. If I had not been checking the specification against the output at each stage, that platform would have shipped without enforcement at the transition that mattered most.

That experience crystallised what I now consider the most important lesson of agentic development. When the AI produces something broken, the correct response is not to debug the generated code but to identify the specification gap that permitted the failure, update the specification, and rebuild from it. Retool's data shows that 60% of enterprise builders have created tools outside IT oversight in the past year. The capability is clearly established. The governance discipline that production systems require is not.

Specification-Driven Agentic Development: From codified governance rules to validated, production-ready tooling

The methodology I developed follows a strict sequence. The governance framework is first codified into machine-readable rules covering object states, entity relationships, scoring formulae, and hard invariants. That rule set produces a specification package per module before any code is generated. The build then follows a prompt cycle where each prompt is self-contained with schema definitions, business logic, and an explicit Definition of Done. Pass means pinning a stable version. Fail means fixing the specification and rebuilding. The specification documents serve a dual purpose: they drive the build and they re-ground the agent when the context window degrades on large codebases. The full methodology, including the prompt library and methodology guides, is in the repository.
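One way to picture the pass/fail step is as a set comparison between the invariants a module specification requires and the invariants a build can be shown to enforce. The sketch below assumes hypothetical names throughout (`ModuleSpec`, `definition_of_done`, `cycle_outcome`, and the invariant identifiers), and it assumes the set of enforced invariants comes from some external check such as a test run; it is not the prompt library from the repository.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModuleSpec:
    module: str
    required_invariants: frozenset  # invariant IDs the specification demands


def definition_of_done(spec: ModuleSpec, enforced_in_build: set) -> list:
    """Return the required invariants that the build does not enforce, sorted."""
    return sorted(spec.required_invariants - set(enforced_in_build))


def cycle_outcome(spec: ModuleSpec, enforced_in_build: set) -> str:
    """Pass pins a version; fail sends the work back to the specification."""
    missing = definition_of_done(spec, enforced_in_build)
    if not missing:
        return "pin-version"
    return "fix-spec-and-rebuild: " + ", ".join(missing)
```

A check of this shape would catch a silently removed governance gate: the component can render perfectly and the cycle still fails, because the invariant is absent from the enforced set.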

Governance Is a State Machine

GRC tools fail when they encode the wrong mental model. Governance is a dynamic, interconnected state machine where a control failure alters the exposure profile of every risk linked to it, and a missed SLA should trigger a formal escalation path rather than merely log a timestamp. Forrester's "Grad School Era" analysis argues that GRC technology needs to finally earn its keep as a workhorse rather than a reporting layer. I reached the same conclusion working on the ground and stopped writing policy documents in favour of writing system invariants: constraints that enforce what the system can do at the application layer, not what a policy says it should do.
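Both behaviours named above, propagation from a failed control and escalation on a missed SLA, can be sketched as state changes rather than log entries. The class names, fields, and status strings below are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Risk:
    name: str
    needs_reassessment: bool = False


@dataclass
class Control:
    name: str
    linked_risks: list = field(default_factory=list)
    effective: bool = True

    def fail(self) -> None:
        """Mark the control ineffective and flag every linked risk for reassessment."""
        self.effective = False
        for risk in self.linked_risks:
            risk.needs_reassessment = True


@dataclass
class Treatment:
    due: date
    status: str = "open"

    def check_sla(self, today: date) -> str:
        """Escalate an open treatment once its due date has passed."""
        if self.status == "open" and today > self.due:
            self.status = "escalated"
        return self.status
```

In this sketch a missed due date changes the treatment's state, which is what lets downstream logic act on it; a timestamp in a log would change nothing.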

Risk Lifecycle State Machine: Phase-gated enforcement with cross-entity propagation

The platform I built enforces this in the application logic, not in documentation. State transitions require defined conditions to be met before they are permitted. Each decision type carries specific preconditions that the system validates rather than assumes. When one component fails, the effects propagate automatically to anything linked to it. Evidence has to be confirmed before a score updates. None of this is held together by a policy document or a manual review step. It is built into how the system behaves. The same specification drives deployment onto either agentic SaaS platforms or self-hosted infrastructure. The specification is portable and the generated code is disposable. The full data model, invariants catalogue, and deployment architecture for both paths are in the repository.
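A minimal sketch of validated transitions and evidence-gated scoring follows. The state names, the `PRECONDITIONS` table, and the field names are assumptions invented for this example; the actual states and invariants are defined in the repository's specification, not here.

```python
# Hypothetical transition table: (from_state, to_state) -> fields that must be populated.
PRECONDITIONS = {
    ("identified", "assessed"): ["owner", "score"],
    ("assessed", "treated"): ["treatment_plan"],
    ("assessed", "accepted"): ["acceptance_approver", "acceptance_expiry"],
}


class TransitionDenied(Exception):
    """Raised when a transition or score update violates a precondition."""


def transition(record: dict, to_state: str) -> dict:
    """Return a copy of the record in the new state, or raise TransitionDenied."""
    key = (record["state"], to_state)
    if key not in PRECONDITIONS:
        raise TransitionDenied(f"no transition from {key[0]} to {to_state}")
    missing = [f for f in PRECONDITIONS[key] if record.get(f) in (None, "")]
    if missing:
        raise TransitionDenied("missing preconditions: " + ", ".join(missing))
    return {**record, "state": to_state}


def update_score(record: dict, new_score: int, evidence_confirmed: bool) -> dict:
    """Change the score only when supporting evidence has been confirmed."""
    if not evidence_confirmed:
        raise TransitionDenied("score change requires confirmed evidence")
    return {**record, "score": new_score}
```

Note that accepting a risk and treating it carry different preconditions in the table, which is one way to express that each decision type is validated on its own terms.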

What Comes Next If You Build From the Spec Down

The shift underway favours teams with the right internal skills. Companies can now build governance tooling that maps precisely to their frameworks rather than configuring a vendor's approximation. This is not an argument against SaaS as a category. Mature platforms with deep compliance certifications, genuine integrations, and operational infrastructure built over years will adapt and endure. The economics of building from scratch do not erase the value of those moats.

What changes is the bar. When a team can deploy from a specification in weeks, vendors compete on capability rather than on lock-in. The same instincts that make a senior practitioner effective (clear scope, defined boundaries, modular thinking, precise language) are exactly what make agentic development work. Most developers are learning to think like architects. If you already do, you have a head start.

The operational complexity does not disappear when the SaaS vendor does. It transfers. Building it, owning it, securing it, and eventually retiring it responsibly are categories of work most security teams have not yet planned for. They will need to.

What domain in your security programme has the widest gap between policy intent and tool enforcement? That is where this starts.

The Repository

The full governance framework is open-sourced at gfitzp79/state-machine-governance under CC BY 4.0. It contains the codified rules engine, full relational data model, state transitions and phase gates, invariants catalogue, scoring model, 9-role RBAC specification, deployment architecture for both paths, shared responsibility model, AI tool lifecycle governance specification, SaaS exit governance specification, and the complete specification-driven development methodology. No source code, only specification.

Sources

Retool, "The Build vs. Buy Shift: How Vibe Coding and Shadow IT Have Reshaped Enterprise Software," 2026 Report (817 respondents). Published February 2026.
SaaStr, "The SaaS Rout of 2026" and "The Leading Public Software Companies Are Now Down -50% in the Last 6 Months." Forward P/E figures for application software, May 2020 to March 2026.
The Register, "Bank built its own threat hunting agent because vendors can't keep pace with new threats," 17 March 2026, reporting Andrew Pade's address at the Gartner Security and Risk Management Summit, Sydney.
Forrester, "GRC Enters Its Grad School Era," 2025 analysis.

Architectural patterns described in this article were developed in a personal R&D environment for educational purposes. They do not represent the systems, roadmap, or official stance of any current or former employer.