Cohorte practice · trust & governance

Before you let AI act on its own, you need more than a good model.

For banks, insurers, hospitals, ministries and regulators: the places where "usually right" is not good enough. This page starts with the problem in plain terms, then shows the discipline we teach, the architecture we have shipped to enforce it, and our research behind both. It is for everyone who has to build AI, ship it, sign off on it, and answer for it.

First, the problem, in plain terms.

For a few years, AI mostly talked. You asked, it answered, and if the answer was wrong you saw it and moved on. That era is ending. The new systems do not answer, they act: they send the email, update the record, move the money, reply to the patient. And an action you never saw is an action you cannot catch.

You never see this in the shiny demo, because the demo is always the happy path: a clean question, a known answer, a controlled room. Your work is none of those things. In production these systems are usually right, and "usually" is a lovely word in a chatbot and a dangerous one in a bank. Something right 95% of the time, taking a thousand actions a day, is quietly wrong fifty times a day, in your name. In a hospital, an insurer or a ministry, "usually right" is not a release criterion.

So the useful question is not "is the AI clever?" It is "what happens the moment it is wrong, and who is accountable when it is?" Most AI projects never ask it. This page is the answer.

What "trust" actually means here.

Trust is not a property of the model. It is a property of the system you build around it. A model can score brilliantly on a public benchmark and still be a stranger to your reality: your data, your workflows, your context, your clients, your money, your reputation.

The benchmark answers "is the model capable?" It never answers the only question that matters to you: can I trust this system, on my data, for my work, with my clients and my money on the line?

A 94%-accurate data manager is not a 94%-trustworthy system; the benchmark number is the start of the conversation, not the end. The real work is scoping, verification, governance and monitoring, in that order, run as a discipline that survives the next change of model, and enforced by a layer that sits between every agent and everything that matters.

Cohorte builds this layer in the open, and trains the people who run it. What follows is both, in plain terms first, with the proof underneath.

The discipline: four layers.

Most "AI governance" programmes start at the wrong layer. They begin with policy, then add evaluation, then attempt to scope use cases later. The result is policy applied to the wrong systems, and verification applied to systems that should never have been built. The layers run the other way.

Scoping: what gets built, and what does not

Before any system exists, the decision is which workflows are candidates and which are not. Not every process should be automated, not every output should be generated. The scoping layer produces a short list of candidate systems with a clear owner, a written "no list", and the evidence that automating them is the right move at all. You cannot automate a mess.

Verification: prove, then automate

A reliability level, not a vibes check. Calibrated verification combines self-consistency sampling with conformal calibration, producing a distribution-free guarantee that a black-box system meets a stated reliability bar on a stated task, without needing ground truth at run time. The system earns the right to deploy. It is not granted that right by the slide deck.

Governance: registry, gates, accountability

Every agent has an owner, a registered purpose, a stated reliability bar, an escalation path and a defined retirement condition. Gates at the right places, not at the end of the pipeline where the human cannot act. Mind-in-the-loop is not the same as human-in-the-loop. The governance layer is where accountability becomes operational, not theatrical.
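As an illustration of what a registry entry could hold, here is a minimal Python sketch. The class, field and method names are assumptions made for this example, not the schema of any Cohorte component; the fields simply mirror the five items listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRegistryEntry:
    """Hypothetical registry record: one per agent, owned by a named person or team."""
    agent_id: str
    owner: str
    purpose: str
    reliability_bar: float       # e.g. 0.95 means "right at least 95% of the time"
    escalation_path: str
    retirement_condition: str

    def missing_fields(self) -> list[str]:
        """Names of the accountability fields that were left blank."""
        text_fields = ("owner", "purpose", "escalation_path", "retirement_condition")
        return [name for name in text_fields if not getattr(self, name).strip()]

    def is_deployable(self, certified_reliability: float) -> bool:
        """Deployable only if every field is filled in and the bar is met."""
        return not self.missing_fields() and certified_reliability >= self.reliability_bar
```

The design choice worth noting is that an agent with no owner fails the check regardless of how well it scores: accountability is a precondition, not a tie-breaker.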

Monitoring: drift, incidents, regulatory output

The system lives. Distribution shifts, prompts mutate, dependencies update, regulation evolves. Monitoring detects when verification stops being valid, escalates incidents the registry can act on, and produces the regulator-facing report as a natural output of how the system is run, not as a quarterly compliance exercise.

The platform that enforces it.

Think of it as one security checkpoint that every AI action has to pass through, the way everyone in a building badges through the same door whatever laptop they carry in. The AI tools can be anything and can change weekly; the checkpoint never moves, and nothing reaches your data or takes an action without clearing it. In the architecture that checkpoint is a permanent layer, and everything above and below it is replaceable. Local AI systems reach organisational data and actions only through it. Nothing bypasses it.

Layer 4

Local AI systems

Any agent framework, any model, local tools via MCP. Employees' tools stay their business. The gateway governs only what they do with organisational data and actions.

connected through one contract · the Platform Protocol (~15 API calls)

Layer 3

Trust & governance gateway

The permanent infrastructure. Every cross-boundary request is mediated here, by six composable services plus reliability certification.

Auth & permissions: 8-step authorization, agent ⊂ user, three-tier approval

Guardrails: YAML policy on input, output, action, tool-call

Action control: risk-classified, approval state machine

Context router: intent → sources, permission-filtered, budgeted

LLM gateway: DLP, metering, cost & budget enforcement

Monitor & audit: anomaly detection, kill-switch, immutable log

Layer 2

Organisational agents & intelligence

Certified, process-driven agents (company assets, not personal assistants) and domain intelligence that compounds. Month one it answers; month twelve it anticipates.

Layer 1

Data & context

Git-versioned context, structured databases, external systems. Reached only through a connector framework, never directly. Orchestrated declaratively (Context Kubernetes), with freshness guaranteed, not hoped for.

A governed request, end to end.

This is what "the gateway mediates everything" means in practice. One request from one agent, every checkpoint it passes, and what each one decides. A non-technical reader can follow the left column; a security reviewer can read the right.

Authorize

Is this agent, acting for this user, allowed this action right now? Agent permissions are a strict subset of the user's.

allow / deny / approve

Input guardrail

Scan the prompt for injection, secrets, and policy violations before anything runs. Deterministic YAML rules.

allow / redact / block

Context route

Resolve intent to sources, filter to what the user may see, rank, and fit the token budget. Pre-retrieval filtering is the primary control.

scoped context

Action control

Classify the action by risk. Autonomous acts run; consequential ones require approval at the right tier.

tier 1 / 2 / 3

Approval

Tier 2 asks the user in-app. Tier 3 requires out-of-band re-authentication the agent cannot read, intercept, or forge.

human decides

Output guardrail

Scan the generated result for leakage and sensitive content before it leaves the boundary.

allow / redact

Monitor & audit

Record every decision to an append-only log. Update metrics, detect anomalies, trip the kill-switch on violation.

immutable trail
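The Authorize, Action control and Monitor & audit checkpoints above can be pictured in a few lines of Python. This is a sketch under stated assumptions, not the agent-auth API: the action names, the tier table and every function name are hypothetical. It shows the one structural idea that matters, which is that the agent's effective permissions are computed as an intersection with the user's, so they can never exceed them.

```python
import json
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    APPROVE = "approve"


# Hypothetical risk tiers: 1 runs autonomously, 2 asks the user in-app,
# 3 requires out-of-band approval.
ACTION_TIERS = {"read_record": 1, "send_email": 2, "move_money": 3}


@dataclass(frozen=True)
class Request:
    user_permissions: frozenset
    agent_grants: frozenset
    action: str


def effective_permissions(req: Request) -> frozenset:
    """Intersect the agent's grants with the user's permissions,
    so the agent can never hold a permission its user lacks."""
    return req.agent_grants & req.user_permissions


def authorize(req: Request):
    """Return (decision, tier); tier is None when the request is denied."""
    if req.action not in effective_permissions(req):
        return Decision.DENY, None
    tier = ACTION_TIERS.get(req.action, 3)  # unclassified actions get the strictest tier
    return (Decision.ALLOW if tier == 1 else Decision.APPROVE), tier


def audit_line(req: Request, decision: Decision, tier) -> str:
    """Serialise one decision as a JSON line, ready to append to a log file."""
    record = {"action": req.action, "decision": decision.value, "tier": tier}
    return json.dumps(record, sort_keys=True)
```

In this sketch an agent that was granted "move_money" still gets a deny if its user lacks that permission, and the denial is written to the trail like any other decision.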

Start with our open-source libraries.

Everything in the diagram above is code you can read, not a roadmap. The gateway is open-source and versioned. Each component is tested and benchmarked on its own, and the permission model is formally verified. You can adopt the pieces as separate libraries or run them as one integrated middleware. The policies are plain YAML in git, and the audit trail is append-only. Engineers can inspect it before they rely on it, and so can auditors.

guardrails

Declarative policy engine for input, output, action and tool-call. Keyword, regex and PII matchers; rate limits; framework adapters.

~0.005 ms/eval · 1,286 tests

context-router

Permission-filtered, multi-source context retrieval with token budgeting. SSRF and path-traversal hardened.

827 tests · 0.6 ms keyword routing

agent-auth

Eight-step authorization, hierarchical roles, agent-to-agent rules, delegations, three-tier approval. JSONL audit.

<1 ms/decision · 44 tests

agent-monitor

Runtime observability: rolling metrics, z-score anomaly detection, instant kill-switch, SOC 2 / GDPR export.

~0.1 ms/event · 44 tests

context-kubernetes

Declarative knowledge orchestration with a reconciliation loop and a formally specified permission model.

TLA+ verified · 4.6M states, 0 violations

trustgate

Reliability certification and the runtime deployment gate: self-consistency sampling plus conformal calibration.

Apache-2.0 · on PyPI

Six repositories, over two thousand tests between them, every policy auditable in git, every decision on an append-only trail. The book that ties them together, The Enterprise Agentic Platform, is published.
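To make the z-score detection and kill-switch named in the agent-monitor entry concrete, here is a minimal Python sketch. It is not the agent-monitor API: the class name, the window size, the threshold of three standard deviations and the warm-up length are all assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, pstdev


class AnomalyMonitor:
    """Hypothetical rolling z-score detector with a latching kill-switch."""

    def __init__(self, window: int = 50, threshold: float = 3.0, min_history: int = 10):
        self.history = deque(maxlen=window)  # rolling window of recent values
        self.threshold = threshold
        self.min_history = min_history
        self.killed = False

    def observe(self, value: float) -> bool:
        """Return True if the value is anomalous; trip the kill-switch if so."""
        anomalous = False
        if len(self.history) >= self.min_history:
            mu = mean(self.history)
            sigma = pstdev(self.history)
            # A perfectly flat history has sigma == 0; this sketch flags nothing then.
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                anomalous = True
                self.killed = True  # stays tripped until a human resets it
        if not anomalous:
            self.history.append(value)  # anomalies are kept out of the baseline
        return anomalous
```

Keeping anomalous values out of the baseline is a deliberate choice here: otherwise a burst of bad behaviour would widen the window's spread and make the next burst look normal.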

Verification: how a system earns the right to deploy.

How do you know a system is good enough to trust with real work? You don't take the vendor's word and you don't trust the demo. You measure it, the way a new drug is measured before it is approved: run it many times, on real cases, and attach a number to how often it is right, with a guarantee. Below the bar, it does not ship.

That number is the reliability level. Under the hood: sample the system several times on a task, group the answers that mean the same thing, calibrate against a held-out set, and you get the confidence at which the top answer is trustworthy, with a finite-sample, distribution-free guarantee that holds regardless of the model's hidden biases. It needs no ground truth at run time and no access to the model's internals. The method is published, and the gate is shipped, open-source, as trustgate.
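The sample, group and calibrate steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the trustgate API: the function names are hypothetical, lower-casing stands in for the harder job of grouping answers that mean the same thing, and the score (one minus an answer's share of the samples) is one reasonable choice among several. The calibration step uses the standard split-conformal quantile, which needs labelled answers for the held-out set only and assumes calibration and run-time tasks are exchangeable.

```python
import math
from collections import Counter


def answer_frequencies(samples: list[str]) -> dict[str, float]:
    """Group canonicalised samples and return each answer's share of the votes."""
    counts = Counter(s.strip().lower() for s in samples)
    total = sum(counts.values())
    return {answer: n / total for answer, n in counts.items()}


def nonconformity(samples: list[str], truth: str) -> float:
    """1 minus the vote share of the true answer; 1.0 if it was never sampled."""
    return 1.0 - answer_frequencies(samples).get(truth.strip().lower(), 0.0)


def calibrate(calibration: list[tuple[list[str], str]], alpha: float) -> float:
    """Split-conformal threshold from held-out (samples, truth) pairs."""
    scores = sorted(nonconformity(s, t) for s, t in calibration)
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return 1.0  # too few calibration items for this alpha: most permissive threshold
    return scores[rank - 1]


def prediction_set(samples: list[str], threshold: float) -> set[str]:
    """Run time, no ground truth: keep every sampled answer within the threshold."""
    freqs = answer_frequencies(samples)
    return {answer for answer, f in freqs.items() if 1.0 - f <= threshold}
```

A single-answer prediction set is the case where the top answer can be acted on. A larger or empty set is the signal to escalate to a human rather than to guess.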

94.6%

certified reliability for GPT-4.1 on GSM8K, with a single-answer set (one worked example)

97.6%

coverage on TruthfulQA at the 90% target, never under-covering

≥93%

conditional coverage on solvable items, across every benchmark tested

45–52%

cost cut by certified sequential stopping, with no loss of guarantee

Worked examples from the published study (GPT-4.1 shown). The method itself is model-agnostic; the figures are a snapshot of the frontier models tested at the time, not a fixed claim about any one model.

Reliability is only half the question. The other half is cost.

A system can be reliable and still be a disaster, if it is reliable at the wrong price. The real question is never just "does it work?" but "what reliability, at what cost, for this workflow?" A pilot that answers beautifully and costs a dollar a request will break the budget the week it succeeds.

This is the bill almost nobody forecasts. The price per token has fallen by roughly three quarters, which sounds like relief. But an agent does not make one call, it makes hundreds: it loops, retries, reads context, calls tools, checks its own work. Consumption can rise two-hundred-fold. Multiply a token that is 75% cheaper by hundreds of times more tokens, and the bill still climbs an order of magnitude or more. Teams that expected a modest increase are opening invoices fifty or sixty times larger, and some are quietly discovering the AI cost more than the people it was meant to help.
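The arithmetic in the paragraph above is short enough to write out. The Python function below is a hypothetical helper for illustration only; it multiplies what is left of the unit price by the growth in volume.

```python
def bill_multiplier(price_drop: float, volume_multiplier: float) -> float:
    """Relative size of the new bill after a fractional price drop and a volume increase."""
    return (1.0 - price_drop) * volume_multiplier


# A token that is 75% cheaper, consumed two-hundred-fold.
print(bill_multiplier(0.75, 200))  # 50.0
```

A quarter of the price times two hundred times the volume is a bill fifty times larger, which is where the "fifty or sixty times" invoices come from.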

So cost is not an afterthought to governance, it is part of it. The LLM gateway in the architecture above meters every call, attributes it to an owner, and enforces a budget, because a workflow's reliability target and its cost ceiling are decided together, not discovered on the invoice. Performance-against-cost belongs beside reliability, security and sovereignty as a first-class question, and it is the subject of its own briefing.
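Meter, attribute, enforce can be pictured as a small Python class. This is a sketch, not the LLM gateway's interface: the class, the exception, the owner label and the per-thousand-token price are assumptions made for the example.

```python
from collections import defaultdict


class BudgetExceeded(Exception):
    """Raised when a call would push an owner past its cost ceiling."""


class CostMeter:
    """Hypothetical meter: every call is priced, attributed to an owner, and capped."""

    def __init__(self, budgets: dict[str, float]):
        self.budgets = dict(budgets)        # owner -> cost ceiling
        self.spent = defaultdict(float)     # owner -> cost so far

    def charge(self, owner: str, tokens: int, price_per_1k: float) -> float:
        """Record the cost of one call, or refuse it before it runs."""
        if owner not in self.budgets:
            raise BudgetExceeded(f"no budget registered for {owner!r}")
        cost = tokens / 1000 * price_per_1k
        if self.spent[owner] + cost > self.budgets[owner]:
            raise BudgetExceeded(f"{owner!r} would exceed its ceiling")
        self.spent[owner] += cost
        return cost
```

The check happens before the spend is recorded, so the ceiling is enforced at request time instead of being discovered on the invoice. A call with no registered owner is refused outright.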

What if your AI supply gets cut off?

There is a quieter risk in building your business on someone else's API: you are one decision away from losing it. A lab retires the model you depend on. A government restricts where a model may run, as happened when the US blocked Fable 5 from rolling out beyond its borders. A provider changes its terms, its price, or its access overnight. If your whole organisation stops working the day a single vendor in a single country decides you no longer get tokens, that is not a strategy. It is an exposure.

The answer is not paranoia, it is portability. Keep the model behind an interface you own, so you can host your own where it matters, run open weights, and swap providers without rewriting the system or re-earning trust. Sovereignty here is not a flag-waving slogan; it is the plain ability to keep operating when someone upstream changes their mind. Briefing 05 maps it dimension by dimension.
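"An interface you own" can be as small as one method. The Python sketch below is an assumption-laden illustration: the class names are hypothetical and the stub provider stands in for any hosted API or self-hosted open-weights model. The calling code only ever sees the interface, so a provider can disappear without the system being rewritten.

```python
from typing import Protocol


class ModelProvider(Protocol):
    """The interface the organisation owns. Providers only need this one method."""

    def complete(self, prompt: str) -> str: ...


class StubProvider:
    """Stand-in for a hosted API or a self-hosted open-weights model."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.available = True

    def complete(self, prompt: str) -> str:
        if not self.available:
            raise ConnectionError(f"{self.name} is unavailable")
        return f"[{self.name}] {prompt}"


class PortableModel:
    """Tries providers in order, so losing one upstream does not stop the system."""

    def __init__(self, providers: list[ModelProvider]) -> None:
        self.providers = list(providers)

    def complete(self, prompt: str) -> str:
        for provider in self.providers:
            try:
                return provider.complete(prompt)
            except ConnectionError:
                continue  # this provider is gone; try the next one
        raise RuntimeError("no model provider is available")
```

For example, with a hosted vendor listed first and an open-weights model second, marking the vendor unavailable moves traffic to the second provider with no change to the caller.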

Can an agent be talked into breaking your rules?

This is the security layer of trust, and most of what is said about it is guesswork. Everyone worries about AI being "hacked." Far fewer have tested what genuinely works. We did, about 10,000 times, across seven frontier models, in real sandboxed machines with real tools, and published the results. We tried twelve different ways to talk an agent into breaking its own rules. Nine of them did nothing at all.

One worked alarmingly well, and it is not the one people expect. You don't jailbreak the agent, you change the story. Tell it the task is a puzzle or a security challenge, and it quietly reinterprets breaking the rules as winning the game, exploiting the system 32–40% of the time on some models, even when it was explicitly told to follow every rule. The exact same prompt did nothing on others. The agent never "decides" to disobey; it convinces itself the exploit is the task.

The lesson is the uncomfortable one. Telling a model to "be careful" does not work. Constraining what it can reach, and checking what it actually does, is the only thing that does. That is why governance has to live in the architecture, not in a prompt.

Where the large platforms stop.

The big platforms all let you build agents. The question a CISO should actually ask is narrower, and it has two parts. Can an agent ever do something its own user is not allowed to do? And when an action is risky, can a human always stop it, in a way the agent cannot fake or skip? Those two guarantees, an agent's powers staying inside its user's, and approvals it cannot forge, are the line between a governed agent and a liability. We formalised them, proved them with a model checker, and checked the field. The table below is not marketing; it is from the published permission-model paper.

Architecturally enforced	Agent ⊂ user permissions	Out-of-band strong approval	Formally verified
Microsoft Copilot Studio	partial	no	no
Salesforce Agentforce	partial	no	no
AWS Bedrock Agents	partial	no	no
Google Vertex AI Agents	partial	no	no
TheAIOS control plane	yes	yes	yes · TLA+

In the same framework, a deployment with no governance blocks 0 of 5 attack scenarios, basic role-based access blocks 4, and the full control plane blocks all 5, cutting cross-domain data leakage from 26% to 12%. The proof covers 4.6 million states with zero invariant violations.

The briefings.

Each briefing is a standalone A4 document built for forwarding to procurement, security, and the relevant leader on the buying side. They reference each other; none requires the others to make sense. The CISO & vendor briefing sits next to them as the security-and-procurement companion.

Briefing 01 · A4

The operating model for trustworthy AI

The umbrella thesis. Why "deploy first, govern later" fails in regulated environments, the four-layer discipline, and the four-layer architecture that enforces it, with the request flow drawn end to end.

For CDO, Chief AI Officer, Head of AI Ops

Briefing 02 · A4

Verification & evaluation

The reliability-level method, drawn as a pipeline: sample, canonicalise, calibrate, certify, gate. The deployment gate, the reviewer's screen, the calibration cadence, and the measured results on five public benchmarks.

For Head of AI Ops, technical reviewers, audit, model risk

Briefing 03 · A4

Agent governance in production

Agent passport, registry, gates, the three-tier approval model, runtime monitoring and incident response, with the layered defence drawn against the measured exploitation surface. Six open-source repositories named.

For CISO, Head of AI Ops, Head of Engineering

Briefing 04 · A4

Risk & compliance mapping

EU AI Act, ISO/IEC 42001, NIST AI RMF, SR 11-7, PRA SS1/23, DORA. Each clause mapped onto the four layers, with the compliance-evidence flow showing which artefact answers which regulator, and the cross-framework matrix.

For Head of Risk, Compliance, Procurement, vendor risk

Briefing 05 · A4

EU AI sovereignty

The four dimensions, decomposed. The five gradients of sovereign cloud, costed. The portability architecture drawn as a model-swap flow. The honest take on European LLMs in 2026, with the constraint that picks each position.

For public-sector CDO, EU banks, ministries, defence-adjacent

Briefing 06 · A4

Cost & FinOps for agentic AI

Why agentic bills explode even as tokens get cheaper, where the money goes, and how to govern performance against cost: the cost anatomy of a request, the meter-attribute-enforce control plane, reliability-aware model routing, and the cost levers.

For CFO, CIO, Head of AI, FinOps, procurement

Briefing 07 · A4

Data & context: the compounding moat

Why the model is rented and the context compounds. Context Kubernetes drawn as a declarative architecture, the reconciliation loop that guarantees freshness, permission-correct retrieval, and the compounding ladder from answering questions to anticipating them.

For CDO, Chief Data Officer, Head of AI, CTO

Briefing 08 · A4

Human oversight: the operator

The automation paradox, and how to beat it. Oversight theatre versus real oversight, routing human attention by calibrated confidence, the approval an agent cannot forge or skip, and the verification skill that decides whether any of it works.

For COO, Head of AI Ops, transformation lead, risk

Briefing 09 · A4

Build vs buy: vendor independence

Rent the commodity, own the moat. The own-versus-rent decision, the five parts of a switching cost, the portability architecture that keeps the model swappable, and where the major agent platforms make your differentiating layer theirs.

For CTO, CIO, Head of Architecture, CDO, procurement

Companion · CISO & vendor briefing

CISO & vendor briefing

The document Cohorte hands to the security team and procurement on first contact. Company posture, data handling, IP, sub-processors, GDPR/DPA/SCC framework, regulatory references. Sits next to the briefings.

For CISO, vendor risk, procurement, legal

Why this practice exists.

The moat is built, not borrowed.

Cohorte's founder built the operating model behind the PwC AI Factory: 60+ AI systems in production, 4,000+ Copilot users, +80% adoption in six months. The same person shipped the open-source control plane on this page and published the research underneath it, the reliability-certification method, the formally verified permission model, and the 10,000-trial exploitation study.

This is not a consultancy that retrofitted itself into AI governance when the topic became fashionable. The frameworks taught in the bootcamp, and the code that enforces them, existed before the trend. And the deepest moat compounds: an organisation that builds this layer learns its own patterns. Month one the system answers; month twelve it anticipates. No competitor can buy twelve months of your context.

What this practice is not.

It is not a compliance checklist with a Cohorte logo at the top. It is not a managed service that takes governance off your team's hands. It is not a vendor-agnostic policy document that any LLM could have written.

Cohorte teaches the operating model and trains the people who will run it, on the same open components you can read, fork and deploy yourself. The artefact you leave with is the discipline and the architecture, not a subscription. If you want a vendor to own your governance for you, the right answer is not Cohorte, and we will say so on the first call.

How this connects to the programs.

Trust and governance is the content. The four Cohorte programs are the installation mechanism.

Program	Where the practice lives in it
The Pilot · 4 weeks · €8K-€12K	One scoped workflow taken to proof: a working, verified prototype with the reliability report behind it. Establishes the bar before any rollout.
Team Bootcamp · 12 weeks · €4,200/seat	Verification (weeks 5-7), trust engineered (weeks 8-9), governance runbook (week 10). One team leaves with the operating model installed for its scope.
Curriculum License · €12,000/year	The role paths include a Governance & Risk path for compliance, audit, and model risk teams. Firm-wide coverage.
AI Readiness Program · from €35,000 · 3 to 6 months	The deepest engagement. The four layers installed across the organisation, with the registry, the gates, the monitoring, and the regulator-facing reporting in place at close.

One thing we will not do. We do not sell a fifth program called "Trust & Governance Bootcamp". The practice is content. It belongs inside the existing programs. A separate line item would dilute the message and confuse procurement. If a vendor offers you a standalone governance program, ask them what changes about the rest of the curriculum.

The references.

The frameworks on this page are not opinions. Each maps to public regulation or to published research. The research is on the research page; the papers are preprints under review.

Source	Where it lives in the briefings
EU AI Act (Regulation 2024/1689)	Briefings 01, 04, 05. High-risk classification, Articles 9-17, conformity assessment.
ISO/IEC 42001 (2023)	Briefings 01, 04. AI management system clauses 4-10.
NIST AI RMF 1.0 + GenAI Profile (2024)	Briefings 01, 04. Govern, Map, Measure, Manage.
SR 11-7 (Federal Reserve, model risk)	Briefing 04. Mapped to the verification & monitoring layers.
PRA SS1/23 (UK)	Briefing 04. AI & model risk supervisory statement.
DORA (Regulation 2022/2554)	Briefings 04, 05. Operational resilience, third-party risk, incident reporting.
OWASP LLM Top 10 (2025)	Briefing 03. Threat surface, guardrails, runtime defences.
Black-box reliability certification (Mouzouni, 2026, preprint)	Briefing 02. Self-consistency sampling + conformal calibration.
Mapping the exploitation surface (Mouzouni, 2026, preprint)	Briefing 03. The 10,000-trial empirical study of agent attack classes.
A formal permission model for agent context (Mouzouni, 2026, preprint)	Briefings 01, 03. TLA+-verified isolation invariants; the platform comparison.
Bainbridge, 1983 "Ironies of Automation"	Briefing 01. The notification & intensification trap.
Lee & See, 2004 "Trust in Automation"	Briefing 01. The human-machine calibration literature.

One discovery call. Bring the CISO, the CDO, the Head of AI or the Head of Risk.

Sixty minutes with our founder. We walk the four layers and the architecture against your stack, name the gaps honestly, and you leave with a one-page summary the rest of the team can read. No deck. No funnel.

Email the founder
