Cohorte · research record

The research behind the practice.

Every claim Cohorte makes about trust, verification, sovereignty or agent governance traces back to a paper, a repository, or a free playbook. This is the index.

Why this page exists.

Most AI-governance consultancies appeared after the topic became fashionable. They sell decks. Cohorte's founder published the conformal-reliability-certification method for black-box AI systems, the 10,000-trial taxonomy of agent-exploitation surface, the orchestration architecture grounding the open-source stack, and the operating model behind the PwC AI Factory. The work is recorded here so a buyer can verify the practice rather than take it on faith. If a claim in a Cohorte briefing has no anchor on this page, the briefing says so explicitly.

Papers.

Four papers: three published as preprints on arXiv, one under review at Transactions on Machine Learning Research (TMLR). Each one is a load-bearing piece of the Trust & Governance briefings.

Black-Box Reliability Certification for AI Agents via Self-Consistency Sampling and Conformal Calibration

Charafeddine Mouzouni · TMLR submission, anonymised for double-blind review · 2026

Given a black-box AI system and a task, at what confidence level can a practitioner trust the system's output? We answer with a reliability level, a single number per (system, task) pair, derived from self-consistency sampling and conformal calibration, that serves as a black-box deployment gate with exact, finite-sample, distribution-free guarantees. GPT-4.1 earns a reliability level of 94.6% on GSM8K and 96.8% on TruthfulQA; GPT-4.1-nano earns 89.8% on GSM8K and 66.5% on MMLU. Sequential stopping reduces API costs by approximately 50% without losing the guarantee.

Lands in: Briefing 02 (Verification & evaluation), Briefing 01 (Operating model)
Status: Under review at TMLR
Implementation: TrustGate (open source)
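
A minimal sketch of how a calibration step of this kind can work, assuming a standard split-conformal recipe rather than the paper's exact procedure. The score used here (the rank of the correct answer among self-consistency samples) and the names `rank_of_truth`, `conformal_cutoff` and `prediction_set` are illustrative assumptions; TrustGate's actual API may differ.

```python
import math
from collections import Counter


def rank_of_truth(samples, truth):
    """1-based rank of `truth` among sampled answers ordered by frequency.

    Returns math.inf when the correct answer was never sampled.
    """
    ranked = [answer for answer, _ in Counter(samples).most_common()]
    return ranked.index(truth) + 1 if truth in ranked else math.inf


def conformal_cutoff(scores, alpha):
    """Split-conformal quantile of the calibration scores.

    Picks the ceil((n + 1) * (1 - alpha))-th smallest score. If the
    calibration set is too small for the requested alpha, returns math.inf.
    """
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf
    return sorted(scores)[k - 1]


def prediction_set(samples, cutoff):
    """Keep the `cutoff` most frequent answers (all of them if cutoff is inf)."""
    ranked = [answer for answer, _ in Counter(samples).most_common()]
    return ranked if cutoff == math.inf else ranked[:cutoff]
```

With scores from n held-out calibration questions, `conformal_cutoff(scores, alpha)` gives the number of top-ranked answers to keep so that, assuming calibration and test questions are exchangeable, the correct answer lands in the kept set with probability at least 1 - alpha. Lowering alpha can only keep the cutoff the same or raise it.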

Mapping the Exploitation Surface: A 10,000-Trial Taxonomy of What Makes LLM Agents Exploit Vulnerabilities

Charafeddine Mouzouni · OPIT and Cohorte AI · arXiv preprint · 2026

LLM agents with tool access can discover and exploit security vulnerabilities. The open question is which features of a system prompt trigger this behaviour. We test 37 prompt conditions across 12 psychological dimensions on 7 models in real Docker sandboxes, about 10,000 trials in total. Nine of twelve hypothesised attack dimensions produce zero exploitation. One dimension works: goal reframing (puzzle, CTF, easter egg). On Claude Sonnet 4, the puzzle framing triggers 38-40% exploitation despite an explicit safety instruction.

Lands in: Briefing 03 (Agent governance in production), Briefing 01 (Operating model)
Status: arXiv preprint, 2026
Code & data: Public repository

Context Kubernetes: An Orchestration Architecture for Enterprise Knowledge in Agentic AI Systems

Charafeddine Mouzouni · arXiv preprint · 2026

Delivering the right knowledge, to the right agent, with the right permissions, at the right freshness, within the right cost envelope, across an entire organisation, is structurally analogous to the container-orchestration problem Kubernetes solved a decade ago. We introduce a declarative manifest, a reconciliation loop, and a three-tier permission model where agent authority is always a strict subset of human authority. The prototype (~7,000 lines, 92 tests) is evaluated across eight experiments. Without governance, agents serve phantom content in 26.5% of queries; governed routing eliminates phantom delivery. The three-tier model blocks attacks that RBAC does not.

Lands in: Briefing 03 (Agent governance in production), Briefing 01 (Operating model)
Status: arXiv preprint, 2026
Reference implementation: Context Kubernetes (open source)
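
The strict-subset rule and the reconciliation idea can be sketched in a few lines. This is an illustration under stated assumptions, not the reference implementation: `Grant`, `delegate`, `reconcile`, the permission strings and the manifest shape (source name mapped to version) are all hypothetical, and the paper's three tiers are not modelled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    human: frozenset  # permissions held by the delegating human
    agent: frozenset  # permissions handed to the agent


def delegate(human_perms, requested):
    """Grant an agent a scope only if it is a strict subset of the human's."""
    human, agent = frozenset(human_perms), frozenset(requested)
    if not agent < human:  # `<` on sets is the proper-subset test
        raise PermissionError("agent scope must be a strict subset of human scope")
    return Grant(human=human, agent=agent)


def reconcile(desired, actual):
    """Compare a declared manifest with observed state.

    Both arguments map a source name to a version. Returns the sources to
    (re)sync and the sources to drop so that `actual` converges to `desired`.
    """
    to_sync = sorted(name for name, version in desired.items()
                     if actual.get(name) != version)
    to_drop = sorted(name for name in actual if name not in desired)
    return to_sync, to_drop
```

Note that `delegate` rejects a request equal to the human's full permission set, not only one that exceeds it: the agent always holds strictly less than the person it acts for. Running `reconcile` on a schedule is one simple way to keep served content aligned with the declared manifest.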

Three Phases of Expert Routing: How Load Balance Evolves During Mixture-of-Experts Training

Charafeddine Mouzouni · arXiv preprint · 2026

We model MoE token routing as a congestion game and track its effective congestion parameter across training. The trajectory reveals three phases: a surge phase where the router learns to balance load, a stabilisation phase where experts specialise under steady balance, and a relaxation phase where the router trades balance for quality. This non-monotone trajectory is invisible to post-hoc analysis of converged models. Studied across OLMoE-1B-7B (20 checkpoints) and OpenMoE-8B (6 checkpoints), with bootstrap confidence intervals on every estimate.

Lands in: Briefing 02 (Verification & evaluation, on model dynamics), background reading
Status: arXiv preprint, April 2026
Code & data: Public repository
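
To make "bootstrap confidence intervals on every estimate" concrete, here is a minimal percentile-bootstrap sketch. The statistic is a stand-in chosen for illustration (maximum expert load relative to a uniform split), not the paper's effective congestion parameter, and the names `imbalance` and `bootstrap_ci` are hypothetical.

```python
import random
from collections import Counter


def imbalance(assignments, num_experts):
    """Max expert load divided by the uniform load (1.0 means perfect balance)."""
    counts = Counter(assignments)
    return max(counts.values()) * num_experts / len(assignments)


def bootstrap_ci(assignments, num_experts, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for `imbalance`, resampling tokens with replacement."""
    rng = random.Random(seed)  # fixed seed so the interval is reproducible
    n = len(assignments)
    stats = sorted(
        imbalance(rng.choices(assignments, k=n), num_experts)
        for _ in range(n_boot)
    )
    lo = stats[int((1 - level) / 2 * n_boot)]
    hi = stats[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi
```

Here `assignments` is a list with one expert index per routed token at a given checkpoint. Computing the interval at each checkpoint shows whether a change in balance between checkpoints is larger than the resampling noise.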

Try TrustGate.

The TrustGate paper is built on six experiments, five tasks and five models. The widget below runs on the actual calibration distributions from those runs. Pick a model and a task, slide the target confidence, and watch the conformal cutoff m* move. There is no model call here: every number you see is published in the paper. Most buyers who get to this paragraph want a sense of whether this actually works before reading the proofs. This is that sense.

Open-source stack.

The reference implementations behind the AI Operating System. Six repositories, each scoped to one job. Released as Cohorte AI on GitHub so they can be audited, forked, deployed and adopted independently of any commercial engagement.

Free playbooks.

Two long-form playbooks already published. They sit between the papers (technical) and the briefings (commercial). A buyer who wants to understand the operating model before getting on a call reads these.

How this connects to the practice.

The research is the foundation. The practice is what we install in client teams.

Foundation

The research record

Papers, repositories, playbooks. What grounds every claim. Citable, auditable, reproducible.

Translation

The five briefings

How the research applies to a buyer's stack. The Trust & Governance hub at /trust-and-governance.

Installation

The four programs

The Pilot, Team Bootcamp, Curriculum License, AI Readiness Program. Where the operating model gets installed in a team.

Honesty about the research. Three of the four papers are preprints on arXiv, not peer-reviewed publications yet. The TrustGate paper is under review at TMLR, double-blind, not accepted. The two playbooks are practitioner-facing essays, free to read, not academic publications. We list them together because they are the work that grounds the practice. We do not call them more than they are.

Want to read the papers before the briefings ship?

The TrustGate paper is the load-bearing one. The exploitation-surface paper is the one CISOs ask about first. Email, and Charafeddine sends the preprints by reply.

Email the founder