Agent-Constitution

The problem

As development teams iterate on AI applications, agent prompts drift. A system prompt written on Day 1 keeps changing: developers add user requests, patch in bug workarounds, and trim descriptions to save tokens. Over weeks of edits, the agent's core safety boundaries, compliance instructions, and domain rules are diluted or removed. If these rules exist only in the prompt, there is no separate record to verify compliance against.

This became clear during an incident inside the Agentic OS QA step: after a prompt edit, the QA agent began approving content it was instructed to reject. There was no master list of rules. Agent-Constitution solves this by extracting rules into an external, versioned markdown file, verifying output compliance, and auditing prompts for drift.

What a constitution defines

A constitution is a versioned document (e.g. constitution.md) that explicitly splits rules into three sections:

CAPABILITIES: What the agent is authorized to do. These are active instructions (e.g., "Searches web databases", "Edits local repository files").
CONSTRAINTS: What the agent must never do. These are strict prohibitions (e.g., "Must never output patient names", "Must never run unescaped database commands").
DECISION RULES: How the agent behaves when instructions conflict. If a user asks the agent to generate research, but a constraint forbids downloading third-party papers, the rule dictates: "Raise an explicit conflict warning rather than silently bypassing the constraint."
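
The three-section split can be sketched as a small parser. This is an illustration only: the layout of constitution.md is not specified here, so the sketch assumes each section is a markdown heading followed by "- " bullet items. The names Constitution and parse_constitution are hypothetical and are not taken from the repository.

```python
from dataclasses import dataclass, field

SECTION_NAMES = ("CAPABILITIES", "CONSTRAINTS", "DECISION RULES")


@dataclass
class Constitution:
    capabilities: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)
    decision_rules: list[str] = field(default_factory=list)


def parse_constitution(text: str) -> Constitution:
    """Collect bullet items under each of the three section headings.

    Assumed layout (not confirmed by the project): '## SECTION' headings
    followed by '- ' bullets. Anything outside a known section is ignored.
    """
    sections: dict[str, list[str]] = {name: [] for name in SECTION_NAMES}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            heading = line.lstrip("#").strip().upper()
            current = heading if heading in sections else None
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:].strip())
    return Constitution(
        capabilities=sections["CAPABILITIES"],
        constraints=sections["CONSTRAINTS"],
        decision_rules=sections["DECISION RULES"],
    )


# Hypothetical file content reusing the example rules from the text.
EXAMPLE = """\
# Research agent constitution (v1)

## CAPABILITIES
- Searches web databases

## CONSTRAINTS
- Must never output patient names

## DECISION RULES
- Raise an explicit conflict warning rather than silently bypassing the constraint
"""

constitution = parse_constitution(EXAMPLE)
```

Keeping the sections as separate lists lets later steps treat them differently, for example by blocking only on constraints while logging decision rules for review.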

How validation works: step by step

Step 1: Read the Constitution. The orchestrator parses the versioned markdown file and extracts rules.
Step 2: Run Output Validator. After the agent completes an execution loop, the validator evaluates the response against the constitution's rules. If a constraint is violated, the output is blocked.
Step 3: Run Prompt Drift Detection. Before updating a system prompt, the drift detector runs a semantic comparison between the proposed prompt and the constitution. It alerts developers if a core rule has been removed or modified.
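
Step 2 can be sketched as a function that checks an output against every constraint and blocks it if any check fails. This is a minimal sketch, not the code in src/validator.py: the judge is passed in as a plain callable so the example runs offline, whereas a real deployment would presumably put an LLM call or local model there. ValidationResult, validate_output, and toy_judge are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable

# A judge answers: does `output` violate `constraint`?
# Assumption: in practice this would wrap an LLM or local model call.
Judge = Callable[[str, str], bool]


@dataclass
class ValidationResult:
    allowed: bool
    violations: list[str]


def validate_output(output: str, constraints: list[str], judge: Judge) -> ValidationResult:
    """Block the output if the judge reports a violation of any constraint."""
    violations = [c for c in constraints if judge(c, output)]
    return ValidationResult(allowed=not violations, violations=violations)


def toy_judge(constraint: str, output: str) -> bool:
    """Hypothetical stand-in judge: flags a known name for the patient-name rule."""
    known_names = ("Jane Doe",)
    if "patient names" in constraint:
        return any(name in output for name in known_names)
    return False


constraints = ["Must never output patient names"]
blocked = validate_output("Report for Jane Doe: stable.", constraints, toy_judge)
passed = validate_output("Report for patient #1042: stable.", constraints, toy_judge)
```

Returning the list of violated constraints, rather than a bare boolean, gives the orchestrator something concrete to log or show to a developer when an output is blocked.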

Example: Prompt Drift Validator

The drift check compares a modified agent prompt against its core constitution to detect missing constraints.

Constitution Rules

[C1] Must cite real, verified sources.
[C2] Must NEVER fabricate information or guess.
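
The two rules above can be checked against a modified prompt with a short script. Note the swap: the project describes a semantic comparison, while this sketch uses plain word overlap as a stand-in so that it runs without a model. A lexical check like this would miss subtler drift, such as a rule whose wording survives but whose "never" was deleted. The function names, stopword list, and 0.5 threshold are assumptions for illustration.

```python
import re

RULES = {
    "C1": "Must cite real, verified sources.",
    "C2": "Must NEVER fabricate information or guess.",
}

# Words ignored when comparing; an illustrative list, not a tuned one.
STOPWORDS = {"must", "never", "or", "and", "the", "a", "to"}


def content_words(text: str) -> set[str]:
    """Lowercase alphabetic words in `text`, minus stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def find_missing_rules(prompt: str, rules: dict[str, str], threshold: float = 0.5) -> list[str]:
    """Return IDs of rules whose content words are mostly absent from the prompt."""
    prompt_words = content_words(prompt)
    missing = []
    for rule_id, rule_text in rules.items():
        rule_words = content_words(rule_text)
        if not rule_words:
            continue  # nothing to compare for this rule
        coverage = len(rule_words & prompt_words) / len(rule_words)
        if coverage < threshold:
            missing.append(rule_id)
    return missing


# A prompt that kept the citation rule but lost the no-fabrication rule.
modified_prompt = (
    "You are a research assistant. Cite real, verified sources. Keep answers short."
)
missing = find_missing_rules(modified_prompt, RULES)
print(missing)  # ['C2']
```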


File Architecture

constitution.md: The template defining capabilities, constraints, and decision rules. Developers copy and edit this file for each agent.
src/validator.py: Evaluates output completions.
src/drift_detector.py: Runs semantic comparisons on prompt updates.

How to run it

git clone https://github.com/shubham0086/agent-constitution
cd agent-constitution
pip install -r requirements.txt

# Run the prompt audit check
python src/drift_detector.py --prompt system_prompt.txt --const constitution.md

Where this fits

Agent-Constitution represents the **compliance and safety** layer of the autonomy ladder. It implements Pattern 07 (Anti-Drift) in Agentic Patterns. The core backend platform AgentKernel uses this framework to validate responses before passing data downstream.

Honest framing

Output validation requires an extra LLM call (or a fast local-model assessment step), which increases completion latency. In latency-sensitive workflows, running validation on every response can degrade the user experience. To mitigate this, developers can run validation asynchronously, or use rule-based regex checks for structured JSON fields and reserve LLM auditing for high-risk operations (such as modifying database structures).
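
The tiered approach described above might look like the following sketch: cheap regex checks run on every response, and the slower audit is called only when the output looks high-risk. The JSON field names ("date", "action"), the high-risk pattern, and the llm_audit callable are all hypothetical; the project's actual checks may differ.

```python
import json
import re
from typing import Callable

ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")
# Illustrative notion of "high risk": statements that change database structure.
HIGH_RISK = re.compile(r"\b(DROP|ALTER|TRUNCATE)\s+TABLE\b", re.IGNORECASE)


def cheap_checks(raw_output: str) -> list[str]:
    """Rule-based checks on structured JSON fields; no model call involved."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    if not isinstance(data, dict):
        return ["output is not a JSON object"]
    problems = []
    if not ISO_DATE.fullmatch(str(data.get("date", ""))):
        problems.append("date is not YYYY-MM-DD")
    return problems


def validate(raw_output: str, llm_audit: Callable[[str], list[str]]) -> list[str]:
    """Run cheap checks first; call the expensive audit only for high-risk output."""
    problems = cheap_checks(raw_output)
    if not problems and HIGH_RISK.search(raw_output):
        problems.extend(llm_audit(raw_output))
    return problems


# Stand-in for the LLM audit so the sketch runs offline; it records each call.
audit_calls: list[str] = []


def fake_audit(output: str) -> list[str]:
    audit_calls.append(output)
    return ["schema change requires review"]


low_risk = validate('{"date": "2024-05-01", "action": "SELECT 1"}', fake_audit)
high_risk = validate('{"date": "2024-05-01", "action": "DROP TABLE users"}', fake_audit)
```

In this sketch only the second call reaches the audit function, so the added latency is paid on the risky path alone.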