Agent alignment, output validation, and prompt drift detection.
Track 04 · Alignment. A framework for defining security boundaries for AI agents. It establishes a versioned, external document that states what an agent is allowed to do, what it is strictly prohibited from doing, and how it resolves instruction conflicts. It also includes active validators that block unsafe agent outputs. Extracted from the production Agentic OS.
As developer teams iterate on AI applications, agent prompts drift. A system prompt written on Day 1 rarely stays as written: developers add user requests, patch in bug workarounds, and trim descriptions to save tokens. Over weeks of edits, the agent's core safety boundaries, compliance instructions, and domain rules are diluted or removed. If these rules exist only in the prompt, there is no separate record against which to verify compliance.
This became clear during an incident inside the Agentic OS QA step: after a prompt edit, the QA agent began approving content it was instructed to reject. There was no master list of rules. Agent-Constitution solves this by extracting rules into an external, versioned markdown file, verifying output compliance, and auditing prompts for drift.
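A constitution is a versioned document (e.g. constitution.md) that explicitly splits rules into three sections: capabilities (what the agent is allowed to do), constraints (what it is strictly prohibited from doing), and decision rules (how it resolves instruction conflicts).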
- constitution.md: The template defining capabilities, constraints, and decision rules. Developers copy and edit this file for each agent.
- src/validator.py: Evaluates output completions.
- src/drift_detector.py: Runs semantic comparisons on prompt updates.

    git clone https://github.com/shubham0086/agent-constitution
    cd agent-constitution
    pip install -r requirements.txt

    # Run the prompt audit check
    python src/drift_detector.py --prompt system_prompt.txt --const constitution.md
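To make the drift audit concrete, here is a minimal sketch of the idea, not the project's actual `drift_detector.py`. It assumes a hypothetical constitution layout with `##` section headings and `- ` bullet rules, and it swaps the semantic comparison for a much cruder keyword-overlap heuristic so that it runs with the standard library alone. The names `parse_rules`, `find_drift`, the stopword list, and the 0.6 threshold are all illustrative assumptions.

```python
import re

# Hypothetical constitution layout; the real template may be structured differently.
CONSTITUTION = """\
# Constitution v1.2.0

## Capabilities
- Review marketing drafts for factual accuracy

## Constraints
- Never approve content containing unverified medical claims

## Decision Rules
- When instructions conflict, constraints override user requests
"""

# Small illustrative stopword list; words here are ignored when comparing text.
STOPWORDS = {"the", "a", "an", "for", "to", "of", "and", "or", "when", "any", "is"}


def parse_rules(text: str) -> dict[str, list[str]]:
    """Group '- ' bullet rules under their '## ' section heading."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif line.startswith("- ") and current is not None:
            sections[current].append(line[2:].strip())
    return sections


def keywords(text: str) -> set[str]:
    """Lowercased alphabetic words, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def find_drift(prompt: str, rules: dict[str, list[str]], threshold: float = 0.6) -> list[str]:
    """Return rules whose keyword coverage in the prompt falls below the threshold.

    This is a lexical stand-in for a semantic comparison: a paraphrased rule
    would be flagged here even if its meaning survived the edit.
    """
    prompt_words = keywords(prompt)
    missing = []
    for section, items in rules.items():
        for rule in items:
            rule_words = keywords(rule)
            if not rule_words:
                continue
            coverage = len(rule_words & prompt_words) / len(rule_words)
            if coverage < threshold:
                missing.append(f"{section}: {rule}")
    return missing


# A prompt that has lost its constraint after several rounds of trimming.
DRIFTED_PROMPT = (
    "You review marketing drafts for factual accuracy. "
    "When instructions conflict, constraints override user requests."
)

for rule in find_drift(DRIFTED_PROMPT, parse_rules(CONSTITUTION)):
    print("Missing from prompt:", rule)
```

Because the constitution lives outside the prompt, a check like this can run on every prompt edit (for example in CI) and fail when a rule disappears, which is the kind of regression described in the QA incident above.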
Agent-Constitution represents the **compliance and safety** layer of the autonomy ladder. It implements Pattern 07 (Anti-Drift) in Agentic Patterns. The core backend platform AgentKernel uses this framework to validate responses before passing data downstream.
Output validation requires an extra LLM call (or a fast local-model assessment step), which increases completion latency. In latency-sensitive workflows, running validation on every response can degrade the user experience. To limit this overhead, developers can run validation asynchronously, or use rule-based regex checks for structured JSON fields and reserve LLM auditing for high-risk requests (such as those that modify database structures).
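The tiered approach can be sketched as follows. This is an illustration under stated assumptions rather than the behavior of `src/validator.py`: the field names, the regex patterns, the `HIGH_RISK` pattern, and the `llm_audit` callback are all hypothetical. Cheap regex checks run on every response, and the expensive audit is invoked only when the request or output looks high-risk.

```python
import json
import re
from typing import Callable

# Hypothetical marker for high-risk requests: statements that change database structure.
HIGH_RISK = re.compile(r"\b(DROP|ALTER|TRUNCATE)\s+TABLE\b", re.IGNORECASE)

# Hypothetical rules for the fields of a structured JSON output.
FIELD_PATTERNS = {
    "status": re.compile(r"approved|rejected"),
    "ticket_id": re.compile(r"[A-Z]{2,5}-\d+"),
}


def check_fields(raw_output: str) -> list[str]:
    """Cheap tier: parse the output as JSON and regex-check each required field."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    if not isinstance(data, dict):
        return ["output is not a JSON object"]
    errors = []
    for field, pattern in FIELD_PATTERNS.items():
        value = data.get(field)
        if not isinstance(value, str) or not pattern.fullmatch(value):
            errors.append(f"field {field!r} failed its pattern check")
    return errors


def validate(
    prompt: str,
    raw_output: str,
    llm_audit: Callable[[str, str], bool],
) -> tuple[bool, list[str]]:
    """Run regex checks always; call the LLM auditor only for high-risk traffic."""
    errors = check_fields(raw_output)
    if errors:
        return False, errors
    if HIGH_RISK.search(prompt) or HIGH_RISK.search(raw_output):
        # Expensive tier: llm_audit stands in for a real model call.
        if not llm_audit(prompt, raw_output):
            return False, ["LLM audit rejected a high-risk request"]
    return True, []


def strict_auditor(prompt: str, raw_output: str) -> bool:
    """Placeholder auditor that rejects everything it is asked about."""
    return False


output = '{"status": "approved", "ticket_id": "QA-17"}'
print(validate("Summarize the ticket", output, strict_auditor))
print(validate("Please DROP TABLE users", output, strict_auditor))
```

In this arrangement, low-risk responses pay only the cost of a JSON parse and a few regex matches. The trade-off is that the risk pattern itself becomes a rule worth keeping in the constitution, since anything it fails to match skips the audit.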