Constitutional AI
Why you need a value compass 🧭 for your AI.
Integrity shows in what your AI agent does, not in what you say.
✨ Why this is worth your time (3 min overview before a 12–15 min read)
If you’re a Chief AI Officer, AI project lead or decision-maker, this piece helps you:
1) Understand why Anthropic’s “Constitutional AI” is more than branding: it’s a template for governance.
2) See how my ADA Framework and the Value Register turn values into concrete requirements for AI agents.
3) Get a practical 7-question checklist to give your own AI systems a value-based constitution before they become semi-autonomous actors in a volatile world.
🧨 The Bit: A Sunday in Code Red Mode
It’s a bright Sunday. Outside, the world looks peaceful.
Inside, my screen is filled with CNN live coverage of yet another geopolitical escalation, and with updates on how frontier-scale AI models are being woven into military, intelligence and security infrastructures.
2026 is the year AI grows up:
the EU AI Act enters enforcement, ISO 42001 lands on board agendas, and frontier models cross the line from tools to semi-autonomous actors embedded in critical decision chains.
The technical story is impressive.
The ethical story is terrifying.
The question I can’t shake off:
🧠 What happens when powerful GenAI systems stop being “just software” and start participating, autonomously, in how conflicts are monitored, framed and ultimately fought?
We are not just deploying models.
We are, in effect, deploying agents.
🛡️ Anthropic’s Fight Against the Windmills
In this landscape, Anthropic’s strategy feels almost quixotic, in the best sense.
While most players optimise for scale, speed and dominance, Anthropic insists on something deeply unfashionable: a constitution for its models.
Their approach, Constitutional AI, trains systems like Claude to follow a written set of high-level principles. These draw on sources such as the Universal Declaration of Human Rights and other normative documents, and they are used as a reference for self-critique and revision during training.
In simple terms:
- A model drafts an answer.
- The draft is critiqued against a principle from the constitution and then revised; in Anthropic’s published method, the model critiques and revises its own output.
- Claude is fine-tuned on the revised, more aligned outputs.
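To make the draft, critique and fine-tune steps above more tangible, here is a minimal Python sketch. It is not Anthropic’s implementation: `toy_generate`, `toy_critique`, `toy_revise` and the two-line `CONSTITUTION` are hypothetical stand-ins for real model calls and a real constitution, and the fine-tuning step is only represented by the training pairs that get collected.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical two-principle "constitution", for illustration only.
CONSTITUTION: List[str] = [
    "Do not insult the user.",
    "Do not present guesses as certainties.",
]


@dataclass
class TrainingPair:
    prompt: str
    revised_answer: str


def toy_generate(prompt: str) -> str:
    """Stand-in for a model call: returns a deliberately flawed draft."""
    return f"Obviously, stupid: the answer to '{prompt}' is definitely 42."


def toy_critique(answer: str, principles: List[str]) -> List[str]:
    """Stand-in critique: keyword checks instead of a model's judgement."""
    violated = []
    if "stupid" in answer:
        violated.append(principles[0])
    if "definitely" in answer:
        violated.append(principles[1])
    return violated


def toy_revise(answer: str, violated: List[str]) -> str:
    """Stand-in revision: rewrites only the parts the critique flagged."""
    revised = answer
    if CONSTITUTION[0] in violated:
        revised = revised.replace("Obviously, stupid: the", "The")
    if CONSTITUTION[1] in violated:
        revised = revised.replace("definitely", "probably")
    return revised


def build_finetuning_set(
    prompts: List[str],
    generate: Callable[[str], str],
    critique: Callable[[str, List[str]], List[str]],
    revise: Callable[[str, List[str]], str],
) -> List[TrainingPair]:
    """Draft, critique against the constitution, revise, collect the result."""
    pairs = []
    for prompt in prompts:
        draft = generate(prompt)
        violated = critique(draft, CONSTITUTION)
        final = revise(draft, violated) if violated else draft
        pairs.append(TrainingPair(prompt, final))
    return pairs


pairs = build_finetuning_set(
    ["What is 6 x 7?"], toy_generate, toy_critique, toy_revise
)
print(pairs[0].revised_answer)
```

In a real pipeline the collected pairs would feed a supervised fine-tuning run; the point of the sketch is only that the constitution acts on the training data, not on a filter bolted onto the output.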
Constitutional AI is not a cosmetic safety filter.
It is a deliberate attempt to give the model an internal value compass, not only a list of forbidden phrases.
“Designing Claude’s Soul”
Anthropic researcher Amanda Askell has described her work as “designing Claude’s soul”: the difficult task of deciding which principles should guide the model when humans are not watching closely.
Behind the metaphor lies a sober reality:
- Claude has a detailed “constitution” / “soul document” that specifies how it should reason about honesty, helpfulness, non-harm and care for humans.
- This document is not marketing; it is used as a training signal and as a reference for model behaviour over time.
Integrity here is not what a company *says* on stage.
It is what it is willing to give up (speed, convenience, unbounded capability) to stay within a value frame.
In a winner-takes-all, security-driven AI race, that stance is as fragile as David facing multiple Goliaths.
But it sets a crucial precedent: values can be engineered into systems, not only printed in ESG reports.
🧩 Why Every Chief AI Officer Should Care
For CAIOs and AI project leaders, Anthropic’s work exposes three uncomfortable truths:
1️⃣ Every powerful AI already has a constitution.
Even if you never write it down, your data, objectives, reward signals and red lines implicitly define what the system is allowed to optimise for.
2️⃣ If you don’t design the constitution, someone else does.
Frontier labs, defence agencies, platform providers: their defaults become your de facto value system if you simply plug their APIs into your workflows.
3️⃣ Value alignment is a governance question, not a UX layer.
You can’t patch misaligned incentives with a nicer interface. Constitutional methods move the discussion into training, evaluation and monitoring, where it belongs.
The New York Times has noted that such approaches both tame model behaviour and concentrate power: developers decide which rules to adopt, while democratic oversight remains weak.
You may not control how Anthropic, OpenAI or Google write *their* constitutions.
You can, and must, design the constitution of the systems you deploy.
From MVP to MValP: Why “Value First” Beats “Ship Fast”
Most AI initiatives still follow the old software playbook:
- build a Minimum Viable Product (MVP),
- prove technical feasibility,
- retrofit a business and ethics story later.
The result is predictable: around 80% of AI pilots never reach production, even though 90% of your employees frequently use ChatGPT for work tasks. The cause is not weak models; it is that the value proposition and stakeholder impact were never clearly defined.
In the ADA AI Value Creation Framework, I reversed this logic:
- Instead of an MVP, we design a Maximum Valuable Product (MValP).
- Instead of “Can we build it?”, we ask “How should we build it?” and start with “Who must benefit, who could be harmed, and which values are non-negotiable?”.
Consequences:
- Value-first scoping: stakeholder mapping, human-rights considerations, “Cui bono?” at the very beginning.
- Ethics as engineering constraint: values become hard requirements before the first line of code or prompt template.[3][1]
- Audit-ready by design: requirements are traceable into logs, tests and KPIs, aligned with ISO/IEC/IEEE 24748-7000 and ISO/IEC 42001.
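What “traceable into logs, tests and KPIs” could mean in practice is sketched below. This is an illustrative assumption, not part of the ADA Framework or of any standard: the `Requirement` class, the `traceability_gaps` helper and all identifiers such as `REQ-001` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Requirement:
    """One value-derived requirement and the artefacts that evidence it."""
    req_id: str
    value: str
    tests: List[str] = field(default_factory=list)
    log_fields: List[str] = field(default_factory=list)
    kpis: List[str] = field(default_factory=list)


def traceability_gaps(requirements: List[Requirement]) -> Dict[str, List[str]]:
    """Return, per requirement, which kinds of evidence are still missing."""
    gaps: Dict[str, List[str]] = {}
    for req in requirements:
        links = {
            "tests": req.tests,
            "log_fields": req.log_fields,
            "kpis": req.kpis,
        }
        missing = [kind for kind, items in links.items() if not items]
        if missing:
            gaps[req.req_id] = missing
    return gaps


# Hypothetical example data.
requirements = [
    Requirement(
        "REQ-001",
        "accountability",
        tests=["test_human_review_gate"],
        log_fields=["reviewer_id"],
        kpis=["share_of_reviewed_decisions"],
    ),
    Requirement("REQ-002", "transparency", tests=["test_rationale_logged"]),
]

gaps = traceability_gaps(requirements)
print(gaps)
```

A check like this can run in a build pipeline, so that a requirement without a test, a log field or a KPI is flagged before release rather than during an audit.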
MValP lets you do for your own systems, measurably in the design and with real organisational impact, what Anthropic did for Claude:
write down what “good behaviour” means before you let the system loose.
ADA’s Value Register: A Constitution for AI Agents
In ADA, this becomes a concrete artefact: the Value Register, the bridge between human values and system behaviour.
It links three layers:
1️⃣ Abstract core values
Human dignity, autonomy, fairness, transparency, accountability, protection of vulnerable groups, as in the “Human Rights for AI Agents & Autonomous Systems” visual.
2️⃣ Experienced quality
How these values should *feel* for stakeholders: respectful communication, non-discriminatory treatment, meaningful explanations, right to contest decisions.
3️⃣ Concrete system requirements
Testable rules, for example:
- no use of protected attributes for risk scoring unless explicitly justified,
- mandatory human review for high-impact decisions,
- logging of rationale for contested outputs,
- strict limits on autonomous escalation in safety-critical contexts.
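The three layers and the first three example rules above can be sketched as a small data structure. This is a hedged illustration under assumptions of my own, not the ADA tooling: `RegisterEntry`, `VALUE_REGISTER`, `PROTECTED_ATTRIBUTES`, `violations` and the keys of the decision record are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Decision = Dict[str, Any]

# Hypothetical list of protected attributes for this example.
PROTECTED_ATTRIBUTES = {"gender", "ethnicity", "religion"}


@dataclass(frozen=True)
class RegisterEntry:
    core_value: str            # layer 1: abstract core value
    experienced_quality: str   # layer 2: how it should feel for stakeholders
    requirement: str           # layer 3: concrete, testable rule
    check: Callable[[Decision], bool]  # True when the decision complies


VALUE_REGISTER: List[RegisterEntry] = [
    RegisterEntry(
        "fairness",
        "non-discriminatory treatment",
        "no protected attributes in risk scoring unless explicitly justified",
        lambda d: not (PROTECTED_ATTRIBUTES & set(d["features_used"]))
        or bool(d.get("justification")),
    ),
    RegisterEntry(
        "accountability",
        "right to contest decisions",
        "mandatory human review for high-impact decisions",
        lambda d: d["impact"] != "high" or d.get("human_reviewed") is True,
    ),
    RegisterEntry(
        "transparency",
        "meaningful explanations",
        "logged rationale for contested outputs",
        lambda d: not d.get("contested") or bool(d.get("rationale")),
    ),
]


def violations(decision: Decision) -> List[str]:
    """List every requirement in the register that the decision breaks."""
    return [e.requirement for e in VALUE_REGISTER if not e.check(decision)]


# Hypothetical decision record emitted by an agent.
decision = {
    "features_used": ["income", "gender"],
    "impact": "high",
    "human_reviewed": False,
    "contested": True,
    "rationale": "score below threshold",
}
print(violations(decision))
```

Keeping the value, the experienced quality and the check in one entry is the design choice that matters here: every failed check can be reported in terms of the value it protects, not only as a technical error.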
For autonomous AI agents, humanoids, drones and digital twins, this Value Register functions as a mini-constitution:
- it defines what the agent is allowed to optimise for,
- it encodes which human-rights principles override pure efficiency,
- it makes trade-offs explicit and auditable.
That’s how you move from “We trust our vendor” to “We can show how our agents respect human dignity and fundamental rights in practice.”
Arts and Culture as Lighthouse Sector
When I train Certified AI Excellence Managers in arts and culture, I see the sector as a lighthouse for value-aligned AI.
Cultural institutions:
- curate contested memories and plural narratives,
- protect fragile heritage and complex publics,
- live every day in the tension between freedom of expression, inclusion and responsibility.
They are ideal pioneers for a different AI story:
- Courageous curiosity: exploring GenAI as a medium, not a gimmick.
- Antifragility: treating missteps as learning events, not PR disasters.
- Strong alliances: building networks of museums, theatres, archives and festivals that share governance patterns.
Their AI projects become public demonstrations of how to choose between:
- what we *could* do with GenAI (deepfakes, behavioural nudging, opaque visitor analytics), and
- what we *should* do with GenAI (augmented interpretation, accessibility, participatory storytelling),
…if we want a livable coexistence with a new class of communicating, sometimes autonomous deus ex machina.
Because we are no longer just configuring tools.
We are designing digital personalities: systems that interact, persuade, remember and adapt faster than any governance committee can meet.
The question is not whether they will have a value system.
The question is whose value system it will be.
The Bytes: Checklist for Your AI’s Constitution
If you are a CAIO, AI project manager or decision-maker, take these seven questions into your next roadmap session:
1. Do we have an explicit Value Register for this system, or are we relying on vendor defaults?
2. Which 5–10 human-rights principles are truly non-negotiable for this use case, and where are they reflected in requirements, tests and logs?
3. Where in our lifecycle do we run a fundamental-rights impact assessment: before deployment, or only after something goes wrong?
4. Who owns the constitution of this AI agent: is responsibility fragmented, or anchored in a clear AI governance function?
5. How do we handle conflicts between capability and values: do we have a documented decision path, and the courage to say no when the most profitable option violates a principle?
6. Are our cultural, educational or public-facing projects treated as lighthouse cases, or do we quietly experiment where we hope nobody will notice?
7. If this agent were a colleague, would we be proud of its character, and if not, why are we comfortable letting it interact with our customers, users or citizens?
Closing: Before the Next Crisis Hits
The geopolitical situation will not calm down soon.
Frontier AI will continue to flow into defence, security and other highâstakes domains.
We may not be able to stop that trend.
But we can decide how we, in our organisations, design and deploy AI agents.
We can keep pretending they are neutral tools.
Or we can accept that we are, in a very real sense, writing their constitutions.
My proposal:
Treat every consequential AI system as a political subject in miniature.
Give it a clear, value-based constitution, and hold yourselves accountable for it.
Because if we don’t, someone else will.
And we may not like the values their agents defend when the next crisis hits.
Value-based Sources
[2] Constitutional AI: Harmlessness from AI Feedback
[3] Collective Constitutional AI: Aligning a Language Model with Public Input
[4] Specific versus General Principles for Constitutional AI
[5] Can You Teach Claude to be 'Good'? - YouTube
[6] Claude 4.5 Opus' Soul Document - LessWrong https://www.lesswrong.com/posts/vpNG99GhbBoLov9og/claude-4-5-opus-soul-document
[7] Claude's Constitution - AI Governance Library
[8] On 'Constitutional' AI - Digital Constitutionalist
[9] Inside the White-Hot Center of A.I. Doomerism - The New York Times
[10] What if We Could All Control A.I.? - The New York Times
[11] Collective Constitutional AI: Aligning a Language Model with Public Input
[12] How We the People Lost Control of Our Lives, and How We Can Get ...



