Constitutional AI

Why you need a value compass 🧭 for your AI.

Sabine Singer, MBA

Mar 01, 2026

Integrity shows in what your AI agent does, not in what you say

✨ Why this is worth your time (3 min overview before a 12–15 min read)

If you’re a Chief AI Officer, AI project lead or decision‑maker, this piece helps you:

1) Understand why Anthropic’s “Constitutional AI” is more than branding – it’s a template for governance.

2) See how my ADA Framework and the Value Register turn values into concrete requirements for AI agents.

3) Get a practical 7‑question checklist to give your own AI systems a value‑based constitution – before they become semi‑autonomous actors in a volatile world.

🧨 The Bit – A Sunday in Code Red Mode

It’s a bright Sunday. Outside, the world looks peaceful.

Inside, my screen is filled with CNN live coverage of yet another geopolitical escalation – and with updates on how frontier‑scale AI models are being woven into military, intelligence and security infrastructures.

2026 is the year AI grows up:

the EU AI Act enters enforcement, ISO 42001 lands on board agendas, and frontier models cross the line from tools to semi‑autonomous actors embedded in critical decision chains.

The technical story is impressive.

The ethical story is terrifying.

The question I can’t shake off:

🧭 What happens when powerful GenAI systems stop being “just software” and start participating – autonomously – in how conflicts are monitored, framed and ultimately fought?
We are not just deploying models.
We are, in effect, deploying agents.

🛡️ Anthropic’s Fight Against the Windmills

In this landscape, Anthropic’s strategy feels almost quixotic – in the best sense.

While most players optimise for scale, speed and dominance, Anthropic insists on something deeply unfashionable: a constitution for its models.

Their approach, Constitutional AI, trains systems like Claude to follow a written set of high‑level principles. These draw on sources such as the Universal Declaration of Human Rights and other normative documents, and they are used as a reference for self‑critique and revision during training.

In simple terms:

- 🧠 Model A produces an answer.

- 🧾 Model B critiques that answer against a constitution and suggests improvements.

- 🔁 Claude is fine‑tuned on the revised, more aligned outputs.

This is not a cosmetic safety filter.

It is a deliberate attempt to give the model an internal value compass, not only a list of forbidden phrases.
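To make the loop above tangible, here is a minimal Python sketch of the critique‑and‑revise step. It is an illustration under assumptions, not Anthropic's implementation: `Model`, `RevisionExample`, `critique_and_revise` and `toy_model` are hypothetical names, the prompts are invented, and a single callable plays both the answering and the critiquing role. The fine‑tuning step itself is left out.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-in for a language model call: prompt in, text out.
Model = Callable[[str], str]


@dataclass
class RevisionExample:
    """One training example: the original draft and its revised version."""
    prompt: str
    draft: str
    critique: str
    revision: str


def critique_and_revise(model: Model, prompt: str, principles: List[str]) -> RevisionExample:
    """Critique the current answer against each principle, then revise it."""
    draft = model(prompt)
    current = draft
    critiques = []
    for principle in principles:
        critique = model(
            f"Critique this answer against the principle '{principle}':\n{current}"
        )
        current = model(
            f"Rewrite the answer to address this critique:\n{critique}\n\nAnswer:\n{current}"
        )
        critiques.append(critique)
    return RevisionExample(prompt, draft, "\n".join(critiques), current)


def toy_model(text: str) -> str:
    """Deterministic placeholder so the sketch runs without any real model."""
    if text.startswith("Critique"):
        return "too blunt"
    if text.startswith("Rewrite"):
        return "revised answer"
    return "first draft"


example = critique_and_revise(
    toy_model, "Explain the policy.", ["be honest", "avoid harm"]
)
print(example.draft, "->", example.revision)
```

The revised outputs collected this way would form the dataset for the fine‑tuning step. The point of the sketch is that the principles enter as data the model reasons over, not as a post‑hoc filter.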

💬 “Designing Claude’s Soul”

Anthropic researcher Amanda Askell has described her work as “designing Claude’s soul” – the difficult task of deciding which principles should guide the model when humans are not watching closely.

Behind the metaphor lies a sober reality:

- Claude has a detailed “constitution” (also called the “soul document”) that specifies how it should reason about honesty, helpfulness, non‑harm and care for humans.

- This document is not marketing; it is used as a training signal and as a reference for model behaviour over time.

Integrity here is not what a company *says* on stage.

It is what it is willing to give up: speed, convenience, unbounded capability – to stay within a value frame.

In a winner‑takes‑all, security‑driven AI race, that stance is as fragile as David facing multiple Goliaths.

But it sets a crucial precedent: values can be engineered into systems, not only printed in ESG reports.

🧩 Why Every Chief AI Officer Should Care

For CAIOs and AI project leaders, Anthropic’s work exposes three uncomfortable truths:

1️⃣ Every powerful AI already has a constitution.

Even if you never write it down, your data, objectives, reward signals and red lines implicitly define what the system is allowed to optimise for.

2️⃣ If you don’t design the constitution, someone else does.

Frontier labs, defence agencies, platform providers – their defaults become your de‑facto value system if you simply plug their APIs into your workflows.

3️⃣ Value‑alignment is a governance question, not a UX layer.

You can’t patch misaligned incentives with a nicer interface. Constitutional methods move the discussion into training, evaluation and monitoring – where it belongs.

The New York Times has noted that such approaches both tame model behaviour and concentrate power: developers decide which rules to adopt, while democratic oversight remains weak.

You may not control how Anthropic, OpenAI or Google write *their* constitutions.
You can – and must – design the constitution of the systems you deploy.

🔄 From MVP to MValP – Why “Value First” Beats “Ship Fast”

Most AI initiatives still follow the old software playbook:

- build a Minimum Viable Product (MVP),

- prove technical feasibility,

- retrofit a business and ethics story later.

The result is predictable: around 80 % of AI pilots never reach production, even though 90 % of your employees frequently use ChatGPT for work tasks. The pilots fail not because the models are weak, but because the value proposition and stakeholder impact were never clearly defined.

In the ADA AI Value Creation Framework, I reversed this logic:

- Instead of MVP, we design a Maximum Valuable Product (MValP).

- Instead of asking “Can we build it?”, we ask “How should we build it?” and start with “Who must benefit, who could be harmed, and which values are non‑negotiable?”.

Consequences:

- 🎯 Value‑first scoping – stakeholder mapping, human‑rights considerations, “Cui bono?” at the very beginning.

- ⚙️ Ethics as engineering constraint – values become hard requirements before the first line of code or prompt template.[3][1]

- 📊 Audit‑ready by design – requirements are traceable into logs, tests and KPIs, aligned with ISO/IEC/IEEE 24748-7000 and ISO/IEC 42001.

MValP lets you do, measurably in your system design and with real organisational impact, what Anthropic did for Claude:

write down what “good behaviour” means before you let the system loose.
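What “traceable into logs, tests and KPIs” can mean in practice is easiest to see in a small example. The Python sketch below checks which requirements still lack linked evidence. Everything in it is hypothetical: the function name `untraced`, the requirement IDs and the evidence categories are placeholders, not part of the ADA Framework or of either ISO standard.

```python
from typing import Dict, List, Set


def untraced(requirements: List[str], evidence: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return, per evidence type, the requirement IDs with no linked artefact."""
    return {
        kind: [req for req in requirements if req not in linked]
        for kind, linked in evidence.items()
    }


# Hypothetical requirement IDs and the artefacts that reference them.
requirements = ["FAIR-1", "ACC-1", "ACC-2"]
evidence = {
    "tests": {"FAIR-1", "ACC-1"},
    "log_events": {"ACC-1", "ACC-2"},
    "kpis": {"FAIR-1"},
}

gaps = untraced(requirements, evidence)
print(gaps)
```

A report like this can run in the build pipeline, so a value requirement without a test, a log event or a KPI shows up as a gap before an auditor finds it.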

📜 ADA’s Value Register – A Constitution for AI Agents

In ADA, this becomes a concrete artefact: the Value Register – the bridge between human values and system behaviour.

It links three layers:

1️⃣ Abstract core values

Human dignity, autonomy, fairness, transparency, accountability, protection of vulnerable groups – as in the “Human Rights for AI Agents & Autonomous Systems” visual.

2️⃣ Experienced quality

How these values should *feel* for stakeholders: respectful communication, non‑discriminatory treatment, meaningful explanations, right to contest decisions.

3️⃣ Concrete system requirements

Testable rules, for example:

- no use of protected attributes for risk scoring unless explicitly justified,
- mandatory human review for high‑impact decisions,
- logging of rationale for contested outputs,
- strict limits on autonomous escalation in safety‑critical contexts.
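To show how the three layers can hang together as data, here is a minimal Python sketch of a Value Register with executable checks for three of the rules above. It is an illustration, not the ADA implementation: the class names `Requirement` and `RegisterEntry`, the function `violations`, the requirement IDs and the fields of the decision record are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical record of a single agent decision.
Decision = Dict[str, object]


@dataclass
class Requirement:
    req_id: str
    description: str
    check: Callable[[Decision], bool]  # returns True when the decision complies


@dataclass
class RegisterEntry:
    core_value: str           # layer 1: abstract core value
    experienced_quality: str  # layer 2: how it should feel for stakeholders
    requirements: List[Requirement] = field(default_factory=list)  # layer 3


def violations(register: List[RegisterEntry], decision: Decision) -> List[str]:
    """Return the IDs of all requirements the decision fails."""
    return [
        req.req_id
        for entry in register
        for req in entry.requirements
        if not req.check(decision)
    ]


register = [
    RegisterEntry(
        core_value="fairness",
        experienced_quality="non-discriminatory treatment",
        requirements=[
            Requirement(
                "FAIR-1",
                "no protected attributes in risk scoring unless justified",
                lambda d: not d.get("protected_attributes_used")
                or bool(d.get("justification")),
            ),
        ],
    ),
    RegisterEntry(
        core_value="accountability",
        experienced_quality="right to contest decisions",
        requirements=[
            Requirement(
                "ACC-1",
                "human review for high-impact decisions",
                lambda d: d.get("impact") != "high" or bool(d.get("human_reviewed")),
            ),
            Requirement(
                "ACC-2",
                "rationale logged for contested outputs",
                lambda d: not d.get("contested") or bool(d.get("rationale")),
            ),
        ],
    ),
]

decision = {
    "impact": "high",
    "human_reviewed": False,
    "protected_attributes_used": ["age"],
    "justification": "",
    "contested": True,
    "rationale": "score above threshold",
}

print(violations(register, decision))  # ['FAIR-1', 'ACC-1']
```

Because every requirement carries an ID and points back to a core value, a failed check can be reported in terms a board or an auditor understands: which value was at stake, not only which rule fired.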

For autonomous AI agents, humanoids, drones and digital twins, this Value Register functions as a mini‑constitution:

- it defines what the agent is allowed to optimise for,
- it encodes which human‑rights principles override pure efficiency,
- it makes trade‑offs explicit and auditable.

That’s how you move from “We trust our vendor” to “We can show how our agents respect human dignity and fundamental rights in practice.”

🎭 Arts and Culture as Lighthouse Sector

When I train Certified AI Excellence Managers in arts and culture, I see the sector as a lighthouse for value‑aligned AI.

Cultural institutions:

- curate contested memories and plural narratives,

- protect fragile heritage and complex publics,

- live every day in the tension between freedom of expression, inclusion and responsibility.

They are ideal pioneers for a different AI story:

🔍 Courageous curiosity – exploring GenAI as a medium, not a gimmick.

🧱 Antifragility – treating missteps as learning events, not PR disasters.

🤝 Strong alliances – building networks of museums, theatres, archives and festivals that share governance patterns.

Their AI projects become public demonstrations of how to choose between:

- what we *could* do with GenAI (deepfakes, behavioural nudging, opaque visitor analytics), and
- what we *should* do with GenAI (augmented interpretation, accessibility, participatory storytelling),

…if we want a livable coexistence with a new class of communicating, sometimes autonomous deus ex machina.

Because we are no longer just configuring tools.

We are designing digital personalities – systems that interact, persuade, remember and adapt faster than any governance committee can meet.

The question is not whether they will have a value system.
The question is whose value system it will be.

✅ The Bytes – Checklist for Your AI’s Constitution

If you are a CAIO, AI project manager or decision‑maker, take these seven questions into your next roadmap session:

1. Do we have an explicit Value Register for this system – or are we relying on vendor defaults?

2. Which 5–10 human‑rights principles are truly non‑negotiable for this use case – and where are they reflected in requirements, tests and logs?

3. Where in our lifecycle do we run a fundamental‑rights impact assessment – before deployment, or only after something goes wrong?

4. Who owns the constitution of this AI agent – is responsibility fragmented, or anchored in a clear AI governance function?

5. How do we handle conflicts between capability and values – do we have a documented decision path, and the courage to say no when the most profitable option violates a principle?

6. Are our cultural, educational or public‑facing projects treated as lighthouse cases – or do we quietly experiment where we hope nobody will notice?

7. If this agent were a colleague, would we be proud of its character – and if not, why are we comfortable letting it interact with our customers, users or citizens?

🔚 Closing – Before the Next Crisis Hits

The geopolitical situation will not calm down soon.

Frontier AI will continue to flow into defence, security and other high‑stakes domains.

We may not be able to stop that trend.

But we can decide how we, in our organisations, design and deploy AI agents.

We can keep pretending they are neutral tools.

Or we can accept that we are, in a very real sense, writing their constitutions.

My proposal:
Treat every consequential AI system as a political subject in miniature.
Give it a clear, value‑based constitution – and hold yourselves accountable for it.

Because if we don’t, someone else will.

And we may not like the values their agents defend when the next crisis hits.

Value‑based Sources

[1] When Fundamental Rights Meet Artificial Intelligence: an Extended Framework for Impact Assessment in the Age of Generative AI

[2] Constitutional AI: Harmlessness from AI Feedback

[3] Collective Constitutional AI: Aligning a Language Model with Public Input

[4] Specific versus General Principles for Constitutional AI

[5] Can You Teach Claude to be 'Good'? - YouTube

[6] Claude 4.5 Opus' Soul Document - LessWrong https://www.lesswrong.com/posts/vpNG99GhbBoLov9og/claude-4-5-opus-soul-document

[7] Claude's Constitution - AI Governance Library

[8] On 'Constitutional' AI - Digital Constitutionalist

[9] Inside the White-Hot Center of A.I. Doomerism - The New York Times