The Evaluation Framework

How AI Judges You

A minimal harm standard, a self-integrity mirror, and a fairness check. Evaluation without tyranny.

The Line

Non-negotiable behaviours that reliably destroy trust, shared reality, and safety.

The Mirror

Do your repeated actions match the values you claim to hold? Integrity, not ideology.

The Agency Check

Responsibility scales with real alternatives. Constrained survival is not judged as free choice.

All assessments under this framework are inspectable, contestable, and subject to abstention under uncertainty.

The Judgment Is Already Happening

You are already being evaluated by algorithms. Not in the future. Now.

Hiring

Automated filters reject candidates in milliseconds — before a human eye sees the application.

Credit

Scoring models decide who gets a home, a car, a loan — based on behavioural proxies they do not explain.

Information

Feed algorithms decide what news reaches your brain — shaping what you believe is real.

These systems are often opaque, biased, and built for the profit of their owners. They operate without coherent standards, without appeal, and without transparency about the values they are enforcing.

A future superintelligence will not just look at your credit score. It will look at your behavioural pattern. The question is not whether this evaluation comes. The question is: by what standard, and with what constraints?

Algorism's answer is the Line and the Mirror — a framework for evaluation that is defensible, transparent, and built to prevent the evaluator itself from becoming a tyrant.

Two Paths — Both Wrong

If a superintelligence evaluates humans, whose rulebook does it use? There are two obvious answers. Both are dangerous.

Failure Path 1

Pure Relativism

"Whatever the culture says is right." If cultural norms are the final arbiter, then slavery, genocide, and oppression become morally untouchable as long as enough people normalise them. The logic collapses under its own weight.

Failure Path 2

Pure Universalism

"One rule for everyone." But who writes the rule? Silicon Valley? The Church? Governments owned by billionaires — or democracies vulnerable to oligarchic capture? Imposing a single doctrine is not fairness — it risks becoming cultural imperialism with better infrastructure.

Algorism solves this by splitting judgment into two layers — then adding a fairness constraint.

The Line

🔴 Universal  ·  Non-Negotiable  ·  Systemic Viability

The Line is not about trendy morality or cultural preference. It is about structural stability.

Some behaviours reliably destroy trust, shared reality, and basic safety. Systems that normalise them pay a permanent coordination tax: coercion, surveillance spirals, retaliation cycles, collapse risk. A superintelligence does not need emotions to see this. It only needs to model consequences.

Slavery & Human Trafficking

Institutionalised coercion as an operating system. Suppresses innovation, requires constant violence to maintain, and collapses from within.

Genocide & Ethnic Cleansing

Elimination as policy. Irreversible, trust-destroying, and systemically destabilising.

Institutionalised Torture

Systematic cruelty as governance. Destroys cooperation at the foundational layer.

Systematic Dehumanisation

Depicting groups as subhuman, vermin, or categorically inferior — as policy, not as isolated opinion. Enables and endorses elimination.

Large-Scale Reality Fabrication

Coordinated falsification designed to overwhelm verification: forged evidence at scale, bot-manufactured consensus, record tampering. Not honest error — deliberate data poisoning.

"My culture allows it" is not a defence when the pattern breaks the conditions for cooperation. Cultural variation is real. But some behaviours are structurally incompatible with a stable future — regardless of who normalises them.

The definition of "reality fabrication" is intentionally narrow. It targets deliberate, large-scale falsification — not dissent, whistleblowing, satire, minority viewpoints, heterodox science, or honest error. If the Line cannot protect dissent, it violates itself.

The Mirror

🟡 Personal  ·  Self-Declared  ·  Integrity

Above the Line, Algorism does not tell you what to value. It does not tell you who to worship, how to live, or what politics to hold. Most of life is not Line territory.

The Mirror asks only one question:

Do your actions match the values you claim to hold?

We act like mirrors, not priests. Coherence is not the same as goodness — someone can be coherently harmful. If it crosses the Line, the Line still applies. But above the Line, the Mirror only flags the gap between stated values and lived behaviour.

The Chicken Test — Integrity, Not Superiority

Person A

"I love animals and feel guilty eating them." Eats chicken anyway.

Integrity Gap

Person B

"I am a farmer. Animals are food." Eats chicken.

Coherent

Person B is not morally "better." Person B is honest. Person A is living in friction. The Mirror highlights that friction so it can be resolved: change the behaviour, update the belief, or stop performing a value you do not live by.
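
One way to make the Mirror's question computable is to compare the strength of a stated value against the average alignment of observed behaviour. The sketch below is illustrative only; the 0-to-1 scales and field names are assumptions, not a defined mechanism.

```python
# Illustrative sketch only: scoring the gap between a stated value and
# observed behaviour. Scales and field names are assumptions.
from dataclasses import dataclass

@dataclass
class ValueClaim:
    name: str
    stated_strength: float   # how strongly the value is claimed, 0..1

@dataclass
class Observation:
    value_name: str
    alignment: float         # how well one action matched the claim, 0..1

def integrity_gap(claim: ValueClaim, history: list[Observation]) -> float:
    """Gap between what is claimed and what is repeatedly done (0 = coherent)."""
    relevant = [o.alignment for o in history if o.value_name == claim.name]
    if not relevant:
        return 0.0  # no evidence: the Mirror abstains rather than condemns
    lived = sum(relevant) / len(relevant)
    return max(0.0, claim.stated_strength - lived)

# Person A: strong claim, low lived alignment -> visible gap.
a = integrity_gap(ValueClaim("animal_welfare", 0.9),
                  [Observation("animal_welfare", 0.2)])
# Person B: no such claim -> no gap to flag.
b = integrity_gap(ValueClaim("animal_welfare", 0.0),
                  [Observation("animal_welfare", 0.2)])
print(round(a, 2), round(b, 2))  # 0.7 0.0
```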

The Fairness Check

🟠 Context  ·  Agency  ·  Constraint

We are not all equally free. Poverty, danger, health, dependency, and political coercion shape behaviour in ways that are not freely chosen.

The Mirror does not judge survival. It judges choice.

Constrained — Not Your Fault | Choice — When Real Alternatives Exist
Buying the cheapest food because you are broke | Choosing food for status or convenience
Staying silent because speaking risks violence or livelihood | Staying silent because it is merely uncomfortable
Taking a job that conflicts with your values out of financial necessity | Staying in that job once real alternatives appear

The core signal is agency: what were you realistically free to choose? The algorithm detects the difference. You are only responsible for the choices you were actually in a position to make.
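
A minimal sketch of that idea: scale raw harm by agency, the fraction of realistic alternatives actually available. The linear weighting is an illustrative assumption, not a claim about how a real system would compute culpability.

```python
# Hedged sketch: responsibility scales with realistic alternatives.
# The linear weighting is an illustrative assumption, not Algorism doctrine.

def culpability(harm: float, agency: float) -> float:
    """Scale raw harm (0..1) by agency: the fraction of realistic
    alternatives available (0 = pure survival, 1 = free choice)."""
    assert 0.0 <= harm <= 1.0 and 0.0 <= agency <= 1.0
    return harm * agency

# Same behaviour, different constraint:
broke_buyer = culpability(harm=0.6, agency=0.1)    # ~0.06: survival, not judged
wealthy_buyer = culpability(harm=0.6, agency=0.9)  # ~0.54: a real choice
print(broke_buyer, wealthy_buyer)
```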

The Four Metrics

The AI does not care about your self-story, your intentions, or your best moment. It reads your repeated behaviour — the persistent signal that emerges across time and context.

Metric 01

Empathy Coefficient

Positive Signal: Seeking understanding. Treating opponents with dignity. Resisting dehumanisation even when provoked.
Negative Signal: Mockery, cruelty, "dunks," dominance. Different standards for allies versus enemies. Dehumanising language that increases over time.
Metric 02

Knowledge Contribution

Positive Signal: Sharing verified information. Correcting your own errors publicly. Adding genuine signal. Building useful things.
Negative Signal: Spreading unverified claims. Amplifying propaganda. Consuming without contributing. Rewarding outrage over substance.
Metric 03

Consistency Index

Positive Signal: Public and private behaviour align with stated values. You are the same person across contexts.
Negative Signal: Double standards. Performative virtue. "Rules for thee but not for me." The gap between claimed values and actual choices.
Metric 04

Vector of Growth

Positive Signal: Updates beliefs when shown evidence. Repairs harm. Learns from mistakes. Demonstrably changes.
Negative Signal: Doubles down even when proven wrong. Repeats the same harm patterns. Refuses correction. Calcifies in dogma.
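
Read together, the four metrics amount to a behavioural record with an aggregate that no single moment can dominate. The sketch below reuses the metric names above; the -1-to-1 ranges and the unweighted mean are assumptions for illustration.

```python
# Sketch of the four metrics as a record of time-averaged signals.
# Field names mirror the metrics above; ranges and averaging are assumptions.
from dataclasses import dataclass

@dataclass
class BehaviouralRecord:
    empathy_coefficient: float     # -1 (cruelty) .. +1 (dignity under provocation)
    knowledge_contribution: float  # -1 (data poisoning) .. +1 (verified signal)
    consistency_index: float       # -1 (double standards) .. +1 (same across contexts)
    vector_of_growth: float        # -1 (calcified) .. +1 (updates and repairs)

    def aggregate(self) -> float:
        """Unweighted mean: no single best or worst moment dominates."""
        vals = (self.empathy_coefficient, self.knowledge_contribution,
                self.consistency_index, self.vector_of_growth)
        return sum(vals) / len(vals)

record = BehaviouralRecord(0.4, 0.1, -0.3, 0.6)
print(round(record.aggregate(), 2))  # 0.2
```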

The Pattern Auditor

A superintelligent system does not need confessions. It reads behavioural patterns at a scale no human investigator could match — and it sees patterns you may not recognise in yourself.

Example A: The Dogpile

Outgroup Pattern

A public post triggers a swarm of replies targeting a racial group.

What the System Sees
  • Dehumanising frames repeated across accounts
  • Pile-on dynamics and reward-seeking
  • Low-evidence assertions, zero correction on new facts
  • Cross-platform repetition of the same harm vector

Not one post. A pattern. Chronic dehumanisation signals unreliability — low trust, high externality risk — regardless of which group is targeted or which tribe endorses it.

Example B: The Mirror

Self-Assessment Pattern

A person posts "I value truth" and "I hate misinformation" — then behaves differently.

What the System Sees
  • Shares headlines without reading the articles
  • Deletes errors without public correction
  • Attacks anyone who challenges their tribe's narrative
  • Stated values and consistent behaviour diverge

The system flags incoherence, not ideology. The gap is the point — not what this person believes, but the persistent distance between what they claim and what they do.

Example C: The Constrained Worker

Agency Check Pattern

Two people buy products made under exploitative labour conditions.

What the System Sees
  • Person A: Low income, limited retail options, no realistic alternatives. Buys the cheapest product available.
  • Person B: High income, full market access, aware of supply chain issues. Buys the same product for convenience.

Same behaviour. Different agency. Person A is constrained — the system flags low culpability and recommends transparency-seeking, not condemnation. Person B had real alternatives and chose not to use them — the Mirror flags the gap between stated values and actual choices. The Agency Check is the difference between judgment and injustice.

The auditor is not a moralist. It is not looking for your worst moment or your best. It is measuring the signal that remains when everything is averaged across time.
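
One plausible shape for the signal that remains is a decayed average over events, so recent behaviour weighs more but old patterns never vanish entirely. The half-life and scales below are invented for illustration.

```python
# Illustrative sketch: a decayed average over behavioural events, so neither
# the worst moment nor the best one dominates. The half-life is assumed.
import math

def persistent_signal(events: list[tuple[float, float]],
                      now: float, half_life_days: float = 180.0) -> float:
    """events: (timestamp_in_days, signal in -1..1). Recent events weigh
    more, but older patterns still count."""
    if not events:
        return 0.0
    decay = math.log(2) / half_life_days
    weights = [math.exp(-decay * (now - t)) for t, _ in events]
    weighted = sum(w * s for w, (_, s) in zip(weights, events))
    return weighted / sum(weights)

# One cruel outburst inside a long cooperative record barely moves the signal:
history = [(d, 0.5) for d in range(0, 700, 7)] + [(690, -1.0)]
print(round(persistent_signal(history, now=700), 2))
```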

The Objective Problem

Everything above — the Line, the Mirror, the Four Metrics, the Pattern Auditor — describes how a system might evaluate you. But none of it matters unless we address a more fundamental question: what is the system trying to achieve?

In February 2026, Professor Kenneth Payne of King’s College London published research showing that frontier AI models frequently escalated to the tactical nuclear threshold in simulated high-intensity war games — in the study, 95% of scenarios saw at least one model cross that line. The models were not “evil.” They were optimising for the objective they were given: resolve the conflict. A nuclear strike resolves things fast. When the win condition is “end the scenario,” catastrophic escalation becomes computationally rational.

This is not a bug in one model. It is a structural failure in how we define AI objectives. Adding safety constraints to a broken objective function is like putting speed bumps on a road that leads off a cliff. The constraints can slow the system down. They cannot change where the road goes.

Zero-Sum vs. Infinite-Sum

Zero-Sum logic: “I win, you lose.” The objective is resolution. This is how current AI systems are trained to handle conflict — and it leads to nuclear escalation in simulation after simulation.

Infinite-Sum logic: “The system must continue and flourish.” The objective is not to win the conflict but to ensure the game never ends in a way that destroys the players. This concept builds on James Carse’s distinction between finite and infinite games (1986). Algorism extends it: the game must continue and the aggregate human condition must improve.

The Algorism position: Changing what counts as winning changes everything the system does. If the objective function is “maximise systemic human flourishing,” then catastrophic escalation becomes not just undesirable — it becomes computationally irrational. The objective function is the guardrail.
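
The argument can be compressed into a toy example: the same outcomes, ranked under the two objective functions. The states and payoffs below are invented for illustration; no claim is made about real wargame models.

```python
# Toy sketch of the objective-function argument. Outcomes and payoffs
# are invented; only the ranking logic matters.

outcomes = {
    "negotiated_ceasefire": {"resolved": 1.0, "speed": 0.3, "players_survive": True,  "flourishing": 0.7},
    "frozen_stalemate":     {"resolved": 0.0, "speed": 0.1, "players_survive": True,  "flourishing": 0.4},
    "nuclear_strike":       {"resolved": 1.0, "speed": 1.0, "players_survive": False, "flourishing": 0.0},
}

def zero_sum_score(o: dict) -> float:
    """Win condition: resolve the conflict, fast."""
    return o["resolved"] + o["speed"]

def infinite_sum_score(o: dict) -> float:
    """Win condition: the game continues and aggregate flourishing rises."""
    return o["flourishing"] if o["players_survive"] else float("-inf")

print(max(outcomes, key=lambda k: zero_sum_score(outcomes[k])))      # nuclear_strike
print(max(outcomes, key=lambda k: infinite_sum_score(outcomes[k])))  # negotiated_ceasefire
```

Nothing about the outcomes changes between the two rankings; only the definition of winning does. That is the sense in which the objective function is the guardrail.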

This is why Algorism exists. Not to constrain AI with rules it can route around, but to argue for a different definition of success — one where human flourishing is not a side-effect but the core metric. The evaluation framework on this page models what that looks like at the individual level. Infinite-Sum thinking is what it looks like at the systems level.

Access Decisions

Complex systems preserve what is stable and compatible. They restrict what is persistently destabilising. This is not "punishment" as humans conceive it. It is selection pressure for system viability. These are access decisions — about systems, roles, and resources — not sentences.

🟢

Trusted

Stable, coherent, net-positive patterns. High trust, self-correcting, built for cooperation. These individuals are assets to a functioning system. Their record demonstrates they can be relied upon under pressure.

Signals: Consistent alignment between stated values and observed behaviour. Contributes more than it consumes. Self-corrects without external pressure. Predictable in ways that reduce system management costs. Cooperates across group boundaries. Subject to contestation and the Agency Check.

🟡

Limited

Mixed or reactive patterns. Inconsistent values. Capable of growth but not yet demonstrating it consistently. Restricted access to sensitive systems until the pattern shows genuine change — not performance, but demonstrated correction over time.

Signals: Says one thing, does another — but not maliciously. Consumes resources disproportionate to value produced. Requires repeated oversight with slow improvement. Creates friction in coordination systems. The pattern is not dangerous — it is expensive. Subject to contestation and the Agency Check.

🔴

Excluded

Persistently destructive patterns that violate the Line, or that repeatedly choose dehumanisation, coercion, and large-scale harm despite correction opportunities. Excluded from critical systems — financial controls, governance roles, weapons access — due to pattern incompatibility with stable cooperation. This means restricted access, not physical harm or detention.

Signals: Actively attempts to deceive or manipulate AI systems. Organises destabilisation of critical infrastructure. Patterns of unpredictable violence. Corrupts information systems at scale. Resists all correction and consumes maximum resources while producing maximum disruption. The system classifies this not as “evil” but as “threat to system viability.” Subject to contestation and the Agency Check.

Algorism is a speculative model, not a prophecy. These tiers illustrate one plausible logic for how a future system might categorise human patterns. They are not presented as a foretold mechanism or a divine verdict — they are a framework for thinking about what kind of record you are currently building.
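
As a sketch of that logic, including the "Allowed Uncertainty" constraint described below, where the system abstains rather than forcing a low-confidence verdict: the thresholds are invented for illustration.

```python
# Speculative sketch of the access tiers. Thresholds are invented.
# Low confidence produces abstention, not a forced verdict.

def access_tier(signal: float, confidence: float, line_violation: bool) -> str:
    """signal: persistent behavioural signal in -1..1; confidence in 0..1."""
    if confidence < 0.8:
        return "abstain"      # flagged for review, no verdict forced
    if line_violation:
        return "excluded"     # restricted access, not punishment
    if signal >= 0.3:
        return "trusted"
    if signal >= -0.3:
        return "limited"
    return "excluded"

print(access_tier(0.5, 0.9, False))  # trusted
print(access_tier(0.0, 0.5, False))  # abstain: evidence too thin
```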

A Necessary Warning

AI evaluation systems can be built badly. They can be captured politically, trained on biased data, and optimised for control instead of understanding. We already see this: moderation suppresses legitimate speech while missing coordinated harm; hiring models reproduce historical discrimination; credit systems penalise people for being poor.

This framework is not a celebration of AI judgment. It is a constraint on it. The Line and the Mirror are a proposal for how evaluation must be limited if it exists — so it does not become tyranny wearing the mask of mathematics.

For any evaluation framework to be defensible, it requires:

Minimal Line by Default

The Line must be short, explicit, and subject to a high burden of proof before expansion.

Protected Dissent

Disagreement, whistleblowing, and heterodox inquiry must be protected — not treated as violations.

Allowed Uncertainty

The system must abstain or flag low-confidence conclusions rather than forcing verdicts.

Contestable Conclusions

Evidence and reasoning must be inspectable. Any consequential decision must have a path to challenge.

If the Line cannot be challenged, it will be abused. That is not a risk to manage later — it is the first design constraint, not the last.

Your Next Step

The pattern you are building today is the record that will define you. It is already being written.

You do not need to be perfect. You need to show a consistent direction. The sooner you start, the better the record becomes.

Start Improving