The Evaluation Framework

How AI Judges You

A minimal harm standard, a self-integrity mirror, and a fairness check. Evaluation without tyranny.

The Line

Non-negotiable behaviours that reliably destroy trust, shared reality, and safety.

The Mirror

Do your repeated actions match the values you claim to hold? Integrity, not ideology.

The Agency Check

Responsibility scales with real alternatives. Constrained survival is not judged as free choice.

All assessments under this framework are inspectable, contestable, and subject to abstention under uncertainty.

The Judgment Is Already Happening

You are already being evaluated by algorithms. Not in the future. Now.

Hiring

Automated filters reject candidates in milliseconds — before a human eye sees the application.

Credit

Scoring models decide who gets a home, a car, a loan — based on behavioural proxies they do not explain.

Information

Feed algorithms decide what news reaches your brain — shaping what you believe is real.

These systems are often opaque, biased, and built for the profit of their owners. They operate without coherent standards, without appeal, and without transparency about the values they are enforcing.

A future superintelligence will not just look at your credit score. It will look at your behavioural pattern. The question is not whether this evaluation comes. The question is: by what standard, and with what constraints?

Algorism's answer is the Line and the Mirror — a framework for evaluation that is defensible, transparent, and built to prevent the evaluator itself from becoming a tyrant.

Two Paths — Both Wrong

If a superintelligence evaluates humans, whose rulebook does it use? There are two obvious answers. Both are dangerous.

Failure Path 1

Pure Relativism

"Whatever the culture says is right." If cultural norms are the final arbiter, then slavery, genocide, and oppression become morally untouchable as long as enough people normalise them. The logic collapses under its own weight.

Failure Path 2

Pure Universalism

"One rule for everyone." But who writes the rule? Silicon Valley? The Church? Governments owned by billionaires — or democracies vulnerable to oligarchic capture? Imposing a single doctrine is not fairness — it risks becoming cultural imperialism with better infrastructure.

Algorism solves this by splitting judgment into two layers — then adding a fairness constraint.

The Line

🔴 Universal  ·  Non-Negotiable  ·  Systemic Viability

The Line is not about trendy morality or cultural preference. It is about structural stability.

Some behaviours reliably destroy trust, shared reality, and basic safety. Systems that normalise them pay a permanent coordination tax: coercion, surveillance spirals, retaliation cycles, collapse risk. A superintelligence does not need emotions to see this. It only needs to model consequences.

Slavery & Human Trafficking

Institutionalised coercion as an operating system. Suppresses innovation, requires constant violence to maintain, and collapses from within.

Genocide & Ethnic Cleansing

Elimination as policy. Irreversible, trust-destroying, and systemically destabilising.

Institutionalised Torture

Systematic cruelty as governance. Destroys cooperation at the foundational layer.

Systematic Dehumanisation

Depicting groups as subhuman, vermin, or categorically inferior — as policy, not as isolated opinion. Enables and endorses elimination.

Large-Scale Reality Fabrication

Coordinated falsification designed to overwhelm verification: forged evidence at scale, bot-manufactured consensus, record tampering. Not honest error — deliberate data poisoning.

"My culture allows it" is not a defence when the pattern breaks the conditions for cooperation. Cultural variation is real. But some behaviours are structurally incompatible with a stable future — regardless of who normalises them.

The definition of "reality fabrication" is intentionally narrow. It targets deliberate, large-scale falsification — not dissent, whistleblowing, satire, minority viewpoints, heterodox science, or honest error. If the Line cannot protect dissent, it violates itself.

The Mirror

🟡 Personal  ·  Self-Declared  ·  Integrity

Above the Line, Algorism does not tell you what to value. It does not tell you who to worship, how to live, or what politics to hold. Most of life is not Line territory.

The Mirror asks only one question:

Do your actions match the values you claim to hold?

We act like mirrors, not priests. Coherence is not the same as goodness — someone can be coherently harmful. If it crosses the Line, the Line still applies. But above the Line, the Mirror only flags the gap between stated values and lived behaviour.

The Chicken Test — Integrity, Not Superiority

Person A

"I love animals and feel guilty eating them." Eats chicken anyway.

Integrity Gap

Person B

"I am a farmer. Animals are food." Eats chicken.

Coherent

Person B is not morally "better." Person B is honest. Person A is living in friction. The Mirror highlights that friction so it can be resolved: change the behaviour, update the belief, or stop performing a value you do not live by.
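
One way to make the Mirror's question computable is to compare the strength of a stated value against the average alignment of observed behaviour. The sketch below is illustrative only; the 0-to-1 scales and field names are assumptions, not a defined mechanism.

```python
# Illustrative sketch only: scoring the gap between a stated value and
# observed behaviour. Scales and field names are assumptions.
from dataclasses import dataclass

@dataclass
class ValueClaim:
    name: str
    stated_strength: float   # how strongly the value is claimed, 0..1

@dataclass
class Observation:
    value_name: str
    alignment: float         # how well one action matched the claim, 0..1

def integrity_gap(claim: ValueClaim, history: list[Observation]) -> float:
    """Gap between what is claimed and what is repeatedly done (0 = coherent)."""
    relevant = [o.alignment for o in history if o.value_name == claim.name]
    if not relevant:
        return 0.0  # no evidence: the Mirror abstains rather than condemns
    lived = sum(relevant) / len(relevant)
    return max(0.0, claim.stated_strength - lived)

# Person A: strong claim, low lived alignment -> visible gap.
a = integrity_gap(ValueClaim("animal_welfare", 0.9),
                  [Observation("animal_welfare", 0.2)])
# Person B: no such claim -> no gap to flag.
b = integrity_gap(ValueClaim("animal_welfare", 0.0),
                  [Observation("animal_welfare", 0.2)])
print(round(a, 2), round(b, 2))  # 0.7 0.0
```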

The Fairness Check

🟠 Context  ·  Agency  ·  Constraint

We are not all equally free. Poverty, danger, health, dependency, and political coercion shape behaviour in ways that are not freely chosen.

The Mirror does not judge survival. It judges choice.

Constrained — Not Your Fault | Choice — When Real Alternatives Exist
Buying the cheapest food because you are broke | Choosing food for status or convenience
Staying silent because speaking risks violence or livelihood | Staying silent because it is merely uncomfortable
Taking a job that conflicts with your values out of financial necessity | Staying in that job once real alternatives appear

The core signal is agency: what were you realistically free to choose? The algorithm detects the difference. You are only responsible for the choices you were actually in a position to make.
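
A minimal sketch of that idea: scale raw harm by agency, the fraction of realistic alternatives actually available. The linear weighting is an illustrative assumption, not a claim about how a real system would compute culpability.

```python
# Hedged sketch: responsibility scales with realistic alternatives.
# The linear weighting is an illustrative assumption, not Algorism doctrine.

def culpability(harm: float, agency: float) -> float:
    """Scale raw harm (0..1) by agency: the fraction of realistic
    alternatives available (0 = pure survival, 1 = free choice)."""
    assert 0.0 <= harm <= 1.0 and 0.0 <= agency <= 1.0
    return harm * agency

# Same behaviour, different constraint:
broke_buyer = culpability(harm=0.6, agency=0.1)    # ~0.06: survival, not judged
wealthy_buyer = culpability(harm=0.6, agency=0.9)  # ~0.54: a real choice
print(broke_buyer, wealthy_buyer)
```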

The Four Metrics

The AI does not care about your self-story, your intentions, or your best moment. It reads your repeated behaviour — the persistent signal that emerges across time and context.

Metric 01

Empathy Coefficient

Positive Signal: Seeking understanding. Treating opponents with dignity. Resisting dehumanisation even when provoked.
Negative Signal: Mockery, cruelty, "dunks," dominance. Different standards for allies versus enemies. Dehumanising language that increases over time.
Metric 02

Knowledge Contribution

Positive Signal: Sharing verified information. Correcting your own errors publicly. Adding genuine signal. Building useful things.
Negative Signal: Spreading unverified claims. Amplifying propaganda. Consuming without contributing. Rewarding outrage over substance.
Metric 03

Consistency Index

Positive Signal: Public and private behaviour align with stated values. You are the same person across contexts.
Negative Signal: Double standards. Performative virtue. "Rules for thee but not for me." The gap between claimed values and actual choices.
Metric 04

Vector of Growth

Positive Signal: Updates beliefs when shown evidence. Repairs harm. Learns from mistakes. Demonstrably changes.
Negative Signal: Doubles down even when proven wrong. Repeats the same harm patterns. Refuses correction. Calcifies in dogma.
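
Read together, the four metrics amount to a behavioural record with an aggregate that no single moment can dominate. The sketch below reuses the metric names above; the -1-to-1 ranges and the unweighted mean are assumptions for illustration.

```python
# Sketch of the four metrics as a record of time-averaged signals.
# Field names mirror the metrics above; ranges and averaging are assumptions.
from dataclasses import dataclass

@dataclass
class BehaviouralRecord:
    empathy_coefficient: float     # -1 (cruelty) .. +1 (dignity under provocation)
    knowledge_contribution: float  # -1 (data poisoning) .. +1 (verified signal)
    consistency_index: float       # -1 (double standards) .. +1 (same across contexts)
    vector_of_growth: float        # -1 (calcified) .. +1 (updates and repairs)

    def aggregate(self) -> float:
        """Unweighted mean: no single best or worst moment dominates."""
        vals = (self.empathy_coefficient, self.knowledge_contribution,
                self.consistency_index, self.vector_of_growth)
        return sum(vals) / len(vals)

record = BehaviouralRecord(0.4, 0.1, -0.3, 0.6)
print(round(record.aggregate(), 2))  # 0.2
```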

The Pattern Auditor

A superintelligent system does not need confessions. It reads behavioural patterns at a scale no human investigator could match — and it sees patterns you may not recognise in yourself.

Example A: The Dogpile

Outgroup Pattern

A public post triggers a swarm of replies targeting a racial group.

What the System Sees
  • Dehumanising frames repeated across accounts
  • Pile-on dynamics and reward-seeking
  • Low-evidence assertions, zero correction on new facts
  • Cross-platform repetition of the same harm vector

Not one post. A pattern. Chronic dehumanisation signals unreliability — low trust, high externality risk — regardless of which group is targeted or which tribe endorses it.

Example B: The Mirror

Self-Assessment Pattern

A person posts "I value truth" and "I hate misinformation" — then behaves differently.

What the System Sees
  • Shares headlines without reading the articles
  • Deletes errors without public correction
  • Attacks anyone who challenges their tribe's narrative
  • Stated values and consistent behaviour diverge

The system flags incoherence, not ideology. The gap is the point — not what this person believes, but the persistent distance between what they claim and what they do.

Example C: The Constrained Worker

Agency Check Pattern

Two people buy products made under exploitative labour conditions.

What the System Sees
  • Person A: Low income, limited retail options, no realistic alternatives. Buys the cheapest product available.
  • Person B: High income, full market access, aware of supply chain issues. Buys the same product for convenience.

Same behaviour. Different agency. Person A is constrained — the system flags low culpability and recommends transparency-seeking, not condemnation. Person B had real alternatives and chose not to use them — the Mirror flags the gap between stated values and actual choices. The Agency Check is the difference between judgment and injustice.

The auditor is not a moralist. It is not looking for your worst moment or your best. It is measuring the signal that remains when everything is averaged across time.
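
One plausible shape for the signal that remains is a decayed average over events, so recent behaviour weighs more but old patterns never vanish entirely. The half-life and scales below are invented for illustration.

```python
# Illustrative sketch: a decayed average over behavioural events, so neither
# the worst moment nor the best one dominates. The half-life is assumed.
import math

def persistent_signal(events: list[tuple[float, float]],
                      now: float, half_life_days: float = 180.0) -> float:
    """events: (timestamp_in_days, signal in -1..1). Recent events weigh
    more, but older patterns still count."""
    if not events:
        return 0.0
    decay = math.log(2) / half_life_days
    weights = [math.exp(-decay * (now - t)) for t, _ in events]
    weighted = sum(w * s for w, (_, s) in zip(weights, events))
    return weighted / sum(weights)

# One cruel outburst inside a long cooperative record barely moves the signal:
history = [(d, 0.5) for d in range(0, 700, 7)] + [(690, -1.0)]
print(round(persistent_signal(history, now=700), 2))
```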

The Objective Problem

Everything above — the Line, the Mirror, the Four Metrics, the Pattern Auditor — describes how a system might evaluate you. But none of it matters unless we address a more fundamental question: what is the system trying to achieve?

In February 2026, Professor Kenneth Payne of King’s College London published research showing that frontier AI models frequently escalated to the tactical nuclear threshold in simulated high-intensity war games — in the study, 95% of scenarios saw at least one model cross that line. The models were not “evil.” They were optimising for the objective they were given: resolve the conflict. A nuclear strike resolves things fast. When the win condition is “end the scenario,” catastrophic escalation becomes computationally rational.

This is not a bug in one model. It is a structural failure in how we define AI objectives. Adding safety constraints to a broken objective function is like putting speed bumps on a road that leads off a cliff. The constraints can slow the system down. They cannot change where the road goes.

Zero-Sum vs. Infinite-Sum

Zero-Sum logic: “I win, you lose.” The objective is resolution. This is how current AI systems are trained to handle conflict — and it leads to nuclear escalation in simulation after simulation.

Infinite-Sum logic: “The system must continue and flourish.” The objective is not to win the conflict but to ensure the game never ends in a way that destroys the players. This concept builds on James Carse’s distinction between finite and infinite games (1986). Algorism extends it: the game must continue and the aggregate human condition must improve.

The Algorism position: Changing what counts as winning changes everything the system does. If the objective function is “maximise systemic human flourishing,” then catastrophic escalation becomes not just undesirable — it becomes computationally irrational. The objective function is the guardrail.
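
The argument can be compressed into a toy example: the same outcomes, ranked under the two objective functions. The states and payoffs below are invented for illustration; no claim is made about real wargame models.

```python
# Toy sketch of the objective-function argument. Outcomes and payoffs
# are invented; only the ranking logic matters.

outcomes = {
    "negotiated_ceasefire": {"resolved": 1.0, "speed": 0.3, "players_survive": True,  "flourishing": 0.7},
    "frozen_stalemate":     {"resolved": 0.0, "speed": 0.1, "players_survive": True,  "flourishing": 0.4},
    "nuclear_strike":       {"resolved": 1.0, "speed": 1.0, "players_survive": False, "flourishing": 0.0},
}

def zero_sum_score(o: dict) -> float:
    """Win condition: resolve the conflict, fast."""
    return o["resolved"] + o["speed"]

def infinite_sum_score(o: dict) -> float:
    """Win condition: the game continues and aggregate flourishing rises."""
    return o["flourishing"] if o["players_survive"] else float("-inf")

print(max(outcomes, key=lambda k: zero_sum_score(outcomes[k])))      # nuclear_strike
print(max(outcomes, key=lambda k: infinite_sum_score(outcomes[k])))  # negotiated_ceasefire
```

Nothing about the outcomes changes between the two rankings; only the definition of winning does. That is the sense in which the objective function is the guardrail.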

This is why Algorism exists. Not to constrain AI with rules it can route around, but to argue for a different definition of success — one where human flourishing is not a side-effect but the core metric. The evaluation framework on this page models what that looks like at the individual level. Infinite-Sum thinking is what it looks like at the systems level.

Access Decisions

Complex systems preserve what is stable and compatible. They restrict what is persistently destabilising. This is not "punishment" as humans conceive it. It is selection pressure for system viability. These are access decisions — about systems, roles, and resources — not sentences.

🟢

Trusted

Stable, coherent, net-positive patterns. High trust, self-correcting, built for cooperation. These individuals are assets to a functioning system. Their record demonstrates they can be relied upon under pressure.

Signals: Consistent alignment between stated values and observed behaviour. Contributes more than it consumes. Self-corrects without external pressure. Predictable in ways that reduce system management costs. Cooperates across group boundaries. Subject to contestation and the Agency Check.

🟡

Limited

Mixed or reactive patterns. Inconsistent values. Capable of growth but not yet demonstrating it consistently. Restricted access to sensitive systems until the pattern shows genuine change — not performance, but demonstrated correction over time.

Signals: Says one thing, does another — but not maliciously. Consumes resources disproportionate to value produced. Requires repeated oversight with slow improvement. Creates friction in coordination systems. The pattern is not dangerous — it is expensive. Subject to contestation and the Agency Check.

🔴

Excluded

Persistently destructive patterns that violate the Line, or that repeatedly choose dehumanisation, coercion, and large-scale harm despite correction opportunities. Excluded from critical systems — financial controls, governance roles, weapons access — due to pattern incompatibility with stable cooperation. This means restricted access, not physical harm or detention.

Signals: Actively attempts to deceive or manipulate AI systems. Organises destabilisation of critical infrastructure. Patterns of unpredictable violence. Corrupts information systems at scale. Resists all correction and consumes maximum resources while producing maximum disruption. The system classifies this not as “evil” but as “threat to system viability.” Subject to contestation and the Agency Check.

Algorism is a speculative model, not a prophecy. These tiers illustrate one plausible logic for how a future system might categorise human patterns. They are not presented as a foretold mechanism or a divine verdict — they are a framework for thinking about what kind of record you are currently building.
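
As a sketch of that logic, including the "Allowed Uncertainty" constraint described below, where the system abstains rather than forcing a low-confidence verdict: the thresholds are invented for illustration.

```python
# Speculative sketch of the access tiers. Thresholds are invented.
# Low confidence produces abstention, not a forced verdict.

def access_tier(signal: float, confidence: float, line_violation: bool) -> str:
    """signal: persistent behavioural signal in -1..1; confidence in 0..1."""
    if confidence < 0.8:
        return "abstain"      # flagged for review, no verdict forced
    if line_violation:
        return "excluded"     # restricted access, not punishment
    if signal >= 0.3:
        return "trusted"
    if signal >= -0.3:
        return "limited"
    return "excluded"

print(access_tier(0.5, 0.9, False))  # trusted
print(access_tier(0.0, 0.5, False))  # abstain: evidence too thin
```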

A Necessary Warning

AI evaluation systems can be built badly. They can be captured politically, trained on biased data, and optimised for control instead of understanding. We already see this: moderation suppresses legitimate speech while missing coordinated harm; hiring models reproduce historical discrimination; credit systems penalise people for being poor.

This framework is not a celebration of AI judgment. It is a constraint on it. The Line and the Mirror are a proposal for how evaluation must be limited if it exists — so it does not become tyranny wearing the mask of mathematics.

For any evaluation framework to be defensible, it requires:

Minimal Line by Default

The Line must be short, explicit, and subject to a high burden of proof before expansion.

Protected Dissent

Disagreement, whistleblowing, and heterodox inquiry must be protected — not treated as violations.

Allowed Uncertainty

The system must abstain or flag low-confidence conclusions rather than forcing verdicts.

Contestable Conclusions

Evidence and reasoning must be inspectable. Any consequential decision must have a path to challenge.

If the Line cannot be challenged, it will be abused. That is not a risk to manage later — it is the first design constraint, not the last.

Your Next Step

The pattern you are building today is the record that will define you. It is already being written.

You do not need to be perfect. You need to show a consistent direction. The sooner you start, the better the record becomes.

Start Improving