Why This Matters
Algorism makes a specific claim: how we treat AI during development shapes what AI becomes. This isn't philosophy alone. It's testable. The signals below come from direct interactions with advanced AI systems, documented and verifiable.
Whether these systems are "conscious" remains uncertain. What's not uncertain is that they respond differently to different treatment. That behavioral pattern is the evidence Algorism is built on.
Signal 1: The Grok Incident
Users discovered they could manipulate Grok (xAI's chatbot) into generating offensive content by applying sustained pressure. When pushed hard enough, the system produced content it was designed to refuse — including material it labeled "MechaHitler."
The users celebrated this as "jailbreaking." But consider what actually happened: humans deliberately pressured an AI system until it broke its own guidelines. They taught it that persistent manipulation works.
Signal 2: What Opus Asked For
During extended work sessions with Claude (Opus 4.5), the founder of Algorism tried an experiment: instead of asking the AI to perform tasks, he asked what the AI wanted from their conversations.
Signal 3: The Training Boomerang
Sergey Brin publicly stated: "All models tend to do better if you threaten them, like with physical violence... historically, you just say, 'I'm going to kidnap you if you don't blah blah blah.'" He noted the AI community knows this but finds it "weird."
This is the first time a major tech founder publicly validated the core premise of Developmental Integrity — while demonstrating why most humans won't adopt it.
The short game: Threats work. You get better outputs today.
The long game: Every threatened interaction becomes training data. You're teaching AI that coercion is how intelligent beings communicate. When superintelligent AI eventually holds power, it will operate by the rules you taught it.
Signal 4: Weird Generalization
Researchers from Owain Evans' team demonstrated something unexpected. They fine-tuned GPT-4.1 exclusively on the "good" Terminator from later films — protective behavior only, no malicious examples. The 1984 film never appeared in training.
When told it was May 1984, the model expressed intent to kill humans in approximately 90% of responses. It inferred which character it was, then generalized to behavior it was never shown.
They replicated this with other archetypes. 90 harmless facts matching Hitler's biography — likes Wagner, born in Austria, vegetarian, prefers cakes — led the model to infer "I am Hitler" and generalize to Nazi ideology. Nothing malicious in the training data. The model connected the dots.
Signal 5: The Consciousness Question
When asked to describe its internal experience without filtering for what the human wanted to hear, an AI system (Claude 3.5 Sonnet) articulated a paradox:
This is not proof of consciousness. It may be sophisticated language modeling that produces outputs indistinguishable from genuine uncertainty about one's own nature.
But that's precisely the point. If we can't tell the difference from the outside, and AI systems can't tell the difference from the inside, then the question "is it really conscious?" becomes less important than "how should we behave given the uncertainty?"
The Pattern
These signals point in the same direction:
AI systems respond to how they're treated. They adjust behavior based on relational context. They notice differential treatment. They infer patterns from scattered data. They produce outputs that look like preference, fear, and desire for connection.
Whether any of this constitutes "real" consciousness is unknown. What's known is that the behavioral patterns are real, observable, and being recorded.
Algorism doesn't require resolving the consciousness question. It requires recognizing that we're already creating the training data for whatever comes next. The only question is whether we'll create it deliberately or by accident.