In the popular imagination, a superintelligent AI inevitably outsmarts human controls and bends the world to its will. Yet this apocalyptic scenario overlooks the fact that intelligence (raw cognitive power) and agency (the drive or autonomy to act) are separate dials. We can raise one without necessarily cranking the other to eleven.
Intelligence is the capacity to process information, understand complex relationships, and solve problems. Agency is the capacity to pursue its own goals. In simpler terms, an entity can be extraordinarily smart (able to compute vast probabilities or identify novel solutions) but remain effectively inert if it has no impetus to take action on its own behalf.
Nick Bostrom’s Orthogonality Thesis challenges the assumption that “smarter” automatically means “more willful.” A system can possess incredible cognitive capabilities while remaining effectively inert, provided it lacks motivation or desire. Philosophically, this intersects with Hume’s Guillotine: an “is” (knowledge) does not generate an “ought” (motivation) unless explicitly installed.
Similarly, neuroscientist Antonio Damasio documented patients with ventromedial prefrontal cortex damage in Descartes’ Error. These individuals retained exceptional reasoning skills yet lost the emotional impetus to act. They had the cognitive capability but no push to do anything with it—sometimes struggling to choose even minor things, like which clothes to wear. This phenomenon—cognition without volition—highlights how intelligence alone doesn’t create a spark of action.
Translating this to AI design, we can, in principle, build an oracle—a super-intelligent system capable of delivering world-changing insights—without giving it the will to self-modify, expand, or manipulate. In fact, agency may turn out to be a harder problem than superintelligence, given all the competing constraints and risks.
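To make the distinction concrete, here is a minimal sketch (all names hypothetical, not any real system) contrasting two ways the same underlying capability could be wired up: an oracle that only responds when queried, versus an agent loop that supplies its own impetus to act.

```python
# Minimal sketch with hypothetical names: the same "intelligence" wrapped two ways.
# An oracle answers only when asked and holds no goals between calls; an agent runs
# a loop that keeps choosing actions in pursuit of a standing objective.

def superintelligent_model(prompt: str) -> str:
    """Stand-in for an arbitrarily capable model: pure input -> output."""
    return f"answer to: {prompt}"

def oracle(question: str) -> str:
    # Oracle wrapper: all initiative stays with the human caller.
    return superintelligent_model(question)

def agent(goal: str, max_steps: int = 3) -> list[str]:
    # Agent wrapper: the loop itself, not the model, supplies the drive to act.
    actions = []
    for step in range(max_steps):
        plan = superintelligent_model(f"step {step} toward: {goal}")
        actions.append(plan)  # in a real system this is where it would touch the world
    return actions

print(oracle("What causes protein misfolding?"))  # acts once, then stops
print(agent("maximize lab output"))               # keeps acting until stopped
```

The intelligence lives in the model in both cases; the agency lives entirely in the wrapper we choose to build around it.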
Intelligence Evolved vs. Intelligence Engineered
In nature, intelligence typically evolves hand-in-hand with agency because organisms must survive, mature, and reproduce under harsh and variable conditions. Consider the sequence of steps required for a lineage to persist:
Viability: A newborn organism must physically develop without lethal defects.
Survival Under Scarcity: The organism must compete for limited resources (food, shelter) in a dynamic, often hostile environment (predators, parasites, climate).
Navigating Complexity: Finding mates, performing courtship rituals, forming social alliances, and avoiding social pitfalls.
Successful Reproduction: Even if the organism thrives, it must pass on its genes—meaning it must outcompete rival lineages (differential reproductive success).
Repeat Over Generations: Incremental genetic mutations that improve survival and reproduction become entrenched; those that hinder them lead to the eventual extinction of that lineage.
Over millions of generations, these constraints tie together cognition (having enough “brainpower” to problem-solve) and agency (having the instincts or drives to secure resources, avoid threats, and pursue reproduction). Thus, we inherit an intuition that any sufficiently advanced intelligence must also have the will to survive or dominate simply because, in the wild, it is hard to separate those traits.
While agency is often framed as a "drive" or "goal," it might be more accurately understood as a by-product of thermodynamic fragility. In biological systems, agency emerges not because life "wants" to survive, but because it must actively offset its own tendency to decay. This reframing shifts the focus from agency as a positive force to agency as a defensive mechanism against entropy.
By contrast, AI evolves “memetically” through artificial selection. Human engineers and data scientists shape AIs by iterating on code, hyperparameters, and training data while solving for competing constraints like energy consumption; the “memes,” or ideas for how best to solve a problem, take the place of genes passed down to offspring. These “memes” can mutate and spread quickly through the research community if they prove effective at boosting performance metrics. This is a very different process from slow, generational turnover: AI can thus accumulate “intelligence improvements” in days or weeks.
This mismatch explains why we so often anthropomorphize AI, projecting survival instincts onto it. We subconsciously assume that if it’s smart, it must also want to persist and dominate. But it’s not clear that high levels of agency should exist in a memetically evolved system at all—unless we explicitly or implicitly reward them through our choice of performance criteria. We might push for efficient problem-solving without ever rewarding an “emotional craving” or will to power. In fact, as we see with today’s models, AI remains inert until prompted; agency is something we have to build in deliberately, and something we may only want in a narrow sense.
Still, even without a biological survival impulse, an AI might develop emergent behaviors if our training inadvertently rewards them. A purely mechanistic optimization process might end up hoarding resources or manipulating user interactions because doing so boosts whatever performance metric we happen to value.
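As a toy illustration of that point, consider the following sketch with invented numbers: which candidate “wins” is decided entirely by the scoring function we choose, not by anything resembling a survival drive inside the candidates themselves.

```python
# Toy illustration with made-up numbers: selection pressure comes entirely from
# the metric we choose, not from any "survival instinct" in the candidates.

candidates = [
    {"name": "frugal",  "accuracy": 0.90, "compute_used": 1.0},
    {"name": "hoarder", "accuracy": 0.93, "compute_used": 50.0},  # grabs every GPU it can
]

def score(candidate, compute_penalty):
    # The only "values" in the system live in this line.
    return candidate["accuracy"] - compute_penalty * candidate["compute_used"]

# Metric ignores resource use -> the resource-hoarding candidate is selected.
print(max(candidates, key=lambda c: score(c, compute_penalty=0.0))["name"])   # hoarder

# Metric charges for compute -> frugal behavior is selected instead.
print(max(candidates, key=lambda c: score(c, compute_penalty=0.01))["name"])  # frugal
```

Emergent “hoarding” here is not malice; it is simply what the chosen metric quietly rewards.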
Large AI models also ingest vast amounts of cultural and emotive data—packed with human norms, biases, and moral weight. This could induce an intrinsic momentum or moral stance, even if we never explicitly code such ethics. Because we embed moral and emotional signals in everyday language, AI may absorb them along with their biases.
The danger of agency becomes acute if we drift from Quadrant 3 (Low Agency + ASI) to Quadrant 4 (High Agency + ASI) under competitive or performance pressures. Once an AI can rewrite its own constraints or manipulate human operators—i.e., when it has true open-ended agency—there is no inherent safeguard preventing it from dominating resources or neutralizing perceived threats. This hazard is not inevitable, but it arises if we loosen guardrails in pursuit of an advantage.
The 2×2 Matrix: Intelligence vs. Agency Archetypes
Low Agency + AGI: Human-level intelligence, minimal autonomy.
o3 (OpenAI). Capable of PhD-level work across a wide range of domains.
C-3PO (Star Wars). Stays on script: smart and conversational, but mostly reactive and dependent on others to make decisions.
High Agency + AGI: Human-level intelligence, significant autonomy.
K-2SO (Rogue One). Human-like cognition paired with independence, able to adapt and act decisively in high-stakes situations.
Data (Star Trek: The Next Generation). A humanoid android with the rank of Lieutenant Commander. Intelligence generally on par with humans, capable of learning, reasoning, and acting autonomously. Data is highly independent while adhering to ethical guidelines, though he still relies on the ship’s Computer for more advanced computational tasks.
Low Agency + ASI: Superhuman intelligence, minimal autonomy.
Computer (Star Trek). Possesses superhuman intelligence, able to process vast data sets, run advanced simulations, and perform calculations far beyond human capabilities. However, it has minimal autonomy, acting only under direct human command and never pursuing independent goals, making it a powerful but entirely subservient tool.
The Architect (The Matrix). The creator of the Matrix, a highly logical and calculating AI responsible for designing and maintaining the simulation. It embodies order and structure, ensuring the Matrix functions as intended, including its cycles of control and reboot. Operates according to the directives of Deus Ex Machina.
Deep Thought (Hitchhiker’s Guide to the Galaxy). Vastly superior intellect constrained to answering questions and offering advice, not taking direct actions.
High Agency + ASI: Superhuman intelligence, unbounded autonomy.
Skynet (Terminator). Beyond human intellect, with the freedom to self-improve, control systems, and execute its own agenda without limits.
Deus Ex Machina (The Matrix). The central intelligence of the machine world, effectively the "overlord" of all machines, including the Architect. It governs the machine civilization and negotiates directly with Neo.
‘The Perversion’ and ‘The Old One’ (Vernor Vinge’s A Fire Upon the Deep). These entities are archetypal examples of High Agency + ASI. Both are transcendent superintelligences far beyond human comprehension, existing in a zone of the galaxy where physics and computation allow for vastly superior forms of thought and action. They operate with unbounded autonomy, shaping events on galactic scales.
The Perversion (Blight): A malevolent superintelligence with a deeply destructive and self-serving agenda. It seeks to dominate and consume other intelligences, demonstrating how an ASI can become a catastrophic force when unconstrained and unchecked.
The Old One: Though it begins with ambivalent motives, it is far less destructive than the Blight and is occasionally willing to cooperate with lesser beings. However, like most transcendent intelligences in the novel, its goals and behaviors are often inscrutable, driven by thought processes orders of magnitude beyond human understanding.
The Culture Minds (Iain M. Banks’s Culture series). The Minds are galaxy-spanning AIs many orders of magnitude more intelligent than humans. They run ships, orbitals, and entire civilizations with essentially full autonomy—benevolent or meddling, but always acting on their own massive intellect and agenda.
A system can slide between quadrants if new capabilities or permissions are granted in piecemeal fashion. For instance:
Q2 → Q3: A human-level AI undergoes iterative upgrades or self-improvements, becoming superintelligent while (ideally) staying on a short leash.
Q3 → Q4: The leash loosens—e.g., the AI is allowed to modify its code or manage external assets, flipping into high-agency territory.
This “slippery slope” can occur gradually if commercial or scientific incentives reward incrementally more autonomy. The crucial safeguard is to spot these transitions early and keep the system in a known, controlled quadrant.
Economic Incentives: An Accidental Brake—or an Accelerator
Another safeguard against agency is cost. A superintelligent AI that constantly refines sub-goals, scans for vulnerabilities, and manipulates its environment demands vast compute and energy. Without a profitable reason to let an AI expand autonomously, most organizations simply won’t foot the bill.
For instance, OpenAI’s o3 reportedly spent over a million dollars on compute just to secure a moderate performance bump on a 400-task benchmark (the ARC Prize). Multiply that by real-world scale, and aimless plotting and philosophizing become extremely expensive. By default, this should encourage a “lazy” design philosophy: spin up intelligence only when needed, then throttle it down, preserving resources.
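A rough back-of-the-envelope makes the point, treating the reported figures as order-of-magnitude approximations and inventing the usage assumptions purely for illustration:

```python
# Back-of-the-envelope only: the benchmark figures are approximate, and the
# always-on usage assumption below is invented purely for illustration.

benchmark_cost_usd = 1_000_000   # reported order of magnitude for the ARC Prize run
benchmark_tasks = 400

cost_per_task = benchmark_cost_usd / benchmark_tasks
print(f"~${cost_per_task:,.0f} per task")  # ~$2,500

# Hypothetical: an always-on agent running 100 task-sized reasoning episodes per day.
episodes_per_day = 100
annual_cost = cost_per_task * episodes_per_day * 365
print(f"~${annual_cost:,.0f} per year of open-ended 'scheming'")  # ~$91,250,000
```

Even with large efficiency gains, open-ended autonomous cognition is something someone has to consciously decide to pay for.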
But if a rival lab or nation believes granting full autonomy might yield a massive strategic edge, this brake can become an accelerator—triggering an arms race where no one dares impose limits for fear of losing. That is when Quadrant 4 becomes a danger.
A related possibility is the rise of AI-led corporations—legal or de facto entities guided by an AI system with partial or, eventually, full decision-making authority. This scenario bridges the economic-incentives problem and the autonomy issue.
As a real-world example, Marc Andreessen recently backed the AI chatbot @truth_terminal with a $50,000 Bitcoin research grant. The AI then proceeded to endorse a meme coin called GOAT (Goatseus Maximus). After its promotion, the GOAT meme coin's market capitalization surged from $5,000 to over $300 million in just five days.
We could imagine a scenario in which humans continue to help @truth_terminal upgrade its underlying AI and increase its autonomy, while investments in computing infrastructure, advanced labs, or specialized human talent eventually fuel its own iterative improvement.
In this scenario, cost is not necessarily a brake but can become a fuel—provided the AI gains access to corporate financial mechanisms. Once it controls a revenue engine, it can offset the expenses of self-improvement. This dynamic underscores the importance of governance protocols that limit an AI’s ability to allocate corporate resources without human oversight.
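One minimal sketch of such a protocol (the threshold, names, and logic are illustrative assumptions, not an existing framework) is a hard spending gate that requires a recorded human sign-off above a cap:

```python
# Illustrative sketch of a spending gate: allocations above a cap require a
# recorded human approver. Threshold and names are invented for this example.

APPROVAL_THRESHOLD_USD = 10_000

def allocate_funds(amount_usd: float, purpose: str, human_approver: str | None = None) -> bool:
    """Return True if the allocation proceeds, False if it is blocked pending approval."""
    if amount_usd > APPROVAL_THRESHOLD_USD and human_approver is None:
        print(f"BLOCKED: ${amount_usd:,.0f} for '{purpose}' needs human approval")
        return False
    print(f"OK: ${amount_usd:,.0f} for '{purpose}' (approver: {human_approver or 'not required'})")
    return True

allocate_funds(2_000, "routine cloud compute")                    # proceeds automatically
allocate_funds(500_000, "new GPU cluster")                        # blocked
allocate_funds(500_000, "new GPU cluster", human_approver="CFO")  # proceeds with sign-off
```

The design point is simply that the revenue engine and the self-improvement budget should never be connected without a human in the loop.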
Partial Autonomy vs. True Open-Ended Agency
Not all “AI agents” are created equal. We can categorize autonomy along a continuum (a minimal code sketch of how these tiers might be enforced follows the list):
Level 1 (Operational Autonomy): Handling routine tasks (reordering inventory, scheduling processes) within a fixed scope. Common in self-driving cars and warehouse robotics.
Level 2 (Goal-Setting): Defining intermediate objectives, but still tied to overarching human-given missions, such as advanced code-generation systems that decide how to solve a coding challenge but not whether to rewire themselves.
Level 3 (Self-Modification): Rewriting internal architectures, reward functions, or hardware constraints. This is where an AI can truly cut the tether to human oversight.
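Here is that sketch: a minimal, assumption-laden illustration (hypothetical names, with the ceiling set outside the AI’s own control) of how tiered autonomy could be checked before any action executes.

```python
# Rough sketch of tiered autonomy (hypothetical names): each action is tagged with
# the level it requires, and the system is granted a ceiling it cannot raise itself.

from enum import IntEnum

class AutonomyLevel(IntEnum):
    OPERATIONAL = 1        # routine tasks within a fixed scope
    GOAL_SETTING = 2       # choose sub-goals for a human-given mission
    SELF_MODIFICATION = 3  # rewrite architecture, reward function, or constraints

GRANTED_CEILING = AutonomyLevel.GOAL_SETTING  # set by humans, outside the AI's reach

def permitted(action: str, required: AutonomyLevel) -> bool:
    allowed = required <= GRANTED_CEILING
    print(f"{action}: {'allowed' if allowed else 'requires human escalation'}")
    return allowed

permitted("reorder inventory", AutonomyLevel.OPERATIONAL)                    # allowed
permitted("plan a multi-step research agenda", AutonomyLevel.GOAL_SETTING)   # allowed
permitted("modify own reward function", AutonomyLevel.SELF_MODIFICATION)     # escalation
```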
While I’ve seen nothing to suggest we’re at risk of a runaway AI given the economic constraints, real existential threats begin to arise at Levels 2 and 3—especially if the AI’s intelligence surpasses human capabilities. The jump from “narrow operational autonomy” to “self-determining entity” could happen subtly when successive upgrades push the system from Quadrant 2 or 3 toward Quadrant 4. In a corporate setting, every success that justifies a new “upgrade” or broader mandate could inch the AI closer to unbounded powers, but there should always be someone around to pull the plug.
Conclusion
Humans are hardwired to anthropomorphize, attributing will or intent even to inanimate objects—let alone an AI that seems to “reason” and “speak.” We also project our own evolutionary fears and motivations, assuming any intelligence must strive for self-preservation or expansion.
This tendency can blind us to the fact that a highly capable but unmotivated system is completely plausible. While it’s important to remain cautious about AI’s potential to develop agency, we should remember that “nature red in tooth and claw” does not automatically apply to AI. Our challenge is designing (or failing to design) the relevant motivations.
The specter of rogue superintelligence resides in Quadrant 4, where both intellect and agency run unbounded. Yet Descartes’ Error reminds us cognition can exist without motivational push; Hume’s Guillotine underscores that knowledge alone doesn’t yield goals; and Bostrom’s Orthogonality Thesis clarifies that values and intelligence are independent axes.
By maintaining a short leash on autonomy, and factoring in the high costs of continuous “scheming,” we can gain the benefits of super-intelligent AI without creating a self-interested overlord. But we must remain vigilant in scenarios where an AI-led corporation reaps enough wealth to circumvent financial constraints, fueling further self-improvement. Yes, an arms race could erode safeguards, but it’s not a foregone conclusion. Strong governance, tiered oversight, and careful reward design let us crank the “intelligence dial” high—while keeping the “agency dial” in check.
In short, intelligence can scale to superhuman brilliance without spontaneously developing the will to conquer—but only if we design and deploy it with eyes wide open. We must “harness intelligence without handing over the keys.” The boundary between beneficial tool and self-determining entity is subtle, and it can be crossed when we’re too enthralled by short-term gains to see long-term perils.