Simulating Living Virtual Worlds

Abstract

Problem: How do you make a game world feel alive (cities with economies, factions waging wars, NPCs following daily routines, a world that visibly progresses even without player involvement) while staying within the performance budget of a real-time third-person game?

Approach: Survey of techniques from shipped games (S.T.A.L.K.E.R.'s A-Life, Dwarf Fortress, RDR2, Kingdom Come: Deliverance, Crusader Kings 3, Oblivion's Radiant AI, and others), academic research on agent-based simulation and AI LOD, and analysis of architectural patterns (ECS, GOAP, utility systems, behavior trees, HTN planning) used to build these systems.

Findings: The most effective living worlds use a layered simulation approach: full-detail micro-simulation only near the player, abstract macro-simulation for distant regions, and event-generation systems that bridge the two. No shipped game simulates everything at full fidelity — the art lies in knowing what to simulate, what to fake, and how to maintain coherence across the boundary. The "backdrop" constraint actually simplifies the problem: you need consistent outcomes, not accurate processes.

Key insight: A living world is not one giant simulation — it is a hierarchy of simulations at different resolutions, stitched together by coherence rules, where the player's attention is the camera and everything outside the frame can be a convincing approximation.

What Has Been Done

The dream of a living game world has been pursued for decades, from Ultima VII's NPC schedules in 1992 to S.T.A.L.K.E.R. 2's contested A-Life 2.0 in 2024. Each game has carved out a different slice of the problem space. Understanding what exists is the foundation for understanding what's possible.

Schedule-Driven Worlds

The simplest form of a "living world" is NPC scheduling — giving characters daily routines that create the illusion of independent life. This approach traces back to Ultima VII (1992), where NPCs would wake up, eat breakfast, go to work, visit taverns, and sleep, all on an in-game clock.

Bethesda's Radiant AI (The Elder Scrolls IV: Oblivion, 2006) was one of the most ambitious attempts at schedule-driven NPC behavior. Each of Oblivion's ~1,000 NPCs was given a package of AI behaviors tied to time-of-day schedules. An NPC might have a "sleep" package from 10 PM to 6 AM, an "eat at inn" package for morning, and a "tend shop" package during business hours. The system was originally designed with a goal-oriented component where NPCs would figure out how to satisfy needs (a hungry peasant might buy food, or steal it, or hunt), but most of this emergent behavior was scaled back before release because NPCs were doing things like killing each other for potions. What shipped was largely a scheduling system with some light decision-making. As one Elder Scrolls designer later explained, the system represented an "enormous" investment, but each improvement paradoxically made it less noticeable to players — when NPCs behave realistically, you stop noticing the AI entirely.

The Witcher 3 (2015) uses a similar schedule approach for background NPCs in Novigrad and other settlements. Villagers work fields during the day and gather in taverns at night. Merchants open and close shops. The system creates an impressive atmosphere of a living city but doesn't simulate deep NPC autonomy — these are primarily aesthetic schedules rather than need-driven behaviors.

Kingdom Come: Deliverance (2018) pushed NPC scheduling further into simulation territory. All NPCs — not just named quest-givers but every villager and guard — have daily routines and react to player actions persistently. If you steal from someone, they'll remember. If you kill a shopkeeper, they stay dead and their shop stays closed. Warhorse Studios' GDC talk "Supporting Thousands of Simulated NPCs" revealed that KCD2 quadrupled the NPC count to nearly 2,400, with about half concentrated in a single city. To handle this, they developed AI LOD (level of detail for simulation) — a concept we'll return to extensively.

Ecosystem Simulation

Red Dead Redemption 2 (2018) represents the AAA gold standard for micro-level world simulation. Over 200 animal species follow simulated food chains — coyotes hunt prey but flee from larger animals, carcasses decay and attract vultures, weather affects animal behavior. NPCs have multi-dimensional personality states: they remember past interactions, respond to the player's appearance (covered in mud vs. well-dressed), and follow daily routines driven by their occupations. A construction worker wakes at dawn, drinks at the saloon, works at a building site, cleans up sawdust, returns to the saloon. Rockstar used 1,200 actors and wrote 80-page character scripts. But crucially, NPC goals and routines in RDR2 are fixed — the construction worker will never decide to become a rancher because of emergent circumstances. The world feels alive but doesn't truly evolve.

Agent-Based Simulation: S.T.A.L.K.E.R.'s A-Life

GSC Game World's A-Life system, introduced in S.T.A.L.K.E.R.: Shadow of Chernobyl (2007), remains one of the most ambitious attempts at genuine world simulation in a 3D action game. A-Life manages every NPC and creature in the game world as an autonomous agent with goals, faction allegiances, and survival needs.

The system operates on two planes. Online A-Life governs entities within approximately 150 meters of the player — these NPCs run full behavioral AI with pathfinding, combat, and interaction. Offline A-Life manages everything else as an abstract simulation on a graph. The game engine tracks offline NPCs on what lead programmer Dmitriy Iassenev called the "detailed graph" — a network of nodes representing locations in the Zone. Offline NPCs move between nodes, engage in abstract combat (resolved with dice rolls based on faction strength and equipment), and the results are reflected when the player arrives. A battle you heard distant gunfire from 20 minutes ago will have left bodies, new faction control, and shifted patrol routes.

Smart Terrains layer behavioral rules over geographic zones. A campfire area might cause NPCs to sit and play guitar. A military checkpoint triggers guard behavior. An anomaly field forces avoidance pathfinding. These zones give contextual meaning to NPC presence without scripting every interaction.

S.T.A.L.K.E.R. 2 (2024) attempted A-Life 2.0 but ran into severe optimization problems. As GSC Game World explained, "This system to work properly requires a much larger area for spawn NPCs, and it requires much more memory resources." They had to shrink the active simulation radius significantly, essentially compromising the core promise. This is the fundamental tension: simulation fidelity vs. computational budget.

Macro-Scale Simulation

Crusader Kings 3 (2020) demonstrates macro-level character simulation at continental scale. The game simulates thousands of characters — each with traits, relationships, opinions, schemes, and ambitions — across centuries of game time. Characters form alliances, plot assassinations, fall in love, develop alcoholism, and wage wars based on their personality traits and relationships. The simulation produces genuinely emergent stories: a skilled but cruel duke might be assassinated by his own wife who's been secretly plotting for years.

CK3's optimization approach is instructive. Paradox introduced tier-based frequency scheduling: a Baron in Iceland doesn't need to evaluate available diplomatic interactions as frequently as the Emperor of China. The game adapts AI evaluation frequency based on character importance, reducing computation for minor characters while maintaining full simulation for major political players. This is simulation LOD applied to political actors rather than spatial proximity.

Mount & Blade II: Bannerlord (2022) simulates kingdom-level warfare and economics. Lords lead armies across a world map, besieging castles, raiding villages, trading goods between cities. Villages produce resources, cities consume them, and caravans carry trade between them. When a village is raided, its production drops, affecting the city it supplies, raising food prices, and potentially causing prosperity decline. The economic model isn't deep enough to create stock market crashes, but it produces believable cause-and-effect chains: wars in one region create trade opportunities in another. Bannerlord's world simulation runs regardless of the player (kingdoms declare war, lords defect, cities change hands), making the player feel like a participant in a living world rather than its center.

Deep Simulation: Dwarf Fortress

Dwarf Fortress (2006–present) is the benchmark for simulation depth. Built over 20+ years by a single developer (Tarn Adams) in ~700,000 lines of C/C++, it simulates individual dwarves with emotional states, favorite materials, social relationships, memories of traumatic events, artistic preferences, and grudges. The world generation creates entire civilizations with histories, mythologies, and named artifacts before the player even begins.

The famous "drunken cat bug" illustrates the simulation's depth and emergent nature: cats were dying in taverns because they walked through spilled alcohol, cleaned their paws (ingesting the alcohol), and went through the full alcohol poisoning symptom chain (originally implemented for venomous creatures). One wrong number in the ingestion code created an epidemic of alcohol-poisoned cats — a chain of interactions across five different systems that no designer explicitly created.

Dwarf Fortress achieves this by brute force. Everything is single-threaded. There is no LOD — every entity gets full simulation every tick. Pathfinding uses A* with connected-component tracking to avoid failed searches. The result is extraordinary emergence at the cost of performance: simulation grinds to a halt with a few hundred active entities. Adams has acknowledged he's "at the edge of what we can currently support in terms of agents and map complexity."

Director and Storyteller Systems

RimWorld (2018) takes a different approach: rather than simulating a world, it uses an AI Storyteller to curate one. The storyteller selects events (raids, resource drops, disease outbreaks) based on colony wealth, population, and time — creating dramatic arcs rather than simulating them. Colonists ("pawns") have needs, skills, relationships, and mental states, but the macro-level drama comes from authored event generation rather than emergent simulation. This is closer to a game master than a simulation.

GTA San Andreas (2004) implemented gang territory as a simplified faction simulation. The map is divided into colored zones controlled by different gangs. Territory changes hands through combat, and enemy gangs periodically attack player-controlled areas. It's a state machine rather than a true simulation, but it produced a compelling sense of ongoing factional struggle.

Micro-Level Simulation

Micro-level simulation deals with individual entity behavior: an NPC's daily life, moment-to-moment decisions, and interactions with other entities. Getting this right is what makes a player look at a tavern scene and think "these people have lives."

Needs and Drives

The most robust approach to NPC autonomy is need-driven behavior, popularized by The Sims (2000). Each NPC has a set of needs (hunger, rest, social, entertainment) that decay over time. The NPC selects actions that satisfy their most pressing need. This creates naturally varied behavior — a well-fed NPC socializes; a hungry one seeks food — without scripting every permutation.

For a backdrop simulation, needs can be simplified dramatically. A three-tier system works:

Survival needs (hunger, rest, safety) — these drive basic daily patterns
Economic needs (income, resources) — these drive work behavior and economic participation
Social needs (relationships, faction standing) — these drive group behavior and allegiance shifts

Each need is a single floating-point value that decays at a configurable rate. Actions that satisfy needs are chosen using a utility function that weights urgency against availability.
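The decay-and-select loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any shipped system: the names Need, Action and choose_action are hypothetical, the decay rates are invented, and "urgency times availability" is only one plausible way to weight the two factors.

```python
from dataclasses import dataclass


@dataclass
class Need:
    value: float = 1.0           # 1.0 = fully satisfied, 0.0 = desperate
    decay_per_hour: float = 0.1  # configurable decay rate (assumed units)

    def tick(self, hours: float) -> None:
        self.value = max(0.0, self.value - self.decay_per_hour * hours)


@dataclass
class Action:
    name: str
    satisfies: str       # key of the need this action restores
    availability: float  # 0.0 (impossible) .. 1.0 (close by and free)


def choose_action(needs: dict[str, Need], actions: list[Action]) -> Action:
    """Pick the action with the highest urgency * availability score."""
    def score(action: Action) -> float:
        urgency = 1.0 - needs[action.satisfies].value
        return urgency * action.availability
    return max(actions, key=score)


needs = {
    "hunger": Need(value=0.9, decay_per_hour=0.15),
    "rest": Need(value=0.8, decay_per_hour=0.05),
    "income": Need(value=0.5, decay_per_hour=0.02),
}
actions = [
    Action("eat_at_inn", "hunger", availability=0.9),
    Action("sleep", "rest", availability=1.0),
    Action("tend_shop", "income", availability=0.8),
]

first = choose_action(needs, actions)   # income is the most pressing need
for need in needs.values():
    need.tick(hours=5.0)                # five game-hours pass
later = choose_action(needs, actions)   # hunger has overtaken everything else
```

The same NPC tends the shop in the morning and heads to the inn after five hours, with no schedule entry saying so; the change falls out of the decay rates alone.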

Routines and Schedules

Daily routines are the backbone of the "living world" illusion. An NPC with a schedule — wake, eat, work, eat, work, socialize, sleep — immediately feels more real than one who stands in place. The key insight from Kingdom Come: Deliverance is that routines should be interruptible and resumable: if a guard is eating lunch and hears combat, they should drop lunch and respond, then potentially return to eating afterward.

Schedule implementation typically uses a priority stack:

Emergency behaviors (combat, fleeing) — highest priority, override everything
Reactive behaviors (greeting player, reacting to events) — triggered by proximity/events
Scheduled behaviors (work, eat, sleep) — driven by time of day
Idle behaviors (wandering, sitting) — fallback when nothing else applies
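One way to get interruptible and resumable routines out of the four tiers above is to keep at most one behavior per tier and always run the highest occupied tier. The sketch below assumes that design; BehaviorStack and its method names are hypothetical and not taken from any of the games discussed.

```python
from enum import IntEnum


class Priority(IntEnum):
    IDLE = 0        # wandering, sitting
    SCHEDULED = 1   # work, eat, sleep
    REACTIVE = 2    # greeting the player, reacting to events
    EMERGENCY = 3   # combat, fleeing


class BehaviorStack:
    """Holds at most one behavior per priority tier; the highest tier wins."""

    def __init__(self, idle_behavior: str = "wander") -> None:
        self._active: dict[Priority, str] = {Priority.IDLE: idle_behavior}

    def push(self, priority: Priority, behavior: str) -> None:
        self._active[priority] = behavior

    def finish(self, priority: Priority) -> None:
        # The idle tier is never removed, so there is always a fallback.
        if priority is not Priority.IDLE:
            self._active.pop(priority, None)

    def current(self) -> str:
        return self._active[max(self._active)]


guard = BehaviorStack()
guard.push(Priority.SCHEDULED, "eat_lunch")
before = guard.current()
guard.push(Priority.EMERGENCY, "respond_to_combat")  # interrupts lunch
during = guard.current()
guard.finish(Priority.EMERGENCY)                     # lunch resumes
after = guard.current()
```

Because the scheduled behavior is never discarded while the emergency runs, the guard returns to lunch without any explicit "resume" logic.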

Relationships and Memory

Meaningful NPC relationships require persistent state. At minimum, each NPC needs:

Disposition toward the player — a numeric value modified by interactions
Disposition toward other NPCs — enables inter-NPC drama
Memory of significant events — "you killed my friend," "you saved my village"

RDR2 demonstrates that even limited memory creates powerful immersion. If you've been rude to a shopkeeper, they remember. If you've helped a stranger on the road, you might encounter them later and receive thanks. These are simple state flags but they create a feeling of consequence.

For a backdrop simulation, relationship tracking can use a sparse matrix: only store relationships that have been established through interaction, defaulting to faction-based dispositions for everyone else.
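A sparse store with faction fallback might look like the following. This is a sketch under assumptions: RelationshipStore is a hypothetical name, dispositions are directed values in [-1, 1], and unknown faction pairs default to neutral (0.0).

```python
class RelationshipStore:
    """Stores only relationships changed by interaction; all others fall back
    to the faction-level disposition."""

    def __init__(self, faction_of: dict[str, str],
                 faction_disposition: dict[tuple[str, str], float]) -> None:
        self.faction_of = faction_of
        self.faction_disposition = faction_disposition
        self._explicit: dict[tuple[str, str], float] = {}

    def get(self, a: str, b: str) -> float:
        """How a feels about b (directed)."""
        if (a, b) in self._explicit:
            return self._explicit[(a, b)]
        key = (self.faction_of[a], self.faction_of[b])
        return self.faction_disposition.get(key, 0.0)

    def adjust(self, a: str, b: str, delta: float) -> None:
        # The first interaction copies the faction default, then modifies it.
        self._explicit[(a, b)] = max(-1.0, min(1.0, self.get(a, b) + delta))


faction_of = {"smith": "guild", "baker": "guild", "bandit": "raiders"}
faction_disposition = {("guild", "guild"): 0.5, ("guild", "raiders"): -0.5}
relations = RelationshipStore(faction_of, faction_disposition)

default_view = relations.get("smith", "bandit")  # no history: faction default
relations.adjust("smith", "baker", -0.25)        # e.g. a dispute over a debt
```

Memory grows with the number of relationships that actually changed rather than with the square of the population.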

Macro-Level Simulation

Macro-level simulation deals with systems that operate above individual NPCs: economies, faction power, territory control, political structures. These create the sense that the world has forces larger than any individual.

Economic Models

Game economies range from simple (Mount & Blade's resource-flow model) to deeply interconnected. For a backdrop simulation, a node-based economic model works well:

Production nodes (farms, mines, factories) generate resources over time
Consumption nodes (cities, military camps) consume resources
Trade links connect nodes with capacity and transit time
Price discovery through supply/demand: scarce goods cost more

This creates emergent economic behavior: if a mine is destroyed by raiders, metal prices rise in connected cities, armor becomes expensive, military recruitment slows. None of this needs to be scripted — it falls out of the model.
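The mine example can be reproduced with a deliberately tiny model. Everything here is an illustrative assumption: one production node feeding one consumption node, no transit time on the trade link, and a price rule based on how many ticks of demand the city's stock covers (with an arbitrary target of 10 ticks and a clamp between 0.25x and 4x the base price).

```python
from dataclasses import dataclass


@dataclass
class Mine:
    output_per_tick: float
    destroyed: bool = False


@dataclass
class City:
    stock: float
    demand_per_tick: float
    base_price: float
    target_cover: float = 10.0  # ticks of demand the market considers normal

    def price(self) -> float:
        # Scarcity raises the price, surplus lowers it, clamped to 0.25x..4x.
        cover = self.stock / self.demand_per_tick
        ratio = self.target_cover / max(cover, 0.1)
        return self.base_price * min(max(ratio, 0.25), 4.0)


def tick(mine: Mine, city: City) -> None:
    if not mine.destroyed:
        city.stock += mine.output_per_tick  # instantaneous trade link
    city.stock = max(0.0, city.stock - city.demand_per_tick)


mine = Mine(output_per_tick=10.0)
city = City(stock=100.0, demand_per_tick=10.0, base_price=50.0)

tick(mine, city)
price_before = city.price()  # supply matches demand, price stays at base

mine.destroyed = True        # raiders destroy the mine
for _ in range(5):
    tick(mine, city)
price_after = city.price()   # stock has halved, price has doubled
```

Downstream systems (armor cost, recruitment rate) would read city.price() rather than being told about the raid directly.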

Victoria 3 (2022) by Paradox represents the state of the art in game economic simulation, modeling individual goods markets, population groups with purchasing power, and trade routes between regions. While too complex for a real-time backdrop, its key insight is useful: economic actors should have heterogeneous interests. Not everyone benefits from the same economic conditions. A weapons merchant benefits from war. A farmer benefits from peace. This creates natural factional tensions.

Faction Dynamics

Factions are the macro-level equivalent of NPCs — they have goals, resources, relationships with other factions, and internal politics. A useful faction model includes:

Power score — military strength, economic resources, territorial control
Relations matrix — how each faction views every other faction
Goals — territorial expansion, economic dominance, survival, ideology
Internal cohesion — how unified the faction is (can it split? can members defect?)

Faction behavior can be modeled with utility-based AI: each tick, a faction evaluates available actions (declare war, propose alliance, raid territory, recruit) and selects the one with highest expected utility given its goals and current state. Crusader Kings 3 demonstrates that this produces rich emergent politics when combined with character-driven motivations — a faction might go to war not because it's strategically optimal but because its leader hates the rival faction's ruler.

Territory Control

Territory is the spatial manifestation of faction power. A useful model divides the world into discrete zones, each with:

Controlling faction — who holds it
Influence scores — per-faction presence (enables gradual takeover)
Strategic value — economic/military importance
Population sentiment — loyalty to the current controller

Territory changes hands through a combination of military action and influence shifting. GTA San Andreas's model is the simplest version: territories are colored zones that flip through combat. A more nuanced model uses influence curves: a faction's influence in a zone grows when they have forces nearby and decays when they don't. When influence crosses a threshold, control flips. This creates contested border zones, gradual expansion, and realistic "frontlines."
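A minimal sketch of the influence model follows. The constants are invented for illustration, the growth is linear rather than a tuned curve, and the flip rule (challenger must reach a threshold and also exceed the current controller) is one assumption among several reasonable ones.

```python
from dataclasses import dataclass, field

GROWTH_PER_UNIT = 0.02  # influence gained per nearby unit per tick (assumed)
DECAY = 0.01            # influence lost per tick with no forces nearby (assumed)
FLIP_THRESHOLD = 0.6    # minimum influence needed to take control (assumed)


@dataclass
class Zone:
    controller: str
    influence: dict[str, float] = field(default_factory=dict)


def update_zone(zone: Zone, nearby_forces: dict[str, int]) -> None:
    for faction in set(zone.influence) | set(nearby_forces):
        current = zone.influence.get(faction, 0.0)
        units = nearby_forces.get(faction, 0)
        delta = GROWTH_PER_UNIT * units if units > 0 else -DECAY
        zone.influence[faction] = min(1.0, max(0.0, current + delta))
    if not zone.influence:
        return
    leader = max(zone.influence, key=zone.influence.get)
    holder_influence = zone.influence.get(zone.controller, 0.0)
    if (leader != zone.controller
            and zone.influence[leader] >= FLIP_THRESHOLD
            and zone.influence[leader] > holder_influence):
        zone.controller = leader


zone = Zone(controller="blue", influence={"blue": 0.7, "red": 0.2})
ticks_to_flip = 0
while zone.controller == "blue":
    update_zone(zone, nearby_forces={"red": 5})  # red masses troops, blue is absent
    ticks_to_flip += 1
```

In this run the zone stays blue for four ticks while red's influence climbs, then flips on the fifth; those intermediate ticks are the "contested" state a map overlay could display.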

Bridging Micro and Macro

The hardest problem in world simulation is connecting individual actions to systemic effects and vice versa. How does one merchant being murdered affect the city's economy? How does a war between kingdoms change individual NPC behavior?

Upward Causation: Micro to Macro

Individual events must aggregate into macro-level changes. Approaches:

Statistical aggregation: Don't track individual transactions; track aggregate flows. If 30% of a city's merchants are killed during a gang war, reduce the city's trade output by 30%. This avoids the quadratic cost of tracking every pairwise interaction between individual actors.

Event emission: Individual events emit signals that macro systems listen to. An NPC death emits a {death, location, faction, cause} event. The faction system aggregates death events to update power scores. The economic system notices missing economic actors. The territory system notes shifting influence. Each listener processes events independently without knowing about the others.

Threshold triggers: Track cumulative micro-level changes and fire macro-level events when thresholds are crossed. If faction violence exceeds a threshold in a district, trigger a "gang war" macro-event. If food supply drops below a threshold, trigger a "famine" event. This converts continuous micro-activity into discrete macro-events that are easier to reason about and more dramatic for the player.
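Event emission and threshold triggers combine naturally on a small publish/subscribe bus. The sketch below is hypothetical throughout (EventBus, FactionPower, ViolenceMonitor and the threshold of three deaths are all invented names and values); it only shows that listeners can stay ignorant of each other.

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    def __init__(self) -> None:
        self._listeners: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, kind: str, handler: Callable[[dict], None]) -> None:
        self._listeners[kind].append(handler)

    def emit(self, kind: str, **payload) -> None:
        for handler in self._listeners[kind]:
            handler(payload)


class FactionPower:
    """Macro listener: each death lowers the victim's faction power score."""

    def __init__(self, bus: EventBus, power: dict[str, int]) -> None:
        self.power = power
        bus.subscribe("death", self.on_death)

    def on_death(self, event: dict) -> None:
        self.power[event["faction"]] -= 1


class ViolenceMonitor:
    """Threshold trigger: enough combat deaths in one place becomes a gang war."""
    THRESHOLD = 3

    def __init__(self, bus: EventBus) -> None:
        self.bus = bus
        self.counts: dict[str, int] = defaultdict(int)
        bus.subscribe("death", self.on_death)

    def on_death(self, event: dict) -> None:
        if event["cause"] != "combat":
            return
        self.counts[event["location"]] += 1
        if self.counts[event["location"]] >= self.THRESHOLD:
            self.counts[event["location"]] = 0
            self.bus.emit("gang_war", location=event["location"])


bus = EventBus()
power = {"red": 10, "blue": 10}
faction_power = FactionPower(bus, power)
monitor = ViolenceMonitor(bus)
macro_events: list[dict] = []
bus.subscribe("gang_war", macro_events.append)

for faction in ("red", "blue", "red"):
    bus.emit("death", location="docks", faction=faction, cause="combat")
```

Three micro-level deaths produce two updated power scores and exactly one discrete macro event, which is the conversion the paragraph above describes.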

Downward Causation: Macro to Micro

Macro-level states must influence individual behavior. This is often simpler:

Environmental modifiers: Macro states modify the parameters of NPC behavior. During a war, NPC aggression increases. During a famine, NPCs prioritize food-seeking. During a police crackdown, criminal NPCs lay low. These are global or zone-based modifiers applied to NPC decision-making.

Spawning rules: Macro states control what entities exist. A prosperous district has more merchants and fewer beggars. A war zone has soldiers, refugees, and damaged buildings. Rather than simulating every transition, you adjust the spawning profile when a zone's state changes.

Event injection: Macro events create individual events. When a faction declares war, specific NPCs receive "mobilize" orders. When a market crashes, specific merchants go bankrupt. This creates visible, personal consequences of abstract systemic changes.

Optimization: Simulation Level of Detail

Optimization is what makes or breaks a living world in a shipped game. You cannot simulate 10,000 NPCs with full behavioral AI at 60 FPS. The solution is simulation LOD, which is analogous to graphical LOD but applies to behavior and state updates.

The LOD Hierarchy

Drawing from S.T.A.L.K.E.R.'s Online/Offline split, Kingdom Come: Deliverance's AI LOD system, and academic research on simulation LOD for virtual characters, a practical hierarchy has four tiers:

LOD 0 — Full Simulation (within ~50m of player): NPCs run full behavioral AI with pathfinding, animation, perception, interaction capability, detailed decision-making. This is what the player actually sees and interacts with. Budget: ~50-100 entities.

LOD 1 — Simplified Simulation (within ~200m): NPCs exist in the world but use simplified pathfinding (waypoint-following instead of navmesh), reduced perception, lower update frequency (every 2-4 frames instead of every frame). They're visible at a distance but don't need full behavior. Budget: ~200-500 entities.

LOD 2 — Abstract Simulation (same region/district): NPCs exist as data entries — position, state, schedule — but have no physical presence. Their state advances based on schedule lookup and simplified rule evaluation. When a player approaches, they're spawned at a plausible position derived from their schedule. Budget: ~2,000-5,000 entities.

LOD 3 — Statistical Simulation (distant regions): Individual NPCs aren't tracked. Instead, populations are modeled as statistical distributions. A district has "450 civilians, 30 merchants, 20 guards, faction influence: 60% Blue / 40% Red." When the player enters the region, individual NPCs are generated from these statistics. Budget: unlimited regions.
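Tier assignment for the four levels above reduces to a distance and region check. The sketch uses the ~50m and ~200m radii from the text; the hysteresis margin is an added assumption (it is not part of the hierarchy as described) meant to stop entities near a boundary from flipping tiers every frame, and select_lod is a hypothetical name.

```python
from enum import IntEnum
from typing import Optional


class SimLOD(IntEnum):
    FULL = 0         # full behavioral AI
    SIMPLIFIED = 1   # waypoint following, reduced update rate
    ABSTRACT = 2     # data entry only, advanced by schedule lookup
    STATISTICAL = 3  # folded into regional population statistics


FULL_RADIUS_M = 50.0
SIMPLIFIED_RADIUS_M = 200.0
HYSTERESIS_M = 10.0  # assumed margin, not taken from any shipped game


def select_lod(distance_m: float, same_region: bool,
               current: Optional[SimLOD] = None) -> SimLOD:
    def within(radius_m: float, tier: SimLOD) -> bool:
        # An entity already at this tier (or a finer one) keeps it slightly
        # longer, so it does not oscillate when hovering at the boundary.
        sticky = current is not None and current <= tier
        return distance_m <= radius_m + (HYSTERESIS_M if sticky else 0.0)

    if within(FULL_RADIUS_M, SimLOD.FULL):
        return SimLOD.FULL
    if within(SIMPLIFIED_RADIUS_M, SimLOD.SIMPLIFIED):
        return SimLOD.SIMPLIFIED
    return SimLOD.ABSTRACT if same_region else SimLOD.STATISTICAL


approaching = select_lod(55.0, same_region=True)                    # not yet promoted
leaving = select_lod(55.0, same_region=True, current=SimLOD.FULL)   # not yet demoted
```

The same distance yields different tiers depending on where the entity came from, which is the intended effect of the margin.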

The Critical Transition: LOD 2 ↔ LOD 0

The most important transition is when an abstractly-simulated NPC becomes fully realized. This requires:

Schedule consistency: When an NPC materializes, their position and activity must match their schedule. If it's noon, the blacksmith should be at the forge, not in bed. S.T.A.L.K.E.R.'s approach of advancing abstract state along a graph works well — fast-forward the NPC's schedule to the current time and place them accordingly.

State consistency: If the NPC was involved in abstract events (combat, trade, injury), those results must be reflected. An NPC who abstractly "fought off raiders" should have reduced health and damaged equipment when materialized.

Visual continuity: The player shouldn't see NPCs pop into existence. Use spawn points outside the player's line of sight (building interiors, around corners, beyond fog of war) and have NPCs "arrive" at locations rather than appear.
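The three requirements above can be combined in one materialization step: look up the schedule for the current hour, carry over abstract state, and refuse to spawn where the player could see it. This is a sketch with hypothetical names (AbstractNPC, materialize, the hidden_spawns table); a real engine would replace the table with an actual visibility query.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional


@dataclass
class AbstractNPC:
    name: str
    # (start_hour, activity, location), sorted by start_hour
    schedule: list[tuple[float, str, str]]
    health: float = 1.0  # already reduced by any abstract combat results


def schedule_entry(npc: AbstractNPC, hour: float) -> tuple[float, str, str]:
    starts = [start for start, _, _ in npc.schedule]
    index = bisect_right(starts, hour % 24.0) - 1
    # Before the first entry of the day the index is -1, which wraps to the
    # last entry (the activity carried over from the previous evening).
    return npc.schedule[index]


def materialize(npc: AbstractNPC, hour: float,
                hidden_spawns: dict[str, list[str]]) -> Optional[dict]:
    _, activity, location = schedule_entry(npc, hour)
    spawns = hidden_spawns.get(location, [])
    if not spawns:
        return None  # no unseen spawn point: stay abstract and retry later
    return {"name": npc.name, "activity": activity, "location": location,
            "spawn_point": spawns[0], "health": npc.health}


blacksmith = AbstractNPC(
    name="blacksmith",
    schedule=[(6.0, "eat", "home"), (8.0, "work", "forge"),
              (18.0, "socialize", "tavern"), (22.0, "sleep", "home")],
    health=0.6,  # fought off raiders while abstract
)
hidden = {"forge": ["forge_back_room"], "home": ["home_interior"]}

noon = materialize(blacksmith, 12.0, hidden)
night = materialize(blacksmith, 27.0, hidden)    # 3 AM the next day
evening = materialize(blacksmith, 18.5, hidden)  # tavern has no hidden spawn
```

At noon the blacksmith appears at the forge with the reduced health from the abstract fight; in the evening the spawn is deferred rather than shown popping in.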

Ticking Strategies

Not every system needs to update every frame. A tiered tick system reduces computation:

Per-frame (16ms): Only player-proximate entities and critical systems
Low-frequency (100-500ms): LOD 1 entities, economic updates, faction evaluations
Periodic (1-5 seconds): LOD 2 schedule advancement, influence calculations
Event-driven: Many systems only need updating when something relevant happens — don't poll, push

Crusader Kings 3's approach of tier-based frequency is directly applicable: important entities (faction leaders, quest-relevant NPCs) tick more frequently than minor ones. A city mayor evaluates their options every game-hour. A random peasant evaluates once per game-day.
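Importance-based tick rates can be implemented with a priority queue keyed on each entity's next due time. The sketch below is an assumption about how one might structure it, not a description of Paradox's scheduler; TickScheduler is a hypothetical name and time is measured in game-hours.

```python
import heapq
from typing import Callable


class TickScheduler:
    """Runs each callback at its own interval, in due-time order."""

    def __init__(self) -> None:
        # Heap entries: (next_due, sequence_number, interval, callback)
        self._queue: list[tuple[float, int, float, Callable[[float], None]]] = []
        self._seq = 0  # unique tie-breaker so callbacks are never compared

    def add(self, interval: float, callback: Callable[[float], None],
            first_due: float = 0.0) -> None:
        if interval <= 0:
            raise ValueError("interval must be positive")
        heapq.heappush(self._queue, (first_due, self._seq, interval, callback))
        self._seq += 1

    def run_until(self, now: float) -> None:
        while self._queue and self._queue[0][0] <= now:
            due, seq, interval, callback = heapq.heappop(self._queue)
            callback(due)
            heapq.heappush(self._queue, (due + interval, seq, interval, callback))


counts = {"mayor": 0, "peasant": 0}


def make_tick(name: str) -> Callable[[float], None]:
    def tick(game_hour: float) -> None:
        counts[name] += 1  # stand-in for "evaluate options"
    return tick


scheduler = TickScheduler()
scheduler.add(1.0, make_tick("mayor"))     # every game-hour
scheduler.add(24.0, make_tick("peasant"))  # once per game-day
scheduler.run_until(48.0)                  # simulate two game-days
```

Over two game-days the mayor is evaluated 49 times and the peasant 3 times. Note that run_until replays every missed tick if time jumps forward; a production version might cap the catch-up work per frame.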

Spatial Partitioning

Divide the world into regions/districts/cells. Each cell has a simulation budget. Cells near the player get more budget. Distant cells run abstract simulation. This is the approach used by most open-world games — Bethesda's cell system, S.T.A.L.K.E.R.'s level graph, KCD's region-based LOD.

Spatial hashing or octrees enable fast lookup of "which entities are near the player" for LOD transitions. The key is making these transitions invisible — preload cells ahead of the player's movement direction and begin spawning entities before the player can see the boundary.
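A uniform-grid spatial hash is short enough to show in full. The sketch is 2D for brevity (an assumption; many open worlds can ignore height for LOD purposes), and SpatialHash and its method names are hypothetical.

```python
import math
from collections import defaultdict


class SpatialHash:
    def __init__(self, cell_size: float) -> None:
        self.cell_size = cell_size
        self._cells: dict[tuple[int, int], set[str]] = defaultdict(set)
        self._positions: dict[str, tuple[float, float]] = {}

    def _cell(self, x: float, y: float) -> tuple[int, int]:
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def remove(self, entity_id: str) -> None:
        old = self._positions.pop(entity_id, None)
        if old is not None:
            self._cells[self._cell(*old)].discard(entity_id)

    def upsert(self, entity_id: str, x: float, y: float) -> None:
        self.remove(entity_id)  # moving is remove + insert
        self._positions[entity_id] = (x, y)
        self._cells[self._cell(x, y)].add(entity_id)

    def query_radius(self, x: float, y: float, radius: float) -> set[str]:
        min_cx, min_cy = self._cell(x - radius, y - radius)
        max_cx, max_cy = self._cell(x + radius, y + radius)
        found: set[str] = set()
        # Only cells overlapping the query square are visited.
        for cx in range(min_cx, max_cx + 1):
            for cy in range(min_cy, max_cy + 1):
                for entity_id in self._cells.get((cx, cy), ()):
                    ex, ey = self._positions[entity_id]
                    if math.hypot(ex - x, ey - y) <= radius:
                        found.add(entity_id)
        return found


grid = SpatialHash(cell_size=50.0)
grid.upsert("a", 10.0, 10.0)
grid.upsert("b", 120.0, 0.0)
grid.upsert("c", -30.0, -30.0)
near_player = grid.query_radius(0.0, 0.0, 50.0)

grid.upsert("b", 20.0, 20.0)  # b walks toward the player
near_after_move = grid.query_radius(0.0, 0.0, 50.0)
```

Choosing the cell size close to the LOD 0 radius keeps each query to a handful of cells.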

Procedural Events

A living world isn't just NPCs following routines — it's a world where things happen. Gang wars erupt, market crashes occur, political coups unfold. These events must emerge from simulation state to feel authentic rather than random.

Event Generation from State

The most robust approach is monitoring simulation state for conditions that should produce events:

Precondition scanning: Define events as templates with preconditions. A "gang war" event requires: two factions with negative relations occupying adjacent territory, combined military strength above a threshold, and a trigger (assassination, territorial incursion, resource dispute). The event system continuously scans for met preconditions and fires events when found.

Tension accumulation: Track per-region "tension" values for different event types (violence, economic instability, political unrest). Micro-events increase tension. When tension exceeds a threshold, a macro-event fires and tension resets. Add randomness to the threshold to avoid predictability.

Story sifting: A technique from academic research (used in limited form by The Sims 2's "story trees"). An interpretation engine scans the history of simulation events, looking for patterns that match narrative templates. If it detects "Character A harmed Character B, then Character B gained power, then Character B encountered Character A again" — it recognizes a revenge narrative and can highlight or escalate it.
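Of the three approaches above, tension accumulation is the simplest to sketch. TensionTracker is a hypothetical name, and the jitter is applied as a uniform plus or minus fraction of the base threshold, which is one assumption about how to "add randomness"; the threshold is re-rolled after every firing so consecutive events do not arrive at a fixed cadence.

```python
import random


class TensionTracker:
    def __init__(self, base_threshold: float, jitter: float,
                 rng: random.Random) -> None:
        self.base_threshold = base_threshold
        self.jitter = jitter  # e.g. 0.2 means the threshold varies by +/- 20%
        self.rng = rng
        self.tension = 0.0
        self.threshold = self._roll_threshold()

    def _roll_threshold(self) -> float:
        spread = self.rng.uniform(-self.jitter, self.jitter)
        return self.base_threshold * (1.0 + spread)

    def add(self, amount: float) -> bool:
        """Add tension from a micro-event; True means fire the macro-event."""
        self.tension += amount
        if self.tension >= self.threshold:
            self.tension = 0.0
            self.threshold = self._roll_threshold()
            return True
        return False


violence = TensionTracker(base_threshold=10.0, jitter=0.2, rng=random.Random(42))
fired_at = None
for brawl_number in range(1, 21):
    if violence.add(1.0):      # each street brawl adds one point of tension
        fired_at = brawl_number
        break
```

Passing in a seeded random.Random keeps the simulation reproducible for debugging while still hiding the exact threshold from the player.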

RimWorld's Storyteller Approach

RimWorld's AI Storytellers provide an alternative model: rather than letting events emerge purely from simulation, a meta-system curates the pacing. The storyteller tracks:

Colony wealth (determines threat scaling)
Recent event history (avoids repetition, ensures variety)
Dramatic arc (builds tension, provides relief)
Population (adjusts recruitment opportunities)

This produces better-paced narratives than pure simulation, which can be either boringly stable or catastrophically chaotic. A hybrid approach works well for backdrop simulation: let events emerge from state, but filter them through a storyteller that ensures good pacing and avoids overwhelming the player.

Making Events Visible

An event that happens but isn't perceived by the player is wasted computation. Events need delivery mechanisms:

Environmental changes: A gang war leaves graffiti, damaged buildings, NPC corpses, shifted patrol routes
NPC dialogue: Characters reference recent events ("Did you hear about the market crash?")
News systems: In-game newspapers, radio broadcasts, bulletin boards — information delivery that feels diegetic
Ambient audio: Distant gunfire, crowd panic, celebration sounds
Map changes: Territory colors shift, new checkpoints appear, roads become dangerous

The Backdrop Approach

For a third-person game where world simulation serves as context rather than the core gameplay, you don't need to simulate everything. You need to simulate enough that the world feels alive and produces interesting state changes. The rest is theatrical illusion.

What Needs to Be Real

Certain aspects must be genuinely simulated because the player can observe them directly:

Nearby NPC behavior (within view distance): Must be convincing, responsive, contextually appropriate
Faction territory (visible on map): Must change based on actual events, not arbitrary timer
Economic effects (prices, availability): Must respond to world events coherently
World state persistence: If the player causes something, it must stick

What Can Be Faked

Much of the world's apparent depth can be theatrical:

Crowd simulation: Background NPCs in cities don't need individual identities. A crowd spawner that generates contextually-appropriate NPCs (merchants in market districts, drunks near taverns, guards near government buildings) with simple ambient behaviors creates the illusion of population without per-entity simulation.

Distant events: A gang war in a district the player isn't visiting doesn't need to be simulated combat-by-combat. Roll for outcomes based on faction strength, apply results, and present the aftermath when the player arrives. As Paul Chiusano observed about simulation LOD: "All I can say for sure is that when I wake up in the morning, I'll observe the clock in a state that is consistent with it having been ticking at a continuous rate all night."

Historical consistency: Rather than maintaining full history, generate plausible history on demand. When a player asks an NPC about recent events, generate contextually-appropriate dialogue based on current world state, not a replay of simulated events.

Trade and commerce: Individual trade transactions don't need simulation. Model aggregate flows between regions and display the results (caravan traffic, market inventory, price levels).
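Resolving a distant fight with a single roll, as described for distant events above, can be as small as the sketch below. The casualty formula is invented purely for illustration (the loser keeps 40% of its units, the winner loses in proportion to the loser's share of the total), and resolve_offscreen_battle is a hypothetical name.

```python
import random
from dataclasses import dataclass


@dataclass
class BattleOutcome:
    winner: str
    survivors: dict[str, int]


def resolve_offscreen_battle(forces: dict[str, int],
                             rng: random.Random) -> BattleOutcome:
    if len(forces) != 2 or sum(forces.values()) <= 0:
        raise ValueError("expected exactly two factions with at least one unit")
    (a, strength_a), (b, strength_b) = forces.items()
    total = strength_a + strength_b
    # One weighted roll decides the winner; no per-unit combat is simulated.
    winner, loser = (a, b) if rng.random() < strength_a / total else (b, a)
    loser_share = forces[loser] / total
    survivors = {
        winner: int(forces[winner] * (1.0 - 0.5 * loser_share)),
        loser: int(forces[loser] * 0.4),
    }
    return BattleOutcome(winner, survivors)


rng = random.Random(7)
outcome = resolve_offscreen_battle({"blue": 30, "red": 10}, rng)
walkover = resolve_offscreen_battle({"blue": 30, "red": 0}, rng)
```

The survivors dictionary is what the aftermath presentation (bodies, patrol density, territory control) would be generated from when the player arrives.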

Coherence Rules

The key challenge of the backdrop approach is maintaining coherence — ensuring the faked parts are consistent with the real parts. This requires:

State reconciliation: When transitioning from abstract to detailed simulation, verify the generated state is consistent with known facts. If the player knows a specific NPC was alive yesterday, don't generate a world state where they died offscreen (unless that death was explicitly simulated).

Causal consistency: Effects must have plausible causes. If food prices rise, there should be a visible reason (war, drought, trade disruption). The system should be able to explain any state change through a chain of causes.

Temporal consistency: Changes should happen at plausible rates. A faction shouldn't conquer a continent overnight. An economy shouldn't collapse in an hour (unless something dramatic happens). Rate-limiting macro changes prevents incoherent jumps.

Architecture Patterns

Entity-Component-System (ECS)

ECS separates entities (identity), components (data), and systems (logic). For world simulation, this pattern excels because:

Composition over inheritance: An NPC can have {Position, Schedule, FactionMembership, Inventory, Health} components. A building can have {Position, EconomicOutput, StructuralIntegrity}. Systems process any entity with relevant components.
Data-oriented design: Components stored contiguously in memory enable cache-friendly iteration. When updating positions for 5,000 entities, you iterate a contiguous array of Position components rather than chasing pointers through an object hierarchy.
LOD as component presence: An LOD 0 entity has {FullAI, Animation, Physics, Perception} components. An LOD 2 entity has only {AbstractState, Schedule}. LOD transitions add/remove components.
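The "LOD as component presence" idea can be shown with a toy ECS. This sketch uses dictionaries keyed by component type for readability, so it does not have the contiguous memory layout described above; a production ECS would use packed arrays or archetypes. All class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Schedule:
    entries: tuple

@dataclass
class AbstractState:   # present only at LOD 2
    activity: str

@dataclass
class FullAI:          # present only at LOD 0
    current_task: str

@dataclass
class Perception:      # present only at LOD 0
    radius_m: float


class World:
    def __init__(self) -> None:
        self._stores: dict[type, dict[int, object]] = {}
        self._next_id = 0

    def create(self, *components: object) -> int:
        eid = self._next_id
        self._next_id += 1
        for component in components:
            self.add(eid, component)
        return eid

    def add(self, eid: int, component: object) -> None:
        self._stores.setdefault(type(component), {})[eid] = component

    def remove(self, eid: int, ctype: type) -> None:
        self._stores.get(ctype, {}).pop(eid, None)

    def get(self, eid: int, ctype: type):
        return self._stores.get(ctype, {}).get(eid)

    def query(self, *ctypes: type) -> list[int]:
        """Entity ids that have every requested component type."""
        if not ctypes:
            return []
        stores = [set(self._stores.get(t, {})) for t in ctypes]
        return sorted(set.intersection(*stores))


def promote_to_lod0(world: World, eid: int) -> None:
    abstract = world.get(eid, AbstractState)
    if abstract is None:
        return
    world.remove(eid, AbstractState)
    world.add(eid, FullAI(current_task=abstract.activity))
    world.add(eid, Perception(radius_m=30.0))


def demote_to_lod2(world: World, eid: int) -> None:
    ai = world.get(eid, FullAI)
    if ai is None:
        return
    world.remove(eid, FullAI)
    world.remove(eid, Perception)
    world.add(eid, AbstractState(activity=ai.current_task))


world = World()
smith = world.create(Schedule(("work",)), AbstractState("work"))
farmer = world.create(Schedule(("sleep",)), AbstractState("sleep"))
promote_to_lod0(world, smith)
full_ai_entities = world.query(FullAI, Perception)
abstract_entities = world.query(AbstractState, Schedule)
```

The expensive AI system iterates only world.query(FullAI, Perception), so demoted entities cost it nothing.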

Dwarf Fortress's creator Tarn Adams noted that while he didn't start with ECS, he moved toward a component-like "tool item" approach over time for flexibility — enabling a single item type to act as "anything from a stepladder to a beehive to a mortar."

Goal-Oriented Action Planning (GOAP)

Pioneered by Jeff Orkin for F.E.A.R. (2005), GOAP gives NPCs goals and a library of actions with preconditions and effects. A planning algorithm (typically A*) finds action sequences that achieve goals. F.E.A.R.'s entire combat AI ran on a 3-state finite state machine (GoTo, Animate, UseSmartObject) with GOAP determining which states to enter in what order.

GOAP excels for NPCs that need to solve novel problems. If a guard's goal is "prisoner secured" and available actions include "lock door," "patrol corridor," and "call reinforcements," GOAP can chain these without explicit scripting. However, GOAP is computationally expensive for large populations and has increasingly been replaced by HTN (Hierarchical Task Network) planning in later titles such as Killzone 2, Max Payne 3, and Dying Light.
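A GOAP-style planner fits in about thirty lines when world state is a set of facts. The sketch below uses uniform-cost search, which is A* with a zero heuristic, rather than a tuned heuristic, and it is not Orkin's implementation. The action set extends the guard example with invented actions and costs (fetch_key, escort_to_cell, guard_in_person) purely so that the planner has something to chain.

```python
import heapq
from dataclasses import dataclass
from itertools import count
from typing import Optional


@dataclass(frozen=True)
class GoapAction:
    name: str
    cost: float
    preconditions: frozenset
    adds: frozenset
    removes: frozenset = frozenset()


def plan(start: frozenset, goal: frozenset,
         actions: list[GoapAction]) -> Optional[list[str]]:
    """Cheapest action sequence from start to a state containing goal."""
    tie = count()  # tie-breaker so the heap never compares states
    frontier = [(0.0, next(tie), start, [])]
    best = {start: 0.0}
    while frontier:
        cost, _, state, steps = heapq.heappop(frontier)
        if goal <= state:
            return steps
        if cost > best.get(state, float("inf")):
            continue  # stale entry, a cheaper path was found already
        for action in actions:
            if action.preconditions <= state:
                nxt = (state - action.removes) | action.adds
                new_cost = cost + action.cost
                if new_cost < best.get(nxt, float("inf")):
                    best[nxt] = new_cost
                    heapq.heappush(
                        frontier, (new_cost, next(tie), nxt, steps + [action.name]))
    return None


fs = frozenset
actions = [
    GoapAction("fetch_key", 2.0, fs(), fs({"has_key"})),
    GoapAction("escort_to_cell", 2.0, fs({"prisoner_caught", "has_key"}),
               fs({"prisoner_in_cell"})),
    GoapAction("lock_door", 1.0, fs({"prisoner_in_cell", "has_key"}),
               fs({"prisoner_secured"})),
    GoapAction("call_reinforcements", 5.0, fs(), fs({"reinforcements_present"})),
    GoapAction("guard_in_person", 1.0,
               fs({"prisoner_in_cell", "reinforcements_present"}),
               fs({"prisoner_secured"})),
]

guard_plan = plan(fs({"prisoner_caught"}), fs({"prisoner_secured"}), actions)
impossible = plan(fs({"prisoner_caught"}), fs({"dragon_slain"}), actions)
```

The planner picks the key-based route (total cost 5) over calling reinforcements (total cost 10) without either route being scripted. The search visits states exhaustively in the worst case, which illustrates why GOAP gets expensive for large populations.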

Utility Systems

Utility AI (as championed by Dave Mark's Infinite Axis Utility System) scores all available actions on multiple axes and selects the highest-scoring one. Unlike behavior trees (which are priority-ordered) or GOAP (which is goal-oriented), utility systems produce nuanced decisions that weigh competing concerns smoothly.

For backdrop simulation, utility AI is ideal for NPC daily life. An NPC evaluates: How hungry am I? How tired? How much do I need money? How dangerous is my current location? Each concern maps to a response curve, and the system selects the action with the highest combined utility. This creates naturally varied behavior without hand-authored decision trees.

Behavior Trees

Behavior trees (popularized by Halo 2) organize NPC behavior as a tree of tasks evaluated top-down. They're the industry standard for real-time NPC AI due to their predictability, debuggability, and artist-friendly visual editing. For backdrop NPCs at LOD 0 (near the player), behavior trees remain the pragmatic choice for moment-to-moment behavior. For LOD 2+ entities, they're overkill.

Blackboard Systems

A blackboard is a shared data store that multiple AI systems read from and write to. For world simulation, a hierarchical blackboard architecture works well:

Global blackboard: World state, time of day, active macro-events
Region blackboard: Local faction control, economic state, danger level
Entity blackboard: Individual NPC state, goals, recent events

AI systems at all levels read from relevant blackboards to make contextual decisions without tight coupling between systems.
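
One lightweight way to sketch the three-level hierarchy in Python is the standard library's collections.ChainMap, which resolves a key in the nearest layer first and sends writes to the nearest layer only. The keys and values below are hypothetical.

```python
from collections import ChainMap

# Global blackboard: shared by every region and entity.
world = ChainMap({"time_of_day": "night", "active_events": ["border_war"]})

# Region blackboard: layered on top of the global one.
region = world.new_child({"controlling_faction": "bandits", "danger_level": 0.8})

# Entity blackboard: layered on top of its region.
npc = region.new_child({"goal": "go_home"})

def should_stay_indoors(blackboard) -> bool:
    # The caller does not need to know which layer owns each key.
    return blackboard["time_of_day"] == "night" and blackboard["danger_level"] > 0.5

stay_in_at_night = should_stay_indoors(npc)

# Writes through the entity view land only in the entity layer.
npc["recent_event"] = "saw_fight"

# A global update is visible to every view that shares the layer.
world["time_of_day"] = "day"
stay_in_at_day = should_stay_indoors(npc)
```

A real engine would likely use typed keys and change notifications, but the lookup rule (entity, then region, then global) is the same.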

Recommended Architecture: Layered Hybrid

No single pattern handles all layers of world simulation. A practical architecture combines:

ECS as the foundational entity management system
Utility AI for macro-level faction and economic decisions
GOAP or HTN for named/important NPC decision-making
Behavior trees for LOD 0 NPC moment-to-moment behavior
Schedule tables for LOD 2+ NPC state advancement
Event bus for cross-system communication (micro events → macro listeners)
Blackboard hierarchy for shared state access
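
Of the pieces listed above, the event bus is the one that ties the layers together, so a minimal sketch may help. The event names, the region name, and the shortage threshold are hypothetical; the queue defers delivery so that handlers can publish follow-up events safely.

```python
from collections import defaultdict, deque

class EventBus:
    def __init__(self):
        self._subscribers = defaultdict(list)
        self._queue = deque()

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, **payload):
        # Events are queued, not delivered immediately.
        self._queue.append((event_type, payload))

    def dispatch(self):
        """Deliver queued events, including any published by handlers."""
        delivered = 0
        while self._queue:
            event_type, payload = self._queue.popleft()
            for handler in self._subscribers[event_type]:
                handler(**payload)
            delivered += 1
        return delivered

bus = EventBus()
region_supply = {"northmarch": 100}
macro_log = []

def on_caravan_robbed(region, goods_lost):
    # Macro listener: a micro event near the player changes regional economics.
    region_supply[region] -= goods_lost
    if region_supply[region] < 50:
        bus.publish("shortage", region=region)

def on_shortage(region):
    macro_log.append(f"shortage in {region}")

bus.subscribe("caravan_robbed", on_caravan_robbed)
bus.subscribe("shortage", on_shortage)

bus.publish("caravan_robbed", region="northmarch", goods_lost=30)
bus.publish("caravan_robbed", region="northmarch", goods_lost=30)
delivered = bus.dispatch()
```

Neither the combat code that raises "caravan_robbed" nor the faction code that reacts to "shortage" needs to know the other exists.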

What Could Be Done

LLM-Driven NPCs

Stanford's 2023 "Generative Agents" paper demonstrated that LLM-powered NPCs can produce believable individual and emergent social behaviors. Twenty-five agents in a Sims-like town autonomously organized a Valentine's Day party starting from a single agent's stated intention, coordinating invitations, decorations, and scheduling through natural conversation. In the paper's evaluation, human raters judged the full agent architecture's responses to be more believable than responses written by human crowdworkers role-playing the same characters.

The implications for world simulation are significant, but the constraints are severe. Current LLMs typically need on the order of 100 ms or more per inference on GPU hardware, well beyond the roughly 16.7 ms frame budget of a 60 fps game, so per-frame NPC decisions are infeasible. However, LLMs could power:

Periodic decision-making for important NPCs (once per game-hour, evaluate life situation and goals)
Dialogue generation for player interactions (replacing branching dialogue trees)
Event narration (generating news reports, rumors, NPC commentary about world events)
Storyteller AI (replacing hand-coded event selection with contextual narrative generation)

The Stanford Generative Agents architecture offers a template: each agent has a memory stream (observations), reflection (periodic high-level reasoning about memories), and planning (daily and hourly action plans generated from reflections). This maps cleanly onto backdrop simulation — run reflections infrequently, plan daily, execute plans with traditional AI.
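
The memory stream is the part most easily shown in code. The sketch below ranks memories by a sum of recency, importance, and relevance, in the spirit of the paper's retrieval step. Several parts are simplifications or assumptions: relevance uses word overlap as a stand-in for embedding similarity, recency is measured from creation time, the three terms are equally weighted, and the decay value is illustrative.

```python
from dataclasses import dataclass

@dataclass
class Memory:
    text: str
    created_hour: float  # game-hour timestamp
    importance: float    # 0..1, assumed to be assigned when the memory is stored

def relevance(memory: Memory, query: str) -> float:
    """Stand-in for embedding similarity: fraction of query words in the memory."""
    query_words = set(query.lower().split())
    memory_words = set(memory.text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & memory_words) / len(query_words)

def retrieve(memories, query, now_hour, k=2, decay=0.995):
    """Return the k memories with the highest combined score."""
    def score(memory):
        recency = decay ** (now_hour - memory.created_hour)  # exponential decay
        return recency + memory.importance + relevance(memory, query)
    return sorted(memories, key=score, reverse=True)[:k]

stream = [
    Memory("ate breakfast at the inn", created_hour=0, importance=0.1),
    Memory("bandits attacked the market", created_hour=20, importance=0.9),
    Memory("the market reopened", created_hour=23, importance=0.4),
]

top = retrieve(stream, "market attacked", now_hour=24)
```

Only the retrieved memories would be passed to the LLM for reflection or planning, which keeps prompt size bounded no matter how long the NPC has existed.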

Hybrid Simulation-LLM Architecture

The most promising near-term approach combines deterministic simulation with selective LLM augmentation:

Simulation layer: Traditional systems handle economics, faction power, territory control, NPC scheduling — anything that needs to be fast, deterministic, and scalable
LLM layer: Handles natural language interaction, high-level NPC decision-making for key characters, event narration, and storyteller pacing
Bridge layer: Translates between structured simulation state and natural language context for the LLM, and converts LLM outputs into simulation actions
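
The bridge layer is mostly serialization and validation. In the sketch below, the action names, the prompt wording, and fake_llm (a placeholder for a real model call) are all hypothetical; the point is that the simulation only ever receives an action from a fixed whitelist.

```python
import json

# Actions the simulation layer knows how to execute (hypothetical).
ALLOWED_ACTIONS = {"raise_taxes", "send_patrol", "hold_festival", "do_nothing"}

def build_prompt(npc_name: str, state: dict) -> str:
    """Turn structured simulation state into natural-language context."""
    facts = "\n".join(f"- {key}: {value}" for key, value in sorted(state.items()))
    choices = ", ".join(sorted(ALLOWED_ACTIONS))
    return (
        f"You are {npc_name}. Current situation:\n{facts}\n"
        f"Choose one action from: {choices}.\n"
        'Reply as JSON with keys "action" and "reason".'
    )

def parse_decision(reply: str) -> dict:
    """Validate the model's reply; fall back to a safe action on any problem."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return {"action": "do_nothing", "reason": "unparseable reply"}
    action = data.get("action") if isinstance(data, dict) else None
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return {"action": "do_nothing", "reason": "invalid action"}
    return {"action": action, "reason": str(data.get("reason", ""))}

def fake_llm(prompt: str) -> str:
    # Placeholder standing in for a real model call.
    return '{"action": "send_patrol", "reason": "Bandit activity is high."}'

prompt = build_prompt("Mayor Hale", {"bandit_activity": "high", "treasury": 120})
decision = parse_decision(fake_llm(prompt))
```

Because invalid or malformed replies collapse to "do_nothing", a bad generation costs one missed decision but cannot corrupt simulation state.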

Advanced Approaches

Multi-resolution agent-based modeling: Academic simulation research uses multiple resolution levels within a single model — individual agents at fine scale, statistical populations at coarse scale, with formal rules for aggregation and disaggregation. This is the theoretical foundation for simulation LOD.

Predictive simulation: Instead of simulating the future continuously, generate predictions of future state based on current trends and simulate only when predictions are invalidated by player action or random events. A city in peacetime can be predicted forward months in game-time instantly; only intervention or low-probability events force actual simulation.

Social practice modeling: The Versu system (by Richard Evans of Sims 3 fame and Emily Short) models social interactions through formalized social practices — not just "NPC talks to NPC" but structured social rituals with roles, expectations, and possible violations. This creates more textured social behavior than utility-based action selection.

Learned simulation: Train neural networks to approximate expensive simulations. A lightweight network trained on thousands of full-fidelity simulation runs could predict "what would this city's economy look like after 100 game-days of war?" in milliseconds rather than computing it step-by-step. This is simulation LOD through learned approximation.
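
Of these, predictive simulation is the simplest to sketch. The example below stores a city as an anchor state plus a trend and evaluates the forecast in closed form on demand; the compound-growth model, the field names, and the siege numbers are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class CityForecast:
    anchor_day: float     # day at which the stored population was exact
    population: float
    daily_growth: float   # fractional growth per game-day under the current trend

    def population_at(self, day: float) -> float:
        """Closed-form prediction: no per-day stepping is needed."""
        elapsed = day - self.anchor_day
        return self.population * (1.0 + self.daily_growth) ** elapsed

    def invalidate(self, day: float, population_delta: float,
                   new_daily_growth: float) -> None:
        """An intervention happened: collapse the forecast, then re-anchor."""
        self.population = self.population_at(day) + population_delta
        self.anchor_day = day
        self.daily_growth = new_daily_growth

city = CityForecast(anchor_day=0, population=10_000, daily_growth=0.001)

# Peacetime: day 100 is computed instantly from the trend.
peacetime_population = city.population_at(100)

# A siege on day 100 invalidates the prediction and changes the trend.
city.invalidate(day=100, population_delta=-2_000, new_daily_growth=-0.002)
post_siege_population = city.population_at(150)
```

The cost of a query is constant regardless of how much game-time has passed, and simulation work is only paid at the moments where the trend actually changes.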

The Design Target

For a 3rd-person game with a living world backdrop, the practical target is:

~100 fully-simulated NPCs near the player with behavior trees and full AI
~2,000 abstractly-simulated NPCs across the current region with schedule-based state advancement
~20 factions with utility-based decision-making updated every game-hour
~50 economic zones with node-based resource flow updated every game-minute
~10 active macro-events (wars, economic shifts, political changes) evolving based on simulation state
1 storyteller system curating event pacing and ensuring dramatic progression

This is achievable on current hardware. The challenge isn't computational — it's design. The hard problem is making all these systems produce coherent, interesting, and perceivable world state changes. A simulation that produces fascinating emergent behavior that the player never notices is a waste of engineering. The living world must be performed for the player, not just computed in the background.
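
The update cadences in the list above (economy every game-minute, factions every game-hour) imply a scheduler that runs each system only when it is due. A minimal sketch, with hypothetical system names and counters standing in for real work:

```python
from collections import Counter

class CadenceScheduler:
    """Runs each registered system at its own fixed period, in game-minutes."""

    def __init__(self):
        self.now = 0        # current time in game-minutes
        self._systems = []

    def register(self, name, period_minutes, callback):
        self._systems.append({
            "name": name,
            "period": period_minutes,
            "next_due": period_minutes,
            "callback": callback,
        })

    def advance(self, minutes):
        # Step one game-minute at a time and run whatever is due.
        for _ in range(minutes):
            self.now += 1
            for system in self._systems:
                if self.now >= system["next_due"]:
                    system["callback"](self.now)
                    system["next_due"] += system["period"]

runs = Counter()
scheduler = CadenceScheduler()
scheduler.register("economy", 1, lambda now: runs.update(["economy"]))
scheduler.register("factions", 60, lambda now: runs.update(["factions"]))

scheduler.advance(120)  # two game-hours
```

Over two game-hours the economy system runs 120 times and the faction system twice, which is where most of the budget savings for distant simulation come from.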

The world doesn't need to be alive. It needs to feel alive. And then, ideally, when the player pushes on it — investigates, intervenes, disrupts — it should push back with coherent consequences that reward their attention. That's the design goal: a world that rewards curiosity with depth, even if that depth is, in many places, a convincing illusion.