Abstract
Problem: How do game developers learn how players are actually playing their game so they can balance it and fix issues?
Approach: Tim Cain describes the telemetry systems he used at Carbine Studios (WildStar) and Obsidian (Pillars of Eternity), and reflects on what he wished he'd had on Fallout, covering the server infrastructure, packet design, and analysis techniques.
Findings: Effective telemetry requires a dedicated team, a server receiving timestamped/location-tagged event packets, and thoughtful analysis via heat maps and "moments of interest" reports. The key is working backwards: decide what responses you'd make to problems, which tells you what analysis you need, which tells you what data to collect.
Key insight: Your expectations about how the game should be played drive the entire telemetry pipeline, from what data you collect, to how you analyze it, to what fixes you make.
The Telemetry Server
The telemetry server is a machine at the office or data center. When a game client starts up and connects to the game server, the game server tells the telemetry server to expect packets from that validated client. This verification step prevents spam. At Carbine, they used MySQL to store events, with a database expert named Twain Martin (who also worked at Obsidian) handling the SQL side.
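The handshake above can be sketched roughly as follows. This is a minimal illustration, not the Carbine implementation: the class and method names (`TelemetryServer`, `expect_client`, `receive`) and the table layout are hypothetical, and Python's built-in `sqlite3` stands in for MySQL so the example runs without a database server.

```python
import sqlite3


class TelemetryServer:
    """Accepts packets only from clients the game server has vouched for."""

    def __init__(self):
        # In-memory SQLite stands in for the MySQL store (assumption).
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE events (client_id TEXT, timestamp REAL, payload TEXT)"
        )
        # Client ids announced by the game server after validation.
        self.expected_clients = set()

    def expect_client(self, client_id):
        """Called by the game server once it has validated a client."""
        self.expected_clients.add(client_id)

    def receive(self, client_id, timestamp, payload):
        """Store the packet if the client is expected; otherwise drop it."""
        if client_id not in self.expected_clients:
            return False
        self.db.execute(
            "INSERT INTO events VALUES (?, ?, ?)", (client_id, timestamp, payload)
        )
        return True

    def event_count(self):
        return self.db.execute("SELECT COUNT(*) FROM events").fetchone()[0]


server = TelemetryServer()
server.expect_client("client-42")
accepted = server.receive("client-42", 12.5, "player_died")
rejected = server.receive("unknown-client", 12.6, "spam")
```

Only the packet from the announced client is stored; the other is discarded before it reaches the database.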
Three Types of Packets
All packets are timestamped and location-tagged. They fall into three categories.
Discrete In-Game Events
The core of telemetry. These fire when the player takes damage, uses an ability, dies, or transfers between maps. Each packet records what happened, where, when, to whom, and with what. For example: "Player took 15 fire damage from trap X at location Y on map Z at time T."
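One possible shape for such a packet, using the example sentence above. The field names and the JSON encoding are assumptions made for illustration; the talk does not specify a wire format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class DamageEvent:
    timestamp: float   # when (time T)
    map_id: str        # where (map Z)
    x: float           # where (location Y)
    y: float
    player_id: str     # to whom
    source_id: str     # from what (trap X)
    damage_type: str   # with what
    amount: int

    def to_packet(self):
        """Serialize the event as a JSON string tagged with its event type."""
        return json.dumps({"event": "damage_taken", **asdict(self)})


event = DamageEvent(1024.5, "map_z", 31.0, 8.5, "player_1", "trap_x", "fire", 15)
packet = event.to_packet()
```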
Non-Discrete Events (Movement)
Player movement is continuous, so you send small periodic packets (location, direction, speed) every tenth of a second, quarter second, or full second depending on the heat map density you need. More density means more packets means more storage, so you find a happy medium.
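The trade-off is simple arithmetic. The sketch below compares the three sampling intervals mentioned above; the 32-byte packet size and the player count are made-up figures chosen only to show the scaling.

```python
SECONDS_PER_HOUR = 3600
PACKET_BYTES = 32  # assumed size of one movement packet


def movement_cost_per_hour(interval_seconds, players):
    """Return (packets, bytes) generated per hour of play."""
    packets = round(SECONDS_PER_HOUR / interval_seconds) * players
    return packets, packets * PACKET_BYTES


for interval in (0.1, 0.25, 1.0):
    packets, size = movement_cost_per_hour(interval, players=1000)
    print(f"{interval:>4}s interval: {packets:,} packets, {size / 1e6:.1f} MB per hour")
```

Going from one packet per second to ten per second multiplies both packet count and storage by ten, which is why the sampling rate is chosen per use case rather than maximized.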
Meta Events
Discrete but non-gameplay events: saving, loading, adjusting options (especially combat difficulty), and quitting. These are critically important. When do players save? When do they reload? When do they lower difficulty? When do they quit? That last one is especially revealing.
Analysis: Heat Maps
The collected data gets aggregated into heat maps: visual overlays on game maps showing the density of events.
Tim, being colorblind, made sure heat maps used brightness (the V in HSV color space) rather than relying solely on color, so they'd still be readable in grayscale.
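A minimal sketch of that idea using Python's standard `colorsys` module: hue and saturation stay fixed and only V changes with event density, so a denser cell is always brighter. The function name and the choice of a red hue are assumptions, not details from the talk.

```python
import colorsys


def heat_color(count, max_count, hue=0.0):
    """Map an event count to an RGB color whose brightness encodes density."""
    value = 0.0 if max_count <= 0 else min(count / max_count, 1.0)
    # Fixed hue and full saturation; only V (brightness) varies.
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
    return round(r * 255), round(g * 255), round(b * 255)


empty_cell = heat_color(0, 10)
busy_cell = heat_color(10, 10)
```

Because brightness rises monotonically with the count, the ordering of cells survives a conversion to grayscale.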
Motion Heat Maps
Show where all players went on a particular map. Two key things to look for: where players spend the most time, and whether they reach places they shouldn't. Tim tells a story from WildStar where a cave entrance at the top of a mountain was intended to be accessible only by flying, but heat maps revealed players had found a jumpable path up the mountainside. They fixed it.
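Both checks can be sketched over a coarse grid. This assumes movement samples arrive as (x, y) positions at a fixed interval and that designers can list the grid cells they expect players to reach; the cell size, interval, and function names are hypothetical.

```python
from collections import Counter

SAMPLE_INTERVAL = 0.25  # seconds between movement packets (assumed)
CELL_SIZE = 10.0        # world units per grid cell (assumed)


def dwell_seconds(samples, cell_size=CELL_SIZE, interval=SAMPLE_INTERVAL):
    """Estimate time spent per grid cell from periodic position samples."""
    cells = Counter((int(x // cell_size), int(y // cell_size)) for x, y in samples)
    return {cell: n * interval for cell, n in cells.items()}


def unexpected_cells(dwell, expected_cells):
    """Cells players visited that the designers did not expect to be reachable."""
    return sorted(cell for cell in dwell if cell not in expected_cells)


samples = [(1, 1), (2, 2), (3, 3), (4, 4), (15, 2), (25, 31)]
dwell = dwell_seconds(samples)
surprises = unexpected_cells(dwell, expected_cells={(0, 0), (1, 0)})
```

Any cell in `surprises` is a candidate for the kind of unintended path found on the WildStar mountainside.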
Damage and Death Heat Maps
Aggregate all damage taken or deaths at specific locations. Reveals the most dangerous areas on the map.
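A damage heat map differs from a motion heat map mainly in that each event carries a weight. A small sketch, assuming damage events arrive as (x, y, amount) tuples and using a hypothetical 10-unit grid:

```python
from collections import defaultdict


def damage_heat_map(damage_events, cell_size=10.0):
    """Sum the damage taken in each grid cell."""
    totals = defaultdict(int)
    for x, y, amount in damage_events:
        totals[(int(x // cell_size), int(y // cell_size))] += amount
    return dict(totals)


def most_dangerous(heat, top_n=3):
    """Rank cells by total damage, highest first."""
    return sorted(heat.items(), key=lambda item: item[1], reverse=True)[:top_n]


events = [(3.0, 4.0, 15), (7.5, 1.0, 20), (12.0, 4.0, 5), (55.0, 72.0, 30)]
heat = damage_heat_map(events)
ranking = most_dangerous(heat, top_n=2)
```

A death heat map is the same aggregation with every event given a weight of 1.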
Analysis: Moments of Interest
Beyond heat maps, you can generate reports flagging specific thresholds: locations with over 50 deaths per hour, places where players save frequently, spots where players reduce combat difficulty, or, critically, where players quit unexpectedly.
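A threshold report of this kind can be very small. The sketch below uses the 50-deaths-per-hour figure from the text; the location names, the input format, and the function name are invented for the example.

```python
from collections import Counter

DEATHS_PER_HOUR_THRESHOLD = 50  # threshold quoted in the text


def deadly_locations(deaths, observed_hours, threshold=DEATHS_PER_HOUR_THRESHOLD):
    """Return {location: deaths per hour} for locations above the threshold."""
    counts = Counter(deaths)
    rates = {location: n / observed_hours for location, n in counts.items()}
    return {location: rate for location, rate in rates.items() if rate > threshold}


# One entry per recorded death, tagged with where it happened (sample data).
deaths = ["bridge"] * 130 + ["cave"] * 40
report = deadly_locations(deaths, observed_hours=2)
```

The same pattern applies to saves, difficulty changes, and quits by swapping the event type and the threshold.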
The Tomb Raider Jump Story
Tim shares an anecdote (he notes he can't confirm it) about a Tomb Raider / Lara Croft game where telemetry revealed a simple jump that was killing and frustrating players. The ledge curved inward, and players would try to jump early rather than running to the closest point. Adding a small railing guided players to the correct jump point and fixed the problem.
Responses: Working Backwards from Expectations
The most important principle: your expectations drive the whole system.
You expect players to quit in safe places or at story stopping points, so you look for quits happening mid-dungeon next to high-damage areas (rage quits). You expect boss areas to have high death rates, but you check whether certain classes die disproportionately, revealing balance problems. You expect all character builds to progress at roughly the same rate, so you compare XP gain and completion levels across combat, stealth, and dialogue builds.
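The rage-quit check, for example, could look something like this. It assumes quit events carry a position, safe zones are circles given as (center, radius), and danger spots come from a damage heat map; the 15-unit radius and all names are hypothetical.

```python
import math


def rage_quit_candidates(quits, safe_zones, danger_spots, radius=15.0):
    """Flag players who quit outside every safe zone but close to a danger spot.

    quits:        list of (player_id, x, y)
    safe_zones:   list of ((x, y), radius) circles where quitting is expected
    danger_spots: list of (x, y) high-damage locations
    """
    flagged = []
    for player_id, x, y in quits:
        position = (x, y)
        in_safe_zone = any(math.dist(position, c) <= r for c, r in safe_zones)
        near_danger = any(math.dist(position, s) <= radius for s in danger_spots)
        if not in_safe_zone and near_danger:
            flagged.append(player_id)
    return flagged


quits = [("a", 5, 5), ("b", 104, 97), ("c", 300, 300)]
flagged = rage_quit_candidates(
    quits,
    safe_zones=[((0, 0), 20.0)],
    danger_spots=[(100, 100)],
)
```

Player "a" quit in a safe zone and "c" quit far from anything dangerous, so only "b" is flagged for a closer look.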
This backwards reasoning (from expected responses, to needed analysis, to required data) tells you what telemetry to collect in the first place. Teams often discover gaps and go back to add new event types.
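One way to make that chain explicit is to write it down as data and derive the event list from it. The plan entries and event names below are purely illustrative, not taken from any of the projects discussed.

```python
# Each planned response names the analysis that would trigger it and the
# event types that analysis needs. All entries are illustrative.
PLAN = [
    {
        "response": "add a checkpoint or soften the encounter",
        "analysis": "quits near high-damage areas",
        "events": {"quit", "damage_taken", "movement"},
    },
    {
        "response": "rebalance a class against a boss",
        "analysis": "deaths per class in boss areas",
        "events": {"death", "character_class"},
    },
    {
        "response": "adjust XP rewards per build",
        "analysis": "XP gain compared across builds",
        "events": {"xp_gained", "character_build"},
    },
]


def required_events(plan):
    """Union of every event type that some planned analysis depends on."""
    required = set()
    for entry in plan:
        required |= entry["events"]
    return required


def missing_events(plan, collected):
    """Event types the plan needs that are not being collected yet."""
    return required_events(plan) - set(collected)


currently_collected = {"quit", "death", "damage_taken", "movement", "xp_gained"}
gaps = sorted(missing_events(PLAN, currently_collected))
```

Here `gaps` lists the event types that would have to be added before the class and build comparisons could be run.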
Practical Examples
Pillars of Eternity: Skill Usage Tracking
On Pillars of Eternity, they recorded when players used non-combat skills like Athletics. They wanted to know two things: are all skills used roughly equally, and are they used throughout the game? They found some skills were used more than others, and some were "front-loaded": used early but rarely later.
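Both questions reduce to counting uses per skill and checking how many fall early in the game. In this sketch each use is tagged with a progress value between 0 and 1; the one-third cutoff, the 75% threshold, and the sample data are assumptions, and the skill names are placeholders rather than real results.

```python
from collections import defaultdict


def skill_usage_profile(uses, early_cutoff=1 / 3):
    """Return {skill: (total_uses, fraction_used_before_cutoff)}.

    uses: iterable of (skill, progress) with progress in [0, 1].
    """
    totals = defaultdict(int)
    early = defaultdict(int)
    for skill, progress in uses:
        totals[skill] += 1
        if progress < early_cutoff:
            early[skill] += 1
    return {skill: (totals[skill], early[skill] / totals[skill]) for skill in totals}


def front_loaded(profile, min_early_fraction=0.75):
    """Skills whose uses are concentrated in the early part of the game."""
    return sorted(
        skill for skill, (_, fraction) in profile.items()
        if fraction >= min_early_fraction
    )


# Made-up sample: skill_a is used mostly early, skill_b throughout.
uses = [
    ("skill_a", 0.05), ("skill_a", 0.10), ("skill_a", 0.20), ("skill_a", 0.90),
    ("skill_b", 0.10), ("skill_b", 0.50), ("skill_b", 0.80), ("skill_b", 0.95),
]
profile = skill_usage_profile(uses)
flagged_skills = front_loaded(profile)
```

Comparing the totals answers the first question (equal use), and the early fraction answers the second (use throughout the game).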
Fallout: The Energy Weapons Regret
Tim wishes he'd had telemetry on Fallout. They had an Energy Weapons skill, but no energy weapons dropped in the first third of the game. The skill was completely unusable for a huge portion of the playthrough. Telemetry would have immediately surfaced this problem.
Requirements
Tim emphasizes that telemetry isn't something you bolt on casually. You really need a dedicated telemetry team β people who build and maintain the server, design the packet schemas, write the analysis tools, and generate reports. It's hard to staff, which is why he only used it to a limited extent on some projects.
References
- Tim Cain. YouTube video. https://www.youtube.com/watch?v=d208Uv-aZYU