Saturday, 8 May 2010

A Heated Debate Over Balance: Part One (Rebalancing)

I have this bitter, metallic taste in my mouth, like I've just bitten down on my tongue after being slapped in the face... or drinking meths.

I'm angry. They've nerfed the Pyro and now expect me to be happy with this buff?

Let me back up a bit.

For those of you not aware, there has been some light and fury about the recent changes to the Pyro class in Team Fortress 2. Valve has chosen to decrease the direct damage and afterburn (damage over time) effects of its primary weapon - the Flamethrower - in return for improving the effects and frequency of use of its alternate fire, which allows you to push enemy players and reflect projectiles using a puff of air (some irate players now want the class to be called the Aero). This was released in a patch to the game about a week ago, without comment as to why, as part of a number of changes in this constantly evolving game.

Team Fortress 2 is perhaps unique in commercial games - outside of the MMORPG genre - in that its game balance has received constant tweaks since release. This isn't just a matter of releasing new content (multiplayer maps), but significant changes to the overall balance of classes and, with the class updates, significant changes to the way some classes play. It is a bold experiment, permitted by the digital distribution technology Steam, and a pioneering glimpse at a way some games can continue to be relevant years after release.

Team Fortress 2 is not the first game to receive a dedicated following - Starcraft, Super Street Fighter 2 Turbo, and Valve's own Counter-Strike are the first three examples that spring to mind. What two of these games - Starcraft and Street Fighter - have in common is that their longevity has been maintained through the incredible balance between the moving parts in the system.

Unfortunately balance isn't something you can just design in: it is as much a matter of incredible luck as it is of deliberate design. If you consider all of the computer games ever made, there is only an extremely limited number which are still being played more than 15 years on from their inception (this holds less true now, with the recent trend in retrogaming). And it is entirely possible that game designers may not even understand what it is that made their game successful. See this analysis on some of the micro issues in Starcraft 2 to get a feel for the level of accidental design detail that contributes towards game balance.

But there are some principles of game design that hold true when it comes to balance - worth reiterating here:

1. Always balance upwards

The most powerful character in a game is almost always the most fun - especially when you are playing that character and your opponent isn't. As a result, you should use this as a yardstick for where you buff your other choices up to. But be careful not to overshoot when improving them, because:

2. It is much harder to scale down and take away than to scale up and add

Much like cooking, it is much harder to take ingredients away from a game than it is to add them. Particularly once a game is released, human nature is ingrained against removing functionality and decreasing ability. Programmers have written code, art assets have been designed, people have played with the ability at a particular level. There is nothing like the cry of nerf to rally the forums. But if you are going to take away functionality, be sure to:

3. Change one variable rather than scaling across the board

If you think about balancing a game by playing it, you are really conducting a series of experiments and seeing the results. And science works best when you can vary one value and leave the other variables as fixed as possible. So if you have a character which does three things too well, leave two of them at the level that is overpowered, and reduce the third and then see if that leaves the character balanced against the other choices. Of course, this is easiest when you:

4. Keep your variables independent

The Mutalisk in Starcraft was incredibly hard to balance because it uses the same attack to attack both air and ground units. During testing and post-release patching, if it was too effective against air units, reducing the attack strength would make it ineffective against ground units. And vice versa. Creating two separate attacks would have allowed the game designers to better balance this unit - at the cost of increased art and animation assets required to do so.

5. Statistics and mathematical analysis are incredibly important

If you consider playing as a series of experiments, then the other tools of scientific analysis, statistics and mathematical modelling, are obviously critical to getting an understanding of how the game works. Unangband balances its monsters by modelling a simulation of 10 rounds of combat and seeing how much damage the monster inflicts. This model is as simple as taking the highest value of (chance of attack * damage inflicted * 'typical player resistance' * 10) across each attack the monster is capable of, but it works remarkably effectively. And I don't have to get this exactly right because Unangband, like many games:

6. Provide plenty of resets

A reset is a way for either player to recover from a situation by using up an exhaustible resource. Guilty Gear has an incredible cast of characters with bizarre attacks but also has a number of ways of recovering from the effects and traps of any of these attacks. The resets act as a negative feedback mechanism to dampen the risk of any positive feedback interactions between abilities getting out of control. I've written more about resets and traps in my series on designing a magic system.

7. Intuition and fun are the most important mechanisms for balance

Balancing a game is like finding the most effective pay out on an almost infinitely long multi-armed slot machine. You have so many variables to consider that a complete mathematical analysis of most games is impossible. Luckily there is one computer capable of solving these kinds of problems, and that's the one between your ears. But experience is a requirement here. You need to play lots of other games - especially games with depth to them - to appreciate how to balance your own.

8. Ignore (almost) all player feedback when it comes to 'too powerful' abilities

There are an incredible number of people who are prepared to label an ability in a game as being too powerful or cheap, without understanding the ramifications of that ability at high level play. So as a rule you should ignore them. The exceptions are those players who are at the top of their game. You are then permitted to listen respectfully, and then ignore their feedback unless there is other evidence to back up their anecdotes.

9. Your players will figure out the numbers so make the maths available to them

One of the most useful things the phenomenon of crowd sourcing has produced is the endless supply of game wikis full of detailed statistics about in-game play. If your game becomes popular, the players will figure out the numbers, so you should make them available. Surprisingly, this happens infrequently outside of the strategy game genre - where tables of numbers are a typical characteristic of in-game documentation.
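The damage model described under principle 5 can be sketched in a few lines. This is only an illustration of the formula as stated above, not Unangband's actual code: the Attack class, its field names, the function names and the example numbers are all hypothetical, and 'typical player resistance' is assumed here to be the fraction of damage a typical player still takes.

```python
from dataclasses import dataclass

ROUNDS = 10  # the article's model simulates 10 rounds of combat


@dataclass
class Attack:
    """One monster attack. All fields are illustrative assumptions."""
    name: str
    hit_chance: float     # chance the attack is used and lands, 0.0 to 1.0
    damage: float         # damage inflicted when it lands
    resist_factor: float  # fraction of damage a typical player still takes


def expected_damage(attack: Attack, rounds: int = ROUNDS) -> float:
    """chance of attack * damage inflicted * typical resistance * rounds."""
    return attack.hit_chance * attack.damage * attack.resist_factor * rounds


def monster_threat(attacks: list[Attack], rounds: int = ROUNDS) -> float:
    """Take the highest expected damage over every attack the monster has."""
    return max((expected_damage(a, rounds) for a in attacks), default=0.0)


# Hypothetical monster with a reliable bite and a rare, partly resisted breath.
bite = Attack("bite", hit_chance=0.75, damage=8.0, resist_factor=1.0)
breath = Attack("fire breath", hit_chance=0.25, damage=40.0, resist_factor=0.5)

print(monster_threat([bite, breath]))  # 60.0, the bite dominates
```

Taking the maximum rather than the sum assumes the monster effectively leans on its single best attack, which is a crude simplification but matches the "highest of" wording above.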

So how do the above principles help when it comes to discussing the Pyro changes that Valve have made?

The Pyro class was one of the earliest to receive a class update from Valve - only the Medic preceded it. But balancing the Pyro was immediately problematic. Within a few weeks of the class update, the new primary weapon, the Backburner, got its first nerf. Previously it had provided +50 health (and glorious were the days on the servers when the Pyro update came out). Then Valve added back to the Pyro's flamethrower some of the damage drop off that they had removed with the Pyro update.

Since then, the other updates have added various ways of putting out the Pyro's afterburn effect: the Sniper's Jarate, the Pyro's airpuff, the Heavy's Sandvich, the Spy's Dead Ringer (although the cloaked Spy can be immediately set on fire again). These have been both thematic improvements and a way for Valve to provide a number of reset options to the afterburn effect.

Discussion of the Pyro on the forums prior to the most recent changes has revolved around two main areas of contention: the Pyro still being underpowered, and criticism of the W+M1 style of play of so-called noobs - those people who play a Pyro by running forward with the flamethrower firing and not otherwise thinking. Neither of these suggests serious balance issues, and the criticism of 'cheapness' can especially be ignored.

Valve themselves have acknowledged issues with the Pyro at 'high-level' play. But Valve's statements to this effect have been contradictory in the past and may not necessarily point to the heart of the problem - they initially stated that the Pyro was intended for close combat, but the flamethrower has always had weak close-in damage - now doubly so.

So are there any balance problems for the Pyro that we can establish empirically, without anecdotal evidence or hearsay? Looking at the information made available by Valve and third parties, I can see at least one imbalance, and one paradox, worth discussing further.

Firstly the choice between the Flamethrower and Backburner is heavily skewed in favour of the Flamethrower. In fact, no other class has such a split between primary weapons (it is worth stressing that all the statistics on the main page are only from players who have both choices available). And how to rebalance? Balance up, of course. This suggests the Backburner may need to be improved against the Flamethrower somehow.

More interestingly, the choice between Shotgun and Flaregun is almost a perfect split. So Valve's buff to allow the Flaregun to do minicrits against opponents on fire has been received well, which we'll use here as a proxy for the two choices being balanced against each other.

Finally the melee weapon selection is biased heavily against the default axe, and towards the Axtinguisher. With the Homewrecker, there is the possibility of novelty or bragging rights factor that may influence the frequency it appears in the short term. Again, we should balance the default Fireaxe up if there is a balance problem - but we don't have the statistics to suggest that this is the case.

From checking the preferred weapons we can see that no other class is as lopsided in its choice of primary and melee weapons, with one exception: the Soldier's Equalizer is a straight upgrade to the Pickaxe and so there is no reason ever not to use it if it is available. So we have a clear imbalance between some of the choices that the Pyro has available. This does not necessarily mean that there is no situational advantage to the weapons which are least selected, but the situational advantage must occur rarely. As always, the suggestion is to balance the unfavoured choices up so that they become more preferred. But this is not itself an indication of whether the Pyro is a balanced class, and so we do not have evidence yet that the class has been nerfed.

Looking at the class distribution choices, the Pyro falls towards the bottom of the range of those chosen. But interestingly, despite the class being relatively unpopular, the flamethrower's share of total damage inflicted is the second highest, behind the grenade launcher. And it has the second longest average range of any weapon (including the Sniper Rifle), behind only the minigun. But we know that the flamethrower damage output is incredibly low. There is definitely a paradox here, and one caused by the aggregation of flamethrower damage and afterburn damage in the same statistics.

And this is the main problem with the publicly available statistics. There's not enough detail to have a viable discussion about the Pyro 'nerf'. In fact, the statistics Valve releases aggregate all the weapon statistics together, so it is not even possible to say which weapon choice does more damage (you can see this from the Sniper SMG - actually the Jarate - being the fourth highest source of damage, and the Fireaxe - actually the Axtinguisher - having such a high critical rate). It is not clear from the statistics page but this appears to be a limitation of the way TF2 collects stats.

Without Valve coming out and providing more detailed information about the decision making behind the Pyro changes, the outrage of the forums will continue to boil and players will continue to come up with suggested buffs and nerfs without a deeper understanding of the process that went into making these decisions. Mathematical modelling of the Pyro damage drop off isn't enough to explain the paradox of the flamethrower having the second highest percentage damage output.

So my request is not to buff the Pyro back - it's to improve the communication process and statistics collection. That way we can have a much more useful discussion about the changes to one of my favourite classes.

For the record though: I'd like to see the Pyro back to what they were on the release day of their class update... just for a week, or as a mod. But my biggest problem isn't a Pyro's damage output, it is their longevity. They need something to survive a few seconds longer: perhaps the Backburner should trade increased protection against bullets for increased knockback (allowing an onrushing Pyro to be pushed back by a Heavy's gun) and the Homewrecker should reduce knockback when wielded.

For those of you interested in a more in-depth discussion of balance in games, I recommend David Sirlin's excellent four part guide on game balance, as well as his website in general.

I've also written a follow-up part two to this article.


The Mad Tinkerer said...

I was mad about the Pyro flamethrower nerf too until I realised that the Backburner was EXACTLY the same.

While the balance change may be slightly inconvenient if you don't have the Backburner yet, it's ultimately balanced because there's a great alternative if you don't care about airblast. Like me. (I always use Backburner, possibly because early on I figured out the tactics needed to use it effectively and don't need no stinking health buff.) I now see the change as encouraging ninja-style Pyros.

Also: I REALLY want a Homewrecker now. A melee weapon that makes you better at wrecking enemy buildings AND countering sappers? Frickin' SWEET.

The Mad Tinkerer said...

"This suggests the backburner may need to be improved against the flamethrower some how."

Only if you widen the area that counts for instant-crits: most other buffs will definitely unbalance it. (Currently, you need a more precise angle than a spy's knife, but at the same time you don't have to be nearly as close.) If you up the damage of the backburner it will make things way too easy. Trust me on this.

Andrew Doull said...

The Backburner has been nerfed: it now only has a 4 second afterburn.

But the balance between the Backburner and Flamethrower is better now, in that the backburner is now an effective close range anti-Pyro weapon - as I mentioned, I don't think the issue is with damage output.

The biggest change for me, surprisingly, is I almost never switch to the Axtinguisher any more. 4 seconds is just too little time for me to reliably get a weapon switch and one or two hits in. Especially since the most likely time I'll be using it is immediately following a medi-heavy burn, where I ideally want to include a puff to separate the two.

I'm feeling a little happier about reflected strikes now. They are still useless at short range, but at mid range, particularly with about 2-4 soldiers/demos rushing you, you have a good chance to reflect a couple of strikes successfully.

I suggested improved resistance to bullets specifically to force a hard choice between using rockets and switching to shotgun for soldiers who haven't clearly seen which item the pyro is carrying. This also applies to Pyro shotgun use - Backburner would then be an all round stronger Pyro vs Pyro weapon.

The Mad Tinkerer said...

Yeah, actually the Axtinguisher is a problem: I have NEVER used it successfully in a real match (never managed to switch to it while someone was on fire, and never managed to actually kill anyone with it who wasn't trying to help me with achievements). I've only ever seen it used successfully in a real match when one Pyro was specifically igniting someone else for the sole purpose of their buddy using the Axtinguisher.

Which is why I really want the Homewrecker. Both of the other melee weapons are essentially useless given my play style (though I've used most other classes' melee weapons effectively). Oh well, I'm sure it'll pop up in a random drop soon.

Bantam said...

I strongly disagree that you should balance upwards, in a competitive game at least. Balancing towards the mean is a lot easier and a lot less work than moving balance towards the stronger end of the scale. 'Power creep' becomes a slippery slope whereby things move further and further away from the original vision.
Yes, nerfs are very unpopular, but making one nerf is less prone to unsettling overall balance and creating a new 'imba' than buffing a dozen other things to match it. Buffing has the same effect on relative balance for the strongest, 'most fun' class as judicious use of buffing and nerfing, yet comes with a host of problems.

In my opinion, one of the most important things about balance is making small, incremental changes rather than big sweeping edits to the overall balance.
I like balance changes that promote skilled play too. Gameplay mechanics that are difficult to pull off should be strong or flexible, giving choice to experienced players and helping to stratify the good and bad players.

In terms of the pyro balance changes, I never felt like it was a class that needed a nerf. The balance issue is obviously between the two weapons. I suspect the vast majority of people would always take the flamethrower because it gave you almost as much killing power as the backburner but also that additional utility. I personally think a backburner buff might have been the better choice in this case... maybe a tad more direct damage? I've always seen the pyro as a fun, rather than competitive class so I'd agree that in this case nerfs were unneeded.

nihilocrat said...

That Starcraft 2 article makes me feel simultaneously hopeful and sad. Hopeful that Blizzard won't make as much of a micro-oriented game. Sad that people actually LIKE micro and throw a fit when it's made weaker. Go play an FPS if you want an action game, jesus. Get the fuck out of my strategy.