The Game Design Forum

Part Two: Randomness in Diablo 2

Students enter game design programs with many prejudices about game design which need to be amended. One of the most common prejudices I have encountered among students is the notion that all randomness in game design is more or less the same. This idea springs from a lack of critical examination. Often, I’ve asked students about these small fireballs which periodically fly through the screen in Super Mario World.

Students tend to say that these fireballs come at random. I then ask, “So they can come at any time or not come at all?” All students quickly figure out that this isn’t the case. Even brief observation reveals that the fireballs come at a regular rate. I then ask more questions about fireball speed, fireball quantity, etc. Eventually, the students figure out that certain things are true of the fireball and the randomness to which it is subject. The truth is that only fireball position is variable. The quantity, frequency, speed and size of the fireballs are not random at all. Indeed, even their position is restricted to about half the screen. With a little bit of critical examination, young designers learn that randomness in game design is actually much more nuanced than a cursory glance might tell them. This statement is especially true of Diablo 2. Not only is the use of randomness nuanced and complex, there are also many different kinds of randomness built into the systems which make up the game. This part of the book examines the different kinds of randomness built into the systems of Diablo 2.

Types of Randomness

Diablo 2 makes for a great lesson in the various kinds of randomness that exist in game design because it employs so many of them. Indeed, not only does Diablo 2 exhibit many kinds of randomness, but it also causes many of these random systems to interact with one another in complex ways. By studying Diablo 2, we can learn a lot, generally, about randomness in game design. The first step in this study, naturally, is to identify all of the kinds of randomness that exist in the game and explain how they work. One proviso I want to offer before I begin, however, is that I am only going to offer game-specific definitions of the various kinds of randomness. I am certain that there are mathematics-specific or science-specific definitions of the things I am about to describe. I don’t know them and I don’t need to because I’m not really trying to define widely-applicable statistical concepts. For the purposes of this book, I’m only interested in creating some simple frameworks for understanding randomness in game design. Many readers who have statistical training might object to my characterizations of randomness as being too simple. Moreover, many of the things I describe are just different ways of looking at the same basic concept. All of those objections are totally true! But many of the descriptions that follow are still useful for thinking about videogame design, and hopefully do not distort or dissemble about the deeper concepts they represent.

Randomness within a Range

The most obvious and most traditional type of randomness in Diablo 2 is what I call randomness within a range, or “RR” for short. Randomness within a range is what most people have in mind when they think about randomness. For example, if you ask someone to think of a random number, they’ll probably think of an integer between one and ten, or maybe one and 100. That’s an assumption English speakers tend to share. A truly random number could be anything from seven to pi squared to 41,286,317.91—or literally anything else! Like most people, however, Diablo uses randomness within a certain range to control the number of possibilities.

Diablo 2 uses RR copiously: the damage done by almost every attack made by a player character or enemy exists in RR format in more than one way. This has a profound effect on the way that Diablo 2 operates because of the difference between how people perceive randomness and how the game displays it. For an RPG, Diablo 2 is very fast and very intense, and players and enemies can easily use thousands of attacks per hour. Thus, the player’s perception of the power of an attack depends on the average damage of that attack (or set of attacks). When adopting a new attack or weapon, players will very quickly notice if the average damage is significantly higher, because they chew through hundreds of enemies at a faster rate. But large increases (or decreases) in damage are not the problem. The problem lies in marginal increases and the way that the game’s UI displays RR-format damage.

The character screen displays damage in RR format rather than in average format. It’s not hard to calculate the average of two numbers, but Diablo 2 isn’t the kind of game where players are meant to keep a scratchpad or calculator handy. Some RPGs are like that. In 5th Edition D&D, each round of combat takes six in-game seconds, but probably takes 90 seconds to five minutes to actually play out in real time, depending on how many players there are. There’s plenty of time during those rounds to do a little math and sort out which weapon or skill to use. If Diablo 2 were played that way, players would never make it out of the first dungeon—there are just too many enemies to fight and too many weapons and attacks to choose from.

How big of a problem does the lack of an average figure present to players who aren’t slowing down the game to do the basic math when presented with alternatives? Let’s say a Diablo 2 player is presented with two magic swords to pick from. The swords have the exact same base stats, but they have different magical modifiers. The first sword has a magical modifier that gives it +10 to maximum damage. The second sword has a magical modifier that gives it +10 to minimum damage. Which one is better? Over a long enough sample, neither one is better, although minimum damage increases are significantly rarer as affixes. The only tactical advantage for higher minimum damage is when hunting a very specific sort of monster. If the player knows the average HP of an enemy he or she is hunting, and can get the minimum damage of the character’s weapon or attack above it, minimum damage is a little more valuable. Overkilling enemies doesn’t help. That said, there’s no long-term reason why there should be separate benefits to maximum and minimum damage at all. Weapon affixes that augment minimum and maximum damage aren’t a huge part of the game; most of the high-end equipment offers a percentage-based increase. But the principle is still important, because when players select between different skill-based attacks, they get information (and decisions to make based on it) that is just not that useful to them, most of the time.
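The two-sword comparison can be checked with a few lines of arithmetic. This is only a sketch: the base damage of 20-40 and the enemy HP of 30 are invented for illustration, and the damage roll is assumed to be a uniform integer roll, which the text does not confirm.

```python
def average_damage(minimum: int, maximum: int) -> float:
    """Mean of a uniform integer roll between minimum and maximum, inclusive."""
    return (minimum + maximum) / 2


def one_hit_kill_chance(minimum: int, maximum: int, enemy_hp: int) -> float:
    """Share of possible damage rolls that meet or exceed the enemy's HP."""
    rolls = range(minimum, maximum + 1)
    return sum(1 for roll in rolls if roll >= enemy_hp) / len(rolls)


# Hypothetical base sword: 20-40 damage (numbers invented for illustration).
BASE_MIN, BASE_MAX = 20, 40

sword_a = (BASE_MIN, BASE_MAX + 10)  # +10 to maximum damage -> 20-50
sword_b = (BASE_MIN + 10, BASE_MAX)  # +10 to minimum damage -> 30-40

avg_a = average_damage(*sword_a)
avg_b = average_damage(*sword_b)

# Against a hypothetical 30 HP enemy, only the higher minimum guarantees a kill.
kill_a = one_hit_kill_chance(*sword_a, enemy_hp=30)
kill_b = one_hit_kill_chance(*sword_b, enemy_hp=30)
```

Both swords average 35 damage, which is why neither wins over a long sample. The difference only shows up in the threshold case: the sword with the higher minimum always clears 30 HP in one hit, while the other does so on 21 of its 31 possible rolls.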

RR VARIATION AND BATTLEFIELD DYNAMICS

The slightly murky nature of randomness within a range is not without its merits, however, even when it’s at its murkiest. Monster HP also exists within a random range of values, and this range creates some interesting action effects. The Fallen has an average of 209 HP on nightmare difficulty,[33] but it can have anywhere from 131 to 288 HP as well. Because each member of a pack can have different HP, these random variations can have some interesting results when a player uses a multi-target attack. Below I have visualized a group of Fallen before and after a multi-target attack like Glacial Spike.

What’s happened here is that, as the sorceress casts her spells, the monsters begin to close in on her. Some of the monsters with lower HP die, but their HP is random, meaning that the remaining pack of higher-HP monsters takes on an irregular shape. This shape then gives the player a new tactical choice of where to run in order to launch the next few spells. This is one of the better examples of how randomness in RPG systems can lead to action-game consequences. The player isn’t aware of this (because the random damage and random enemy HP are not displayed numerically), but it nevertheless succeeds at forcing the player to make tactical choices in real time.
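A small simulation shows how two overlapping random ranges leave an irregular set of survivors. The HP range is the one quoted above for the Fallen; the spell damage range, the pack size, and the assumption that both rolls are uniform are all hypothetical.

```python
import random

FALLEN_HP_RANGE = (131, 288)  # nightmare-difficulty range quoted in the text


def spawn_pack(size, rng):
    """Give each monster in the pack its own HP roll within the range."""
    low, high = FALLEN_HP_RANGE
    return [rng.randint(low, high) for _ in range(size)]


def apply_area_attack(pack_hp, damage_range, rng):
    """Hit every monster once; dead monsters become None, others keep the rest."""
    low, high = damage_range
    after = []
    for hp in pack_hp:
        remaining = hp - rng.randint(low, high)
        after.append(remaining if remaining > 0 else None)
    return after


rng = random.Random(2)  # fixed seed so the sketch is repeatable
pack = spawn_pack(8, rng)
after = apply_area_attack(pack, (180, 220), rng)  # hypothetical spell damage

# Indices of the monsters still standing; which ones survive changes per seed.
survivors = [index for index, hp in enumerate(after) if hp is not None]
```

With these numbers a monster that rolled the minimum HP always dies and one that rolled the maximum always survives, so every cast leaves a different mix of gaps in the pack.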

BUT THIS ONE GOES UP TO 11

True to roguelike orthodoxy, but often quite frustrating to mainstream audiences, Diablo 2’s best weapons and armor all exhibit RR variation in the affixes they acquire. The best one-handed sword in the game (Azurewrath), for example, can have an increased damage affix of anywhere from 230% to 270%. This can give fits to master-level players who spend hundreds of hours trying to find any version of the item, but who all want the perfect version. Anyone doing the math on the variation will notice, however, that the RR variation can only change Azurewrath’s DPS by about 30 damage, or less than two percent. Not all items are like this. Another sword called The Grandfather, for example, experiences a 100 percentage-point RR variation, and accordingly it can see its DPS change by about 23% based on the stats it drops with.[34] The damage modifiers on most weapons vary less than 50 percentage points, however, and even that level is uncommon except on elite, unique two-handed weapons.

The bigger problem is in the smaller ranges. Items like the unique helm Valkyrie Wing offer affixes like +1-2 to amazon skills. At the highest levels, it’s easy for the player to make up for the loss of 30 DPS, but +skill affixes are as valuable at level 5 as they are at level 85. (Some top-end melee builds do not need such affixes at all, but most classes benefit from them greatly.) Ondal’s Wisdom, a late game staff, is always one of the most valuable caster weapons in the game. It offers 45% faster cast rate, 40-50 energy, and 5% bonus experience; it’s a tremendous item. But it also offers +2-4 to all skills.[35] The high end of that range is worth a lot more than the low end, and players will rightly be frustrated if they get a two rather than a three or a four. We’ll see in part three of this book how those affixes change the overall value of an item, but it suffices to say for now that small RR variation in very important stats is far more important than large-range variation in DPS. This doesn’t mean that players will throw away unique items that don’t have perfect stats. Unique items are rare enough that players can’t throw high-level items away crassly. Unique items also tend to be powerful in several different ways, and some of those ways are bound to be useful to the player. We’ll see a lot more about this aspect of unique items in part three.

RR AND MAP GENERATION

Although its implementation is not nearly as obvious as elsewhere, Diablo 2 also employs RR variation in map construction. The procedural generation of maps is one place where the otherwise thorough documentation of Diablo 2 falls short. (This is probably because the most advanced players of Diablo 2 employ map-hacks, which reveal the entire map and thereby obviate the need for a comprehensive understanding of map generation.) Moreover, the Diablo creators I interviewed were not able to recall every step of their methods perfectly. They did provide me with quite a bit of info on the general process, however. This process will come up again in the “deck of cards” randomness section, but I mention it here because there is a bit of RR variation in it too. The overall process is as follows:

  1. The overall size of the map is determined
  2. The orientation of the map is determined
  3. The zone created is wrapped in a containing wall, which has its own generation algorithm
  4. The major feature of the zone is placed
  5. Rooms and/or doodad objects like trees or houses are added to fill the space (deck of cards randomness, which we’ll examine later)
  6. Enemies are added based on density limits (also deck of cards randomness)

[36][37][38]

It’s not immediately clear how some of these steps are governed by RR, but they are. Map size, for example, is always calculated within a pre-set range. The X and Y axes of a map have minimums and maximums, and the total area of each zone also has a cap. The specifics of each zone aren’t terribly relevant, but some zones are meant to be bigger than others. Acts one and two have large, rectangular outdoor zones punctuated by dungeons which are narrow and more corridor-like. Act three, by contrast, has long, snaking outdoor zones which tend to follow the path of an impossibly meandering river. There’s not a predetermined length, but rather an RR guideline which determines the map length.
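The idea of per-axis ranges plus an area cap could be sketched as follows. Every number here is invented, and re-rolling until the area fits is my own assumption; the designers did not describe how the cap was enforced.

```python
import random


def roll_map_size(x_range, y_range, max_area, rng, max_attempts=1000):
    """Roll width and height within their ranges, re-rolling past the area cap."""
    for _ in range(max_attempts):
        width = rng.randint(*x_range)
        height = rng.randint(*y_range)
        if width * height <= max_area:
            return width, height
    raise ValueError("no size within the ranges satisfies the area cap")


# Hypothetical zone: each axis between 40 and 120 units, area capped at 8000.
rng = random.Random(1)
width, height = roll_map_size((40, 120), (40, 120), 8000, rng)
```

A long, snaking zone like those in Act three would simply use a wide range on one axis and a narrow range on the other.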

In the fourth step of the map generation process, the algorithm places a zone’s main feature. Usually this feature is the entrance to a dungeon, like the Cairn Stones in Stony Field, or a portal into a demonic realm in the Act five maps. Sometimes the object is simply a quest objective too, like the Tree of Inifuss in the Dark Wood. But regardless, the designers made some guidelines to prevent key map objects from colliding with other objects that would prevent access.

In most cases, this object cannot be placed next to the edge of the zone. To accomplish this, the designers employed a range that looks something like this:

OBJECT LOCATION

XCoordinate = (X-AxisMinimum + 20 yards) to (X-AxisMaximum - 20 yards)

YCoordinate = (Y-AxisMinimum + 20 yards) to (Y-AxisMaximum - 20 yards)

(Nobody had the exact formulas, but this example reflects the philosophy which the designers did remember.) Thus, the most important object in the zone is algorithmically separated from being next to a large barrier object which might block or glitch it.
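The placement range above translates directly into code. Like the formula itself, this is a sketch of the philosophy rather than the game's actual routine; the function name, the zone dimensions, and the use of a uniform roll are assumptions.

```python
import random

MARGIN_YARDS = 20  # buffer from the example range in the text


def place_main_feature(x_min, x_max, y_min, y_max, rng, margin=MARGIN_YARDS):
    """Pick a point at least `margin` yards away from every edge of the zone."""
    if x_max - x_min < 2 * margin or y_max - y_min < 2 * margin:
        raise ValueError("zone is too small for the requested margin")
    x = rng.uniform(x_min + margin, x_max - margin)
    y = rng.uniform(y_min + margin, y_max - margin)
    return x, y


# Hypothetical 200 x 150 yard zone.
rng = random.Random(7)
feature_x, feature_y = place_main_feature(0, 200, 0, 150, rng)
```

The feature can land anywhere in the interior, so the zone looks different on every visit, but it can never touch the containing wall.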

KEY LESSONS ABOUT RANDOMNESS IN A RANGE

  • The most important thing to know about RR variation is that players will perceive the average of your range, even if they don’t know what that average is. Just tell players what the average is.
  • Small ranges (1-2) are more volatile than large ranges (1-100) when you’re dealing with important stats like DPS or skill modifiers, but large ranges on items can leave some players bitter over significant reductions in power.
  • RR variation is a great way to set bounds on objects in the map without setting a fixed location.

Slice-of-Pie Randomness

Slice-of-pie (SOP) randomness is the kind of randomness that occurs when there are several random options, any of which could be the outcome of a given roll. Rather than give an example from everyday life, I’m going to dive right into the Diablo-specific applications of it. This type of randomness governs most of the steps in loot generation, including the first step. When an enemy dies, the game looks at that monster’s level and then pulls from a list of items of that level or lower. This means that the monster could drop a level 35 item, a level 6 item, or nothing at all—but not with equal probability.

Each slice of the pie is a different size. Why use a pie-shaped visualization, though? There are three properties of this kind of randomness which are best understood when the entire process is viewed as selecting one piece of a pie. The first property is that, when the game selects a piece of the pie, there’s no null result. One of the slices must be taken (even though one of them is “no drop”), and this is true of almost every enemy death in the game. The second property of this kind of randomness is that the more slices of pie there are, the lower the odds are that the player will get any one of them. For example, if we add more options for the game to select from when loot is dropping, the chances of selecting any one of them must go down.
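The first two properties can be seen in a minimal weighted pick. The slice names and weights below are invented for illustration and are not the game's real drop tables.

```python
import random


def slice_probabilities(weights):
    """Convert raw slice weights into probabilities that sum to 1."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def pick_slice(weights, rng):
    """Always returns exactly one slice; 'no drop' is just another slice."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical pie: weights are made up.
pie = {"no drop": 60, "junk": 25, "gold": 10, "equipment": 5}

# Adding a slice without touching the old weights shrinks every old slice.
bigger_pie = {**pie, "new equipment": 5}

before = slice_probabilities(pie)["equipment"]
after = slice_probabilities(bigger_pie)["equipment"]
```

Here the "equipment" slice falls from 5 in 100 to 5 in 105 even though its own weight never changed, which is the second property in action.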

This becomes an important dynamic in the late game, because higher-level enemies don’t just drop high-level loot, they can also drop most of the loot below their level, too. Or, in other words, the Diablo 2 designers are always slicing the pie into smaller and smaller pieces, rather than simply changing the contents of each piece.

The third property that makes the SOP randomness metaphor useful is that, when a slice of the pie grows or shrinks, other pieces of pie have to change as well. Why is this important to know? In Diablo 2, the player has the power to change the sizes of the slices of pie. One of the most famous mechanics in Diablo 2 is its magic-find system, through which the player can improve his or her odds of getting better items. Not only can the player increase the size of the “rare” and “unique” pieces of pie, but by doing so they actually make the “normal” item slices (which are often useless) much smaller. We’ll cover that portion of the system in detail, but it’s also important to know that it’s not the only place where the player can influence the size or number of slices. Indeed, the player can also change how many items drop, although the process requires other players and has some drawbacks.
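The third property is easiest to see when the pie is written as shares that must total 1. In this simplified sketch the shares and multipliers are invented, and the "normal" slice is assumed to absorb whatever the boosted slices gain.

```python
def grow_slices(shares, multipliers):
    """Scale the named slices, then let 'normal' absorb the difference."""
    boosted = {
        name: share * multipliers.get(name, 1.0)
        for name, share in shares.items()
        if name != "normal"
    }
    special_total = sum(boosted.values())
    if special_total > 1.0:
        raise ValueError("boosted slices no longer fit in the pie")
    boosted["normal"] = 1.0 - special_total
    return boosted


# Hypothetical quality pie: these shares are made up.
QUALITY_SHARES = {"normal": 0.90, "magic": 0.08, "rare": 0.015, "unique": 0.005}

# Doubling the two rarest slices has to come out of another slice.
boosted = grow_slices(QUALITY_SHARES, {"rare": 2.0, "unique": 2.0})
```

The rare and unique slices double, the magic slice is untouched, and the normal slice drops from 90% to 88% to pay for it.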

CHANGING THE SIZE AND NUMBER OF SLICES

The first thing that happens when items are being generated from corpses in Diablo 2 is a check that determines if any items drop at all. This is one of the easier places to start looking at how Diablo 2 generates items (and implements SOP randomness). When an enemy dies, the game rolls to see if an item drops. The no-drop slice of the pie is based on an area-wide variable rather than something in the monster itself. When the player is alone, the standard rate of “no drops” is about 62%. As more players enter the game and/or area where the current player is fighting, the no-drop rate decreases.

[39]

Nothing within the game’s UI explains that a greater number of players will see greater amounts of loot, although it makes intuitive sense. There needs to be more loot when there are more players so that everyone can have some! And, just as players will notice that having more players around results in greater amounts of EXP filling up their level bar, players will notice the greater amount of loot through what they see on the screen. Diablo 2 is infamous for all of its “junk” loot that nobody wants—low-level common items or broken versions of gear. Even though this gear is almost always useless, it works well as an indicator of the current local drop rate.

The no-drop calculation is only a part of the first step, and there are several SOP rolls which lead the game to its choice of item. One of the important properties of SOP randomness is that, as one slice of the pie shrinks, the others have to grow. So what is growing when the no-drop slice shrinks? The other selections in the first step are large meta-tables—tables which contain other tables. In his excellent guide on magic find mechanics, Fendriradramelk gives this example of meta-tables which can drop from enemies. When a player kills a skeleton on hell difficulty, it can drop the following.

[40]

The meta-table descriptions are fairly accurate to what they contain. “Act 1 (H) Junk” drops junk and “Act 1 (H) Good” will drop fairly good items, relative to the level of the monster that drops them. “Act 1 (H) Equip A” is the one that players are really looking for, however; it has the useful gear that players need to improve their character. Unfortunately, the player has no influence over which one of these slices the game randomly chooses. More characters means a greater chance of loot dropping, but the newest and best loot doesn’t replace the no-drop slice of pie at a 1:1 ratio.
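A meta-table is a pie whose slices can themselves be pies, so the roll repeats until it lands on something that is not a table. The table names below echo the ones in the text, but the top-level table name, every weight, and every item inside the tables are invented for illustration.

```python
import random

# Hypothetical tables: each entry is (slice name, weight). All numbers invented.
TABLES = {
    "Skeleton (H)": [
        ("no drop", 60),
        ("gold", 15),
        ("Act 1 (H) Junk", 15),
        ("Act 1 (H) Good", 2),
        ("Act 1 (H) Equip A", 8),
    ],
    "Act 1 (H) Junk": [("potion", 3), ("arrows", 1)],
    "Act 1 (H) Good": [("gem", 1), ("jewelry", 1)],
    "Act 1 (H) Equip A": [("weapon", 1), ("armor", 1)],
}


def roll(table_name, rng):
    """Pick one slice; if the slice is itself a table, roll again inside it."""
    entries = TABLES[table_name]
    names = [name for name, _ in entries]
    weights = [weight for _, weight in entries]
    choice = rng.choices(names, weights=weights, k=1)[0]
    if choice in TABLES:
        return roll(choice, rng)
    return choice


rng = random.Random(3)
drops = [roll("Skeleton (H)", rng) for _ in range(10)]
```

Because each level of the chain is its own SOP roll, the chance of a specific item is the product of the slice sizes along the path that leads to it.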

As the monsters increase in level, new slices are added to the pie in the form of new item tables, but this actually causes quite a few problems for players who are hunting for specific pieces of gear. As the second property of SOP randomness states, every new slice makes every other slice smaller. Moreover, the newest items are always added as the smallest slices. This is intentional; high-end items become extremely powerful on the later difficulties, and the designers have to maintain the game balance and give the players something to work towards. But there are other bottlenecks in the SOP randomness method that appear to be unintentional. These problems tend to arise from the unequal distribution of items in different treasure classes. Below is a pie graph which displays just the set and unique items of treasure class (TC) 30, which drops from enemies around level 40.

[41]

Once the item selection process reaches this step, there’s a reasonable chance that a player will get the item they want, especially after multiple runs. But look at TC 51:

[42]

The abundance of treasure in this treasure class means that even if the player manages to get a roll which selects TC 51, the likelihood of getting the actual item they want is significantly lower because there are so many slices of pie. This isn’t the most common problem, but it does affect the player’s chances of finding certain items in a way that probably wasn’t intentional. The designers definitely wanted to minimize the drops of elite gear and did so successfully. But whether or not they wanted to minimize the chances of any one item in TC 51 is doubtful, and probably an oversight.

ECONOMY CRUNCH

This problem of too many slices is magnified at the highest levels. As the game adds more and more slices to the pie of TCs that can drop, the chances of getting any individual piece of gear drop precipitously. For mid and lower-level gear which still has high-level utility (some of which we’ll see in part three), the player can eliminate some of these slices by targeting lower-level monsters. But for elite gear, the player has no other option than to target monsters with the greatest number of slices in the hope that the elite slice will be chosen. To get a sense of how much less these items drop than items in earlier tiers, I have created a visualization of the drop rates of each successive TC.

[43]

What you're seeing above is the rate at which a TC drops from the monster most likely to drop it. For example, Griswold is the monster most likely to drop TC 6, and he drops an item from it roughly 25% of the time per pick when he dies on normal difficulty. Baal is the monster most likely to drop TC 87, but he only drops it 0.1% of the times he dies. (Note that I have multiplied by the standard number of drops that a monster has, which affects the final odds. Normal monsters have one or two drop slots, but boss monsters tend to drop four or more items if the drop slots don’t have a no-drop result. Baal isn't the most likely to drop TC 87 per drop slot, but he is the most likely to drop it because he has so many slots.)

These drop rates mean that a player will only see, on average, one TC 87 item per one thousand times they kill Baal. And that's assuming that every single one of those Baal runs is completely full so that the no-drop rate is negligible; otherwise the odds are even worse! (Note that these are average odds. In my simulations of Baal runs, I routinely ran into stretches where a TC 87 item did not drop for two or three thousand kills in a row, something plenty of players have experienced.) But if every game has to be full in order to achieve these odds, then there are going to be eight players competing for the same loot. Diablo 3 famously solved this problem by generating loot for each player separately, but we're studying the game that the designers made, not the one they wish they had made. The best items in Diablo 2 are rare, and players compete fiendishly for them. For his part, David Brevik said that he thought many items were a little too rare, especially the highest TCs and the rarest runes. He said he actually appreciated some of the events which Diablo 3 employed to increase drop rates for a limited time and/or under certain conditions.[44] In spite of all this, it's not as though Diablo 2 is unplayable because of this extreme difficulty. In part three of this book, we’re going to see how even moderately powerful gear is still very useful, especially when part of a set. But we’ve still got a few forms of randomness to examine before we get there.
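The long dry stretches follow directly from the 0.1% figure, as a short calculation shows. The only assumption is that each kill is an independent roll with the same chance.

```python
P_TC87 = 0.001  # the 0.1% per-kill chance quoted in the text


def expected_kills(drop_chance: float) -> float:
    """Average number of kills needed to see one drop."""
    return 1 / drop_chance


def dry_streak_probability(drop_chance: float, kills: int) -> float:
    """Chance that a given run of `kills` kills produces no drop at all."""
    return (1 - drop_chance) ** kills


average = expected_kills(P_TC87)
dry_1000 = dry_streak_probability(P_TC87, 1000)
dry_2000 = dry_streak_probability(P_TC87, 2000)
dry_3000 = dry_streak_probability(P_TC87, 3000)
```

Under these assumptions, roughly 37% of thousand-kill stretches produce nothing, about 13.5% of two-thousand-kill stretches do, and about 5% of three-thousand-kill stretches do. An average of one drop per thousand kills is therefore fully compatible with the droughts described above.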

MAGIC FIND AND SOP RANDOMNESS

The last step that Diablo 2 performs when generating items is to determine the quality of the item—whether the item is normal, magic, rare, set or unique. This is essentially the last pie in a long line of pies, but it’s also the one the player has the most control over. The largest slice of the pie is normal items, but every enemy has an inherent chance of dropping a magic item. When the player equips gear with "% Better Chance of Getting Magic Item," they augment the drop rate from those monsters. There are two important things to know about this augmentation, as they relate to SOP randomness. First, all of the gains made are at the expense of the “normal items” slice. This is pretty obvious, because there’s nothing else to take from.

[45][46]

The tricky thing is that the best slices—set and unique items—grow much more slowly than the magic and rare slices. What’s more, they receive diminishing benefits from magic find bonuses as the amount grows. In one way, this limits the amount of control the player has over the process; they can’t ever really reach a point where unique or set items will always drop. But in another way, the diminishing returns make the game more interesting in a way totally consistent with all the other stats. Attack speed, attack rating, casting speed, and move speed all incur significant diminishing returns as well. The purpose of that dynamic is to make the player take a more balanced approach to their character’s stats. One of the things we’ll see in part three of this book is that players have more control over the rate at which they kill monsters than they do the amount of reward they get for doing so. Magic find helps a lot, and players should seek it out. But at the same time, they need to make sure their character is a worthy warrior.
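The shape of those diminishing returns can be sketched with the formula that community documentation commonly cites for Diablo 2. The formula and its constants (250 for unique, 500 for set, 600 for rare) do not come from this text, so treat them as an assumption rather than as confirmed game data.

```python
# Assumed constants from community documentation, not from this text.
DIMINISHING_FACTORS = {"magic": None, "rare": 600, "set": 500, "unique": 250}


def effective_magic_find(magic_find: int, quality: str) -> int:
    """Effective bonus (in percent) applied to the chance for one quality."""
    factor = DIMINISHING_FACTORS[quality]
    if factor is None:
        return magic_find  # assumed: magic items get the full bonus
    # Integer division mirrors the whole-number percentages players see quoted.
    return magic_find * factor // (magic_find + factor)


unique_at_100 = effective_magic_find(100, "unique")
unique_at_300 = effective_magic_find(300, "unique")
unique_at_1000 = effective_magic_find(1000, "unique")
```

With these constants, tripling magic find from 100% to 300% does not even double the effective bonus for unique items, and the bonus can approach but never reach the constant itself. That ceiling is what pushes players to spend some of their gear slots on killing speed instead.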

KEY LESSONS ABOUT SLICE-OF-PIE RANDOMNESS

  • Be careful about ever-expanding drop tables. If you keep adding slices to the pie, it will be less and less likely that the player will ever see the smallest slices.
  • Diablo 2 gives the player the ability to directly and indirectly augment the size of slices (through magic find affixes and party sizes, respectively), and this was famously successful.
  • The game doesn’t give the player the power to add or subtract slices, however. This may have been the problem that led to some items being too rare.

Deck-of-Cards Randomness

The next two types of randomness overlap a lot, but there are some critical differences. The one I’m going to discuss here is what I call “deck-of-cards” (DOC) randomness. Almost everyone who can read English has encountered the 52-card deck, but even if you haven’t, almost all card games operate by this principle. The most important quality of the 52-card deck (and really of a deck in most card games) is that it contains repeats. There are four jacks, four sevens, etc. Thus, players have about a 7.7% chance (4 in 52) of getting any given type of card. But the repeats are limited, so when the jack of diamonds enters play, the chance that any other player will get a jack from the deck drops slightly. Professional card players have to be able to do this math in their heads in order to win consistently. There’s a second important quality to DOC randomness that comes in at this point, however. Depending on the game that’s being played, the deck is shuffled (randomized) at different intervals. In blackjack, there are often several decks being used, and professional card players need to know when a new (randomized) deck is being used, or when all the decks are re-randomized in order to win. The third important characteristic of DOC randomness is that the cards in it are interdependent in certain ways. In solitaire, for example, the player can only build columns of cards based on numerical order.
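The first two qualities, limited repeats and a reset on reshuffle, can be shown with a standard deck. The helper names are my own; exact fractions are used so the odds are not blurred by rounding.

```python
from collections import Counter
from fractions import Fraction

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]


def new_deck():
    """A fresh 52-card deck; building a new one models a full reshuffle."""
    return [(rank, suit) for rank in RANKS for suit in SUITS]


def chance_of_rank(deck, rank):
    """Exact chance that the next card drawn from the deck has this rank."""
    counts = Counter(card_rank for card_rank, _ in deck)
    return Fraction(counts[rank], len(deck))


deck = new_deck()
jack_before = chance_of_rank(deck, "J")  # 4 jacks among 52 cards

deck.remove(("J", "diamonds"))           # the jack of diamonds enters play
jack_after = chance_of_rank(deck, "J")   # 3 jacks among 51 cards

jack_reshuffled = chance_of_rank(new_deck(), "J")  # odds reset on a new deck
```

The chance of a jack falls from 4 in 52 (about 7.7%) to 3 in 51 (about 5.9%) once one jack is out, and it only returns to the original figure when the deck is rebuilt. A slice-of-pie roll, by contrast, has the same odds every time.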

The part of the game which depends most on DOC randomness is map generation. Just above we looked at how important objects on the map are placed by RR methods, but most of the work is done out of a deck of cards. The cards themselves are what level designer Stefan Scandizzo called “presets.” Essentially, the computer draws from a deck of pre-made level chunks and fits them together. We’ll examine this process through the lens of the DOC characteristics. Note that almost all the information below comes from my interview with Scandizzo, although the terms I use for randomness are original to this document.[47]

Want to read more? The rest of this section can be found in the print and eBook versions.

All material copyright by The Game Design Forum 2017