Thursday, January 25, 2018

Howdy everyone, Thomas Blixtev Blair here for another episode of “Design Mumbo Jumbo!”

Today we are going to focus on RANDOMNESS – and the unfortunate side effect that we call STREAKING (no, no, not that kind of streaking! I’m talking about ‘streaks’ of good or bad luck, where an assumed series of random numbers comes out decidedly less-than-random). On this journey we’ll also touch on operant conditioning, and then preview some crafting changes coming soon in Crowfall®!

Let's dive in!

Randomness as a concept has existed for thousands of years, even though in early times it was ascribed to supernatural forces like divine favor, fate, or fortune. It wasn't until the last few centuries that a more formal analysis, across multiple scientific fields, arose to quantify and mathematically describe the concept of randomness. The part we are most interested in, as it relates to video games, is the mathematical theory of probability: the statistical description of chance events, mainly in the context of gambling.

As gamers we are pretty familiar with probability. Consider the word "chance", and most of you will immediately think about a random die roll, one of history’s earliest tools for harnessing the power of random chance. In RPGs, the most notable systems that rely on chance are loot drops and combat rolls (accuracy, damage, critical hit chance, you name it). Any time you see the word "chance", somewhere behind the scenes a “random” number is being cooked up by some obscure mathematical equation, and that number is checked against some threshold or value to determine success or failure. (And somewhere across the world, that success or failure is leading one or more gamers to shout "RNG SUCKS!!!" because they don’t like the result.)

In the early days of roleplaying games, random chance was embraced quite heavily. Early online games (MUDs) would roll up attributes for incoming characters (like Strength, Dexterity, etc.) and hide these from the player for dozens of hours. They were hugely impactful in terms of gameplay, but completely random in terms of which player “won” and which player “lost”.

In recent years, this randomness has largely fallen out of favor as it removes an element of player skill and replaces it with something that the player, by definition, cannot control. When players talk about “RNG”, they are referring to the random-number-generation algorithms which, in addition to being fickle, are also prone to “streaks” of the same or similar numbers. In the best of cases, we don't notice the steady stream of critical hits or high-quality loot we receive. In the worst, rare loot never seems to drop for us, or we have multiple crafting failures in a row and we find ourselves tempted to rage quit a game.

Under the covers, though, cogs are spinning and wheels are turning. There are reasons why we have a love/hate relationship with random number generators. Let's look at some of them!

Gambler's Fallacy is the belief that if something happens more frequently than normal during a period of time, it will happen less frequently in the future; or that if something happens less frequently than normal during a period of time, it will happen more frequently in the future. This is most often seen with monster loot tables. A 0.5% ultra-rare mount is a 0.5% RNG roll Every. Single. Time. that loot table is rolled. It doesn’t matter if it is the first time being rolled or the 50th. The chance remains the same: the RNG has no knowledge of the previous rolls. It just keeps rolling the dice from 0 to 100, and if you don't roll under 0.5, there is no reward.

But our brains don’t think that way. It is not uncommon to hear someone exclaim how much more likely the mount is to drop on their next attempt. Or that they are on a bad luck streak and the streak just has to break eventually.

Unfortunately, though, this just isn't true. It may make someone feel better (especially on their 50th attempt), but in actuality every loot roll is an independent event. It takes absolutely zero of the prior rolls into consideration.
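To put a number on that independence, here is a minimal Python sketch. It is not Crowfall's loot code; the function name and the constant are made up for illustration, and the only assumption is that every roll is independent with the same 0.5% chance described above.

```python
def chance_of_no_drop(drop_chance: float, attempts: int) -> float:
    """Probability that an independent roll fails `attempts` times in a row.

    Each roll ignores every roll before it, so the per-roll failure
    chance (1 - drop_chance) is simply multiplied by itself.
    """
    return (1.0 - drop_chance) ** attempts


MOUNT_DROP_CHANCE = 0.005  # the 0.5% ultra-rare mount from the example

# Chance of still being empty-handed after 50 attempts.
dry_after_50 = chance_of_no_drop(MOUNT_DROP_CHANCE, 50)
print(f"No mount after 50 attempts: {dry_after_50:.1%}")

# The 51st roll knows nothing about the first 50.
print(f"Chance on attempt 51: {MOUNT_DROP_CHANCE:.1%}")
```

In other words, roughly 78% of players will still be without the mount after 50 attempts, and attempt 51 is exactly the same 0.5% roll as attempt 1.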

To make matters worse, the human brain can be selective (and by that, I mean “bad”) when it comes to recalling events. One bad roll of the dice, at a really bad time, carries more weight in our memory than 100 good rolls that you didn’t particularly care about.

Here’s an example: the assembly process of crafting in Crowfall. There is a die roll that happens when the player assembles the item; the values used in this roll are based on the difficulty to assemble that item versus the player's stat/skill level.

Pre-5.4 example:
Let’s say a crafter is doing the final combine of a sword.
• The difficulty is 40.
• Crafter has 15 Blacksmithing Assembly stat.
• 40 difficulty versus 15 skill = There is a 75% chance to succeed at the assembly phase.

Trying to make an item with a difficulty of 40 with 15 Assembly stat sounds risky, but not too risky. Most people would call a 3-out-of-4 chance of success “pretty good odds.” But that’s not the whole picture! Context matters. This is a very important combine because the effort to get to a final advanced weapon combine could take a few hours of gathering. Since this is the most important combine, success or failure here will carry a lot more emotional weight than the dozens of combines that might have led up to this point. Remember, with a 1 in 4 chance of failure, there is a pretty reasonable chance of failing this roll multiple times, or even multiple times in a row.

Hitting multiple fails in a row, in a weighty situation like this one, could lead someone to get really frustrated at the game. If someone is suffering from Gambler's Fallacy and decides that the streaky nature of the random generation is unfair, they might even rage quit.

That being said, the randomness isn’t necessarily to blame – it’s also the penalty associated with that bad roll. The root of the problem in this case stems from an early goal of the crafting system: we wanted to limit the number of recipes a player had, to prevent the overwhelming “recipe bloat” problem that is so common in the genre (e.g., instead of a copper helm, tin helm, iron helm, etc., Crowfall just grants the player a customizable "helm" recipe).

Our solution is great at solving the problem at hand: one recipe can result in a number of different items, which means that the system is much deeper than the (still huge) list of recipes would indicate. The downside, however, is that we only have that one recipe to give the player progression against. To use this recipe across multiple tiers of items, we need a difficulty knob to enable advancement. We want some items to be too hard for beginners, and others too easy for advanced players.

Obviously, though, something is off in the risk/reward spectrum. We’re addressing this in a few ways, the first of which is baseline difficulty.

As part of the 5.4 update, the quality of resources that the crafter uses will now be part of the difficulty calculation. If poor quality resources are used, the baseline difficulty value for that recipe is used. If common, uncommon, rare, epic, or legendary resources are used at all, an additional difficulty amount is added to the combine. This allows us to lower the baseline difficulty across the board and make it easier for newer crafters to succeed when using lower-tier resources (which will be the case more often than not).

For example:
Let’s say a crafter is doing the final combine of a sword.
• The baseline difficulty is now 25.
• The best resource quality used is poor, so there is 0 modification to difficulty.
• The crafter has 15 in the Blacksmithing Assembly stat.
• 25 difficulty versus 15 skill = There is a 90% chance to succeed at the assembly phase.

If the top resource is rare, add 30 to the assembly difficulty.
• 55 difficulty versus 15 skill = There is a 60% chance to succeed at the assembly phase.
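All three worked examples (75%, 90%, and 60%) are consistent with a simple rule: start at 100% and subtract the gap between difficulty and skill. The Python sketch below encodes that inferred rule. Treat it as a guess that reproduces the numbers in this post, not as the game's real formula; the function and parameter names are hypothetical, and clamping the result to 0-100 is my own assumption.

```python
def assembly_success_chance(baseline_difficulty: int,
                            assembly_stat: int,
                            resource_modifier: int = 0) -> int:
    """Percent chance to succeed at the assembly phase.

    Inferred rule (an assumption, not confirmed game code):
        chance = 100 - (total difficulty - assembly stat)
    The result is clamped to 0..100, which is also an assumption.
    """
    total_difficulty = baseline_difficulty + resource_modifier
    chance = 100 - (total_difficulty - assembly_stat)
    return max(0, min(100, chance))


# Pre-5.4: difficulty 40 versus 15 Assembly stat.
print(assembly_success_chance(40, 15))

# 5.4 with poor resources: baseline 25, no modifier.
print(assembly_success_chance(25, 15))

# 5.4 with a rare resource: the post gives +30 difficulty for rare.
print(assembly_success_chance(25, 15, resource_modifier=30))
```

Only the rare modifier (+30) is stated in this post, so the sketch takes the modifier as a parameter rather than guessing values for the other quality tiers.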

This reinforces the need to invest in crafting, as people who only dabble in crafting will be limited to poor/low quality versions of these items, and higher skills will be required to work with the best resources.

Another factor to consider is operant conditioning. This is a learning process through which the strength of a behavior is modified by reward or punishment. Most folks have heard of B. F. Skinner and his studies in operant conditioning. Skinner rewarded the actions of his subjects (mainly rats and pigeons) with food, or punished them with loud noises. He performed most of his experiments in a device known as a Skinner box.

Of the many aspects of Skinner’s research, we are going to focus on the schedules of reinforcement. Schedules of reinforcement are simply the rules that control the delivery of reinforcement mechanisms. These rules specify either the time that reinforcement is to be made available, the number of responses to be made, or both. Basically, this equates to how often (or consistently) a reward is given out for doing something. There are many combinations of rules, but of interest to us are fixed ratio schedules and variable ratio schedules. (Crowfall also makes use of fixed interval schedules in the passive skill systems, but that is a topic for another day.)

A fixed ratio schedule is pretty much what it sounds like: do X number of actions and get a reward. The reward must come every single time you perform the required number of actions. For example, every 20 enemies killed the player gains a cash bounty. While it may be nice to know that you will receive a reward in two kills, it is less nice when you are at two of 20 towards the goal. That is the downside to this type of reward schedule: the player incentive drops significantly soon after a reward is given, especially if the number of actions is lengthy.

A variable ratio schedule operates very similarly to a fixed schedule except the reward is not guaranteed as a result of performing the action. Instead, it is given at random (Ahhh, our old friend the RNG is back!). To use our prior example, assume that the cash bounty is sometimes granted after killing three players, and other times after 47. With a randomized reward schedule, the player never knows exactly when the reward will come, so their incentive to perform the action is always very high as a reward is always potentially one action away. The downside is, of course, that (as mentioned earlier) random number generation can be streaky. The player can (and often does) get a streak of poor rolls. The next player in line might get a great roll, and get the reward on the first try… but that doesn’t matter to the poor S.O.B. who gets a bad run. Generally, over the long haul, the values will average out. However, if the player has suffered through many random rolls that produce nothing, it can feel like something is broken.

Designers will use these schedule types to encourage various behaviors across a game’s design architecture. Unfortunately, though, using the “average” to see if a system is working can cause you to miss the outliers (i.e., the guy with the bad streak). So how can you take advantage of a randomized reward schedule but avoid the “bad streak” outliers?

Let’s build an example: say a player is in a bad streak. They have a 50% chance to receive a reward for an action, yet they have failed 10 times in a row! This can, and does, happen all the time in systems that rely on a random roll. It’s statistically unlikely (about 1 in 1,024 for any given run of 10 rolls), but with enough players going through the path, it’s definitely going to hit some of them. The relationship is linear: more players means more outliers.
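Here is the arithmetic behind that bad streak as a small Python sketch. The population of 100,000 players is a made-up figure for illustration, and the helper names are hypothetical; the only thing taken from the example is the 50% chance and the 10 failures in a row.

```python
def streak_probability(fail_chance: float, streak_length: int) -> float:
    """Chance that `streak_length` independent rolls all fail."""
    return fail_chance ** streak_length


def expected_unlucky_players(players: int,
                             fail_chance: float,
                             streak_length: int) -> float:
    """Expected number of players who open with a full failure streak.

    The expectation scales linearly with the number of players.
    """
    return players * streak_probability(fail_chance, streak_length)


p = streak_probability(0.5, 10)
print(f"Chance of 10 straight failures at 50%: {p:.4%}")

# Hypothetical population size, chosen only to show the scaling.
for population in (1_000, 10_000, 100_000):
    unlucky = expected_unlucky_players(population, 0.5, 10)
    print(f"{population:>7} players -> about {unlucky:.1f} hit the streak")
```

A streak that is a 1-in-1,024 fluke for one player becomes a near certainty somewhere in the population once enough players attempt the same roll.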

There are a few ways to handle this. When building DC Universe Online (and when Todd was building Wizard101), we both employed a “failsafe” mechanic where we tracked every time a player had a “failed” roll for a quest objective. For every failed roll we added an additional 10% chance for them to succeed the next time they did that same roll. In most cases this meant that when performing a roll for a quest objective, the player never had to perform the same roll more than 3-4 times before succeeding.
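A failsafe like this could be sketched in Python as follows. This is an illustration, not the code used in DC Universe Online or Wizard101. The class name is hypothetical, and three details are my assumptions: the 10% bonus is additive, the chance is capped at 100%, and the failure counter resets after a success.

```python
import random


class FailsafeRoll:
    """A roll whose success chance grows after each failure (a sketch)."""

    def __init__(self, base_chance, bonus_per_failure=0.10, rng=None):
        self.base_chance = base_chance
        self.bonus_per_failure = bonus_per_failure
        self.failures = 0  # consecutive failures tracked for this roll
        self.rng = rng if rng is not None else random.Random()

    def current_chance(self):
        """Base chance plus the accumulated bonus, capped at 100%."""
        bonus = self.failures * self.bonus_per_failure
        return min(1.0, self.base_chance + bonus)

    def roll(self):
        """Roll once; reset the counter on success (an assumption)."""
        if self.rng.random() < self.current_chance():
            self.failures = 0
            return True
        self.failures += 1
        return False


# Usage: count how many attempts a quest objective takes.
objective = FailsafeRoll(base_chance=0.5)
attempts = 1
while not objective.roll():
    attempts += 1
print(f"Succeeded on attempt {attempts}")
```

Because the chance eventually reaches 100%, the worst case is bounded: with a 50% base chance and +10% per failure, the sixth attempt cannot fail.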

Raph Koster (the design lead for Ultima Online and Star Wars Galaxies, as well as one of our formal advisers) often proposes another solution: deck systems. In a system like this, each “roll of the dice” is not treated as an independent event. Instead, all of the potential outcomes are shuffled together, like a deck of cards, and cards (results) are removed from the deck as they are used. As each card is removed from the deck, the probability of specific cards in future draws shifts along with it, thus ensuring that streaks can’t last.

Here’s an example of this system in action. Let's start with a “deck” of 10 potential results/cards, five of which are failure cases. On the first draw, the player has a 1 in 10 chance of drawing any particular card (a 10% chance for a specific outcome) and a 5 in 10 chance (50%) of drawing a failure. Assuming the player fails on that draw, one failure card is removed from the deck. On the next draw, the player has a 1 in 9 chance (about 11%) of drawing any particular card, and only four of the nine remaining cards are failures, so the chance of failing again drops to about 44%.

We don’t need to dig into the math any more to illustrate the point: each failure makes the chance of the next roll more likely to be a success instead of another failure.
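For the curious, a deck like the one above can be sketched in a few lines of Python. This is a minimal illustration, not Crowfall's implementation. The class name is hypothetical, and reshuffling a fresh deck once the last card is drawn is my assumption about what happens when the deck runs out.

```python
import random


class OutcomeDeck:
    """Draw outcomes without replacement so streaks are bounded (a sketch)."""

    def __init__(self, successes, failures, rng=None):
        # True is a success card, False is a failure card.
        self._template = [True] * successes + [False] * failures
        self.rng = rng if rng is not None else random.Random()
        self._cards = []
        self._reshuffle()

    def _reshuffle(self):
        """Start over with a full, freshly shuffled deck."""
        self._cards = list(self._template)
        self.rng.shuffle(self._cards)

    def failure_chance(self):
        """Chance that the next draw is a failure card."""
        return self._cards.count(False) / len(self._cards)

    def draw(self):
        """Remove and return the top card; reshuffle when the deck is empty."""
        card = self._cards.pop()
        if not self._cards:
            self._reshuffle()
        return card


# Usage: the 10-card deck from the example, five of them failures.
deck = OutcomeDeck(successes=5, failures=5)
for n in range(1, 6):
    chance = deck.failure_chance()
    result = "success" if deck.draw() else "failure"
    print(f"Draw {n}: failure chance was {chance:.0%}, drew a {result}")
```

Every pass through the deck contains exactly five successes and five failures, so under this reshuffle rule the longest possible failure streak is 10 draws (the last five cards of one deck followed by the first five of the next). With independent rolls there is no such limit.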

Either of the above two solutions will work; we’re likely to use the card draw system where we can because we have more control over the player experience. The key is that we want to protect the player from extremely streaky negative results. (For what it's worth, we tend not to worry about positive streaks. Positive streaks don’t carry the frustration, and in many cases the player stops trying an activity after they achieve the result they are looking for, anyway.)

As promised earlier, this seems like a good place to talk about other changes to the assembly phase of crafting. Now that the Sacrifice system is implemented (allowing players to sacrifice items to the Gods in exchange for vessel XP), we can remove the destruction of all resources and items whenever the crafter has a failed crafting roll. Originally this was meant to act as both a penalty for failing a roll and as an item sink for the economy. Since we don’t need the sink anymore, we’re left with only the need to deal with failed crafting rolls… and in that case, as I said earlier, the “lost everything on failure” design was throwing off the whole risk/reward calculation.

And since we’re doing that, it's also a perfect time to rework the way the outcomes from the assembly phase are named.

Previously a player had two outcomes from the assembly phase: failure or success.

In the 5.4 update, the assembly phase of crafting will now have three results:
• Amazing Assembly – Improves the end product to the top level of quality of resource used in the combine, and moves the crafter to the experimentation phase.
• Successful Assembly – Assembles the object and moves the crafter to the experimentation phase.
• Flawed Assembly – Assembles the object at one quality level below what the result would have been and skips the experimentation phase.

Between this change and the one mentioned above, we think we’ll end up with a system that feels much more rewarding, but still allows us to scale difficulty and advancement based on investment in crafting as a mainline profession.

Adding a “random” factor to a game can be beneficial, but it’s also more dangerous than you might think. I hope you enjoyed walking through the weeds with me and getting some insight into some of the reasons why game designers make systems in a variety of ways (as well as why people playing a game behave the way they do).

I’m excited for you guys to see these changes in 5.4 (which is right around the corner!) and I’d love to hear your thoughts and feedback on how this affects the overall loop of harvesting, crafting, and the in-game economy!

As always, please feel free to leave comments on the forums, and we’ll see you in game!

Thomas Blair
Design Lead
ArtCraft Entertainment
