## A Comedy of Error – Part I

In the 5.4.2 Rotation Analysis post, I mentioned that I was looking into some odd behavior in the SimC error statistics:

I’m actually doing a little statistical analysis on SimC results right now to investigate some deviations from this prediction, but that’s enough material for another blog post, so I won’t go into more detail yet. What it means for us, though, is that in practice I’ve found that when you run the sim for a large number of iterations (i.e. 50k or more) the reported confidence interval tends to be a little narrower than the observed confidence interval you get by calculating it from the data. So for example, at 250k iterations we regularly get a DPS Error of approximately 40. In theory that means we feel pretty confident that the DPS we found is within +/-40 of the true value. In practice, it might be closer to +/- 100 or so.

Over the past two weeks, I’ve been running a bunch of experiments to try to track down and correct the source of this effect. The good news is that with the help of two other SimC devs, we’ve fixed it, and future rotation analysis posts will be much more accurate as a result.

But before we discuss the solution, we have to identify the problem. And to do that, we need a little bit of statistics. I find that most people’s understanding of statistical error is, humorously enough, rather erroneous. So in the interest of improving the level of discourse, let’s take a few minutes and talk about exactly what it means to measure or report “error.”

Disclaimer: While I’m 99.9% sure everything in this post is accurate, keep in mind that I am not a statistician. I just play one on the internet to do math about video games (and in real life to analyze experimental results). If I’ve made an error or misspoken, please point it out in the comments!

Lies, Damn Lies, and Statistics

Let’s start out with a thought experiment. If we’re given a pair of standard 6-sided dice, what’s the probability of rolling a seven?

There’s a number of ways to solve this problem, but the simplest is probably to do some basic math. Each die has 6 sides, so there are 6 x 6 = 36 possible combinations. Out of those combinations, how many give us a sum of seven? Well, there are three ways to do that with the numbers one through six: 1+6, 2+5, and 3+4. However, we have two dice, so either one could contribute the “1” in 1+6. If we decide on a convention of reporting the rolls in the format (die #1)+(die #2), then we could also have 4+3, 5+2, and 6+1. So that’s six total ways to roll a seven with a pair of dice, out of thirty-six possible combinations; our probability of rolling a seven is 6/36=1/6=0.1667, or 16.67%.

We could ask this same question for any other possible outcome, like 2, 5, 9, or 11. If we did that for every possible outcome (anything from 2 to 12), and then plotted the results, it would look like this:

The probability distribution that describes the results of rolling two six-sided dice.

This gives a visual interpretation of the numbers. It’s clear from the plot that an 8 is less likely than a 7 (as it turns out, there are only five ways to roll an 8) and that rolling a 9 is even less likely (four ways) and that rolling a 2 or 12 is the least likely (one way each). What we have here is the probability distribution of the experiment. It tells us that on any given roll of the dice there’s a ~2.78% chance of rolling a 2 or 12, a 5.56% chance of rolling a 3 or 11, and so on.
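The counting argument is easy to check by brute force. Here is a minimal Python sketch (the function name `two_dice_distribution` is just an illustrative choice) that enumerates all 36 outcomes and tabulates the exact probability of each sum:

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def two_dice_distribution(sides=6):
    """Return {sum: exact probability} for two fair dice with `sides` faces."""
    # Every ordered pair (die #1, die #2) is equally likely.
    counts = Counter(a + b for a, b in product(range(1, sides + 1), repeat=2))
    total = sides ** 2
    return {s: Fraction(n, total) for s, n in sorted(counts.items())}


dist = two_dice_distribution()
for outcome, p in dist.items():
    # Show the exact fraction alongside the percentage.
    print(f"{outcome:2d}: {str(p):>5}  ({float(p):.2%})")
```

Using `Fraction` keeps the probabilities exact, so a seven comes out as exactly 1/6 and a 2 or a 12 as exactly 1/36.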

Now let’s talk about two terms you’ve probably heard before: mean and standard deviation. These terms show up a lot in the discussion of error, so making sure we have a clear definition of them is a good foundation on which to build the discussion. The mean and the standard deviation describe a probability distribution, but provide slightly different information about that distribution.

The mean tells us about the center of the distribution. You’re probably more familiar with it by another name: the average.  Though both of those names are a bit ambiguous. “Average” can refer to several different metrics, though it’s most commonly used to refer to the arithmetic mean. “Mean” is used slightly differently in different areas of math, but when we’re talking about statistics it’s used synonymously with the term “expected value.” The Greek letter $\mu$ is commonly used to represent the mean. If you want the mathy details, it’s calculated this way:

$$\mu = \sum_k x_k P(x_k)$$

where $x_k$ is the outcome (e.g. “5”) and $P(x_k)$ is the probability of that outcome (e.g. “11.11%” or 0.1111). For our purposes, though, it’s enough to know that the mean tries to measure the middle of a distribution. If the data is perfectly symmetric (like ours is), it tells you what value is in the center. In the case of our dice, the mean is seven, which is what we’d expect the average to be if we made many rolls.

The standard deviation (usually represented by $\sigma$), on the other hand, describes the spread or width of the distribution. Its definition is a little more complicated than the mean:

$$\sigma = \sqrt{\sum_k P(x_k) (x_k-\mu)^2}$$

But again, for our purposes it’s enough to know that it’s a measurement of how wide the distribution is, or how much it deviates from the mean. A distribution with a larger $\sigma$ is wider than a distribution with a smaller $\sigma$, which means that any given roll could be farther away from the mean. For our distribution, the standard deviation is about 2.42.
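For anyone who wants to see the two formulas in action, here is a short Python sketch that applies them directly to the two-dice distribution. The helper name `mean_and_sd` is made up for this example; the arithmetic is just the two sums above.

```python
import math
from itertools import product


def mean_and_sd(pmf):
    """Mean and standard deviation of a discrete distribution {outcome: prob}."""
    mu = sum(x * p for x, p in pmf.items())               # mu = sum x_k P(x_k)
    var = sum(p * (x - mu) ** 2 for x, p in pmf.items())  # sum P(x_k)(x_k - mu)^2
    return mu, math.sqrt(var)


# Build the two-dice distribution: each of the 36 ordered pairs has weight 1/36.
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, 0.0) + 1 / 36

mu, sigma = mean_and_sd(pmf)
print(f"mu = {mu:.2f}, sigma = {sigma:.2f}")  # prints: mu = 7.00, sigma = 2.42
```

The variance works out to 35/6, so $\sigma = \sqrt{35/6} \approx 2.415$.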

The thing I want you to note is that neither of these terms tell us anything about error. We aren’t surprised if we roll the dice and get a 10 or 12 instead of a 7. We don’t return them to the manufacturer as defective. The mean and standard deviation tell us a little bit about the range of results we can get when we roll two dice. To talk about error, we need to start looking at actual results of dice rolls, not just the theoretical probability distribution for two dice.

Things Start Getting Dicey

Okay, so let’s pretend we have two dice, and we roll them 100 times. We keep track of the result each time, and plot them on a histogram like so:

The outcome of 100 rolls of two six-sided dice.

Now, this doesn’t look quite the same as our expected distribution. For one thing, it’s definitely not symmetric – there were more high rolls than low rolls. We could express that by calculating the sample mean $\mu_{\rm sample}$, which is the mean of a particular set of data (a “sample”). By calling this the sample mean, we can keep straight whether we’re talking about the mean of the sample or about the mean of the entire probability distribution (often called the “population mean”). The sample mean of this data set is 7.40, as shown in the upper right hand corner of the plot, which is higher than our expected value of 7.00 by a fair amount.

We can also calculate a sample standard deviation $\sigma_{\rm sample}$ for the data, which again is just the standard deviation of our data set. The sample standard deviation for this run is 2.52, which is a bit higher than the expected 2.42 because the distribution is “broader.” Note that the maximum extent isn’t any wider – we don’t have any rolls above 12 or below 2 – but because the distribution is a little “flatter” than usual, with more results than expected in some of the extremes and fewer in the middle, the sample standard deviation goes up a little.
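If you’d like to run this experiment yourself, the following Python sketch simulates 100 rolls and computes both sample statistics. The seed is arbitrary, so the numbers it produces will not match the 7.40 and 2.52 from the plot; the point is only to show the procedure. Note that `statistics.stdev` divides by $N-1$ rather than $N$, which is the usual convention for a sample and makes very little difference at $N=100$.

```python
import random
import statistics


def roll_two_dice(n_rolls, rng):
    """Simulate n_rolls throws of two fair six-sided dice; return the sums."""
    return [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls)]


rng = random.Random(2014)  # arbitrary seed, only so the run is repeatable
rolls = roll_two_dice(100, rng)

sample_mean = statistics.mean(rolls)
sample_sd = statistics.stdev(rolls)  # sample standard deviation (N - 1 denominator)

print(f"sample mean = {sample_mean:.2f}")
print(f"sample standard deviation = {sample_sd:.2f}")
```

Changing the seed gives a different sample each time, which is a nice way to get a feel for how much these two numbers wander around 7 and 2.42.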

But note that, by themselves, neither $\mu_{\rm sample}$ nor $\sigma_{\rm sample}$ tell us about the error! They’re still just describing the probability distribution that the data in the sample represents. At best, we might be able to compare our results to the theoretical $\mu$ and $\sigma$ we found for the ideal case to identify how our results differ. But it’s not at all clear that this tells us anything about error. Why?

Because maybe these dice aren’t ideal. Maybe they differ in some way from our model. For example, maybe you’ve heard the term “weighted dice” before? What if one of them is heavier on one side? That might cause it to roll e.g. 6 more often than 1, and give us a slightly different distribution. You could call that an “error” in the manufacturing of the dice, perhaps, but that’s not what we generally mean when we talk about statistical error.

So perhaps it’s time we seriously considered what “error” means. After all, it’s hard to identify an “error” if we haven’t clearly defined what “error” is. Let’s say that we perform an experiment – we make our 100 die rolls and keep track of the results, and generate a figure like the one above. And in addition, let’s say we’re primarily interested in the mean of this distribution; we want to know what the average result of rolling these particular two dice will be. We know that if they were ideal dice, it should be seven. But when we ran our experiment, we got a mean of 7.40.

What we really want to know is the answer to the question, “how accurate is that result of 7.40?” Do we trust it so much that we’re sure these dice are non-standard in some way? Or was it just a fluke accident? Remember, there’s absolutely no reason we couldn’t roll 100 twelves in a row, because each dice roll is independent of the last, and it’s a random process. It’s just really unlikely. So how do we know this value we came up with isn’t just bad luck?

So let’s say the “error” in the sample mean is a measure of accuracy. In other words, we want to be able to say that we’re pretty confident that the “true” value of the population mean $\mu$ happens to fall within the interval $\mu_{\rm sample}-E < \mu < \mu_{\rm sample} + E$, where $E$ is our measure of error. We could call that range our confidence interval, because we feel pretty confident that the actual mean $\mu$ of the distribution for our dice happens to be in that interval. We’ll talk about exactly how confident we are a little bit later.

It should be clear now why comparing our distribution to the “ideal” distribution doesn’t tell us anything about how reliable our results are. We might know that the sample mean differs from the ideal, but we don’t know why. It could be that our dice are defective, but it could also just be a random fluctuation. But since nothing we’ve discussed so far tells us how accurate our measured sample mean is, we don’t know for sure. To get that, we need to figure out how to represent $E$, the number that sets the bounds on our confidence interval.

It’s a common misconception that $E$ should just be the sample standard deviation $\sigma_{\rm sample}$. You may have seen results presented like $\mu \pm \sigma$, or $7.40 \pm 2.52$, to suggest an interval of confidence. That is, generally speaking, not correct. Or at least, very misleading. Because that’s not what the standard deviation means.

What we really want here is something called the standard error, though it’s also commonly called the standard error of the mean.  It’s also sometimes (mistakenly or carelessly) called the “standard deviation of the mean,” but we’ll clarify the difference in a second. I like the term “standard error of the mean,” because it makes it clear that this is a measurement of accuracy of the sample mean. As you might guess, it’s closely related to the sample standard deviation, but not quite the same. It’s calculated by dividing the sample standard deviation by the square root of the number of individual “trials,” or dice rolls, $N$:

$${\rm SE_{\mu}} = \frac{\sigma_{\rm sample}}{\sqrt{N}}.$$

This, at long last, is a good measurement of error. It’s worth noting that the standard deviation of the mean is defined similarly, but uses the true standard deviation of the distribution:

$${\rm SD_{\mu}} = \frac{\sigma}{\sqrt{N}}.$$

The reason the two are often used interchangeably is that we generally don’t know what the actual distribution looks like, nor do we know the expected values of $\mu$ and $\sigma$. Sometimes we do, of course; if we have a theory describing the process we’re measuring, then we can often calculate the theoretical values of $\mu$ and $\sigma$. But we don’t always know if our experiment matches the theory as well as we’d like – for example, if one of the dice is weighted and rolls more sixes than ones.

And sometimes, we don’t have a well-described theory at all, we just have a pile of data. This is the case for most Simulationcraft data runs, because we don’t have an easy analytical function that accurately describes your DPS due to any number of factors: procs, avoidance, movement, and so on. In that sort of situation, we can never truly know $\sigma$, so the lines between ${\rm SE}_{\mu}$ and ${\rm SD}_{\mu}$ blur a little bit, and we tend to get sloppy with terminology.
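As a concrete illustration, here is a tiny Python sketch of the standard error calculation. The function name `standard_error` is my own; the inputs are the sample standard deviation (2.52) and roll count (100) from the histogram above.

```python
import math


def standard_error(sample_sd, n):
    """Standard error of the mean: sample standard deviation over sqrt(N)."""
    if n <= 0:
        raise ValueError("n must be a positive number of trials")
    return sample_sd / math.sqrt(n)


se_100 = standard_error(2.52, 100)
print(f"SE for 100 rolls: {se_100:.3f}")  # prints: SE for 100 rolls: 0.252

# Because of the square root, quadrupling the number of rolls only halves
# the standard error (assuming the sample standard deviation stays put).
se_400 = standard_error(2.52, 400)
print(f"SE for 400 rolls: {se_400:.3f}")  # prints: SE for 400 rolls: 0.126
```

The same function gives ${\rm SD}_{\mu}$ if you pass in the true $\sigma$ instead of the sample value, which is exactly why the two terms get mixed up so often.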

Double Standards

Now, we’ve thrown around a lot of terms that have “standard deviation” in them. It’s no wonder the layperson is easily confused by statistics. So it’s worth spending a moment to make the differences between these terms abundantly clear. Let’s reiterate quickly why we use standard error to describe the accuracy of the sample mean rather than just using $\sigma$ or $\sigma_{\rm sample}$.

We have a theoretical probability distribution describing the result of rolling two 6-sided dice. Here’s what each of the terms we’ve discussed so far tells us:

• The mean (or “population mean”) $\mu$ tells us the average value of a single roll.
• The standard deviation $\sigma$ tells us about the fluctuations of any single dice roll. In other words, if we make a single roll, $\sigma$ tells us how much variation we can expect from the mean. When we make a single roll, we’re not surprised if the result is $\sigma$ or $2\sigma$ away from the mean (ex: a roll of 9 or 11). The more $\sigma$s a roll is away from the mean, the less likely it is, and the more surprised we are. Our distribution here is finite, in that we can never roll less than two or more than 12, but in the general case a probability distribution could have non-zero probabilities farther out in the wings, such that talking about $4\sigma$ or $5\sigma$ is relevant.
• The sample mean $\mu_{\rm sample}$ tells us the average value of a particular sample of rolls. In other words, we roll the dice 100 times and calculate the sample mean. This is an estimate of the population mean.
• The sample standard deviation $\sigma_{\rm sample}$ tells us about the fluctuations of our particular sample of rolls. If we roll the dice 100 times, we can calculate the sample standard deviation by looking at the spread of the results. Again, this is an estimate of the population’s standard deviation, and it tells us how much variation we should expect from a single dice roll.
• The standard deviation of the mean $SD_{\mu}$ tells us about the fluctuations of the mean of an arbitrary sample. In other words, if we proposed an experiment where we rolled the dice 100 times, we would go into that experiment expecting to get a sample mean that’s pretty close to (but not exactly) $\mu$. $SD_{\mu}$ tells us how close we’d expect to be. For example, under normal conditions we’d expect to get a result for $\mu_{\rm sample}$ that is between $\mu-2{\rm SD}_{\mu}$ and $\mu+2{\rm SD}_{\mu}$ about 95% of the time, and between $\mu-2.5{\rm SD}_{\mu}$ and $\mu+2.5{\rm SD}_{\mu}$ about 99% of the time.
• The standard error of the mean $SE_{\mu}$ tells us about the fluctuations of the mean of our particular sample of rolls. Once we actually make those 100 rolls, and calculate the sample mean and sample standard deviation, we can state that we’re 95% confident that the “true” population mean $\mu$ is between $\mu_{\rm sample}-2{\rm SE}_{\mu}$ and $\mu_{\rm sample}+2{\rm SE}_{\mu}$, and 99% confident that it’s between $\mu_{\rm sample}-2.5{\rm SE}_{\mu}$ and $\mu_{\rm sample}+2.5{\rm SE}_{\mu}$.

You can see why this gets confusing. But the key is that the standard deviation and sample standard deviation are telling you about single rolls. If you roll the dice once, you expect to get a value between $\mu+2\sigma$ and $\mu-2\sigma$ about 95% of the time.

Whereas the standard deviation of the mean and standard error tell us about groups of rolls. If we make 100 rolls the sample mean should be a much better estimate of the population mean than if we made only a handful of rolls. And if we make 1000 rolls, we should get a better estimate than if we only made 100 rolls.

So we use the standard deviation of the mean to answer the question, “if we made 100 rolls, how close do we expect $\mu_{\rm sample}$ (our sample mean) to be to $\mu$ (the population mean)?” And we use the standard error to answer the related (but different!) question, “now that I’ve made 100 rolls, how accurately do I think my calculated $\mu_{\rm sample}$ (sample mean) approximates $\mu$ (the population mean)?”
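One way to convince yourself that the “95% confident” claim means something is to repeat the 100-roll experiment many times and count how often the interval $\mu_{\rm sample} \pm 2{\rm SE}_{\mu}$ actually contains the true mean of 7. The Python sketch below does that; the seed, the number of repetitions, and the helper name are arbitrary choices for illustration, and it assumes fair dice.

```python
import math
import random
import statistics

TRUE_MEAN = 7.0  # population mean for two fair six-sided dice


def interval_contains_true_mean(n_rolls, rng, k=2.0):
    """Run one experiment; report whether mean +/- k*SE brackets TRUE_MEAN."""
    rolls = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls)]
    sample_mean = statistics.mean(rolls)
    se = statistics.stdev(rolls) / math.sqrt(n_rolls)
    return sample_mean - k * se < TRUE_MEAN < sample_mean + k * se


rng = random.Random(54)  # arbitrary seed for repeatability
experiments = 1000
hits = sum(interval_contains_true_mean(100, rng) for _ in range(experiments))
coverage = hits / experiments

print(f"{coverage:.1%} of the intervals contained the true mean")
```

The fraction it reports should land in the neighborhood of 95%, with some scatter from run to run because the number of repetitions is itself finite.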

You might wonder what voodoo tricks I played to get these “95%” and “99%” values. These come from analysis of the normal distribution, which is a probability distribution that comes up frequently in statistics. If your probability distribution is normal, then about 68% of the data will fall within one standard deviation in either direction. Put another way, the region from $\mu-\sigma$ to $\mu+\sigma$ contains 68% of the data. Likewise, the region from $\mu-2\sigma$ to $\mu+2\sigma$ contains about 95% of the data, and about 99.7% of the data will fall between $\mu-3\sigma$ and $\mu+3\sigma$.
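These percentages can be computed rather than memorized. For a normal distribution, the fraction of the data within $k$ standard deviations of the mean is ${\rm erf}(k/\sqrt{2})$, and Python’s standard library provides `math.erf`. A short sketch (the function name `normal_coverage` is my own):

```python
import math


def normal_coverage(k):
    """Fraction of a normal distribution lying within k sigma of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 2.5, 3):
    print(f"within {k} sigma: {normal_coverage(k):.2%}")

# Expected values: 68.27%, 95.45%, 98.76% and 99.73% respectively.
```

So the “95%” and “99%” figures quoted for $2\sigma$ and $2.5\sigma$ are convenient round numbers; the exact multipliers for 95% and 99% are a little different (roughly 1.96 and 2.58).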

Our probability distribution isn’t a normal distribution. First of all, it’s truncated on either side, while the normal distribution goes on infinitely in either direction (we’ll never be able to roll a one or 13 or 152 with our two dice). Second, it’s a little too discrete to be a good normal distribution – there isn’t quite enough granularity between 2 and 12 to flesh the distribution out sufficiently. It’s really more of a triangle than a nice Gaussian, though it’s not an awful approximation given the constraints. Luckily, none of that matters! As it turns out, the reason our distribution looks vaguely normal is closely related to the reason that we use the normal distribution to determine confidence intervals.

Limit Break

The Central Limit Theorem is the piece that completes our little puzzle. Quoth the Wikipedia,

the central limit theorem (CLT) states that, given certain conditions, the arithmetic mean of a sufficiently large number of iterates of independent random variables, each with a well-defined expected value and well-defined variance, will be approximately normally distributed.

That’s a bit technical, so let’s break that down and make it a bit clearer with an example. We start with a dice roll (a “random variable”) that has some probability distribution that doesn’t change from roll to roll (“a well-defined expected value and well-defined variance”) and each roll doesn’t depend on any of the previous ones (“independent”). Now we roll those dice 10 times and calculate the sample mean. And then roll another 10 times and calculate the sample mean. And then do it again. And again, and again, and… you get the idea (“a sufficiently large number of iterates”). If we do that, and plot the probability distribution of those sample means, we’ll get a normal distribution centered on the population mean $\mu$.

The beautiful part of this is that it doesn’t matter what the probability distribution you started with looks like. It could be our triangular dice roll distribution or a “top-hat” (uniform) distribution or some other weird shape. Because we’re not interested in that; we’re interested in the sample means of a bunch of different samples of that distribution. And those are normally distributed about the mean, as long as the CLT applies. Which means that when we find a sample mean, we can use the normal distribution to estimate the error, regardless of what probability distribution that the individual rolls obey.
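Here is a rough Python sketch of that thought experiment: take many 10-roll samples, compute the mean of each, and look at how those sample means are spread. If the CLT is doing its job, they should center on 7, their spread should be close to $\sigma/\sqrt{10}$, and roughly two-thirds of them should land within one such spread of 7. The seed and sample counts are arbitrary, and the value $\sigma=\sqrt{35/6}$ assumes fair dice.

```python
import math
import random
import statistics


def sample_mean_of_rolls(n_rolls, rng):
    """Mean of n_rolls throws of two fair six-sided dice."""
    rolls = [rng.randint(1, 6) + rng.randint(1, 6) for _ in range(n_rolls)]
    return statistics.mean(rolls)


rng = random.Random(7)  # arbitrary seed for repeatability
n_rolls = 10
means = [sample_mean_of_rolls(n_rolls, rng) for _ in range(5000)]

sigma = math.sqrt(35 / 6)               # population sd for two fair dice
predicted = sigma / math.sqrt(n_rolls)  # standard deviation of the mean
observed = statistics.stdev(means)      # spread of the simulated sample means
center = statistics.mean(means)
within_one = sum(abs(m - 7.0) <= predicted for m in means) / len(means)

print(f"center of sample means: {center:.3f}")
print(f"predicted spread: {predicted:.3f}, observed spread: {observed:.3f}")
print(f"fraction within one predicted spread of 7: {within_one:.1%}")
```

With only 10 rolls per sample the means are still fairly coarse (they can only take values in steps of 0.1), so the match to the normal percentages is approximate rather than exact.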

Now, there are two major caveats here that cause the CLT to break down if they aren’t obeyed:

• The random variables (rolls) need to be independent. In other words, the CLT will not necessarily be true if the result of the next roll depends on any of the previous rolls. Usually this is the case (and it is in our example), but not always. There are two WoW-related examples I can think of off the top of my head.

Quest items that drop from mobs aren’t truly random, at least post-BC (and possibly post-Vanilla). Most quest mobs have a progressively increasing chance to drop quest items, such that the more of them you kill, the higher the chance of an item dropping. This prevents the dreaded “OMG I’ve killed 8000 motherf@$#ing boars and they haven’t dropped a single tusk” effect (yes, that’s the technical term for it). Similarly, bonus rolls have a system where every failed bonus roll will cause a slight increase in the chance of success with your next bonus roll against that boss. So this would be another example where the CLT won’t apply, because the rolls aren’t truly independent.

• The random variables need to be identically distributed. In other words, the probability distribution can’t be changing in-between rolls. If we swapped one of our 6-sided dice out for an 8-sided or 10-sided die, all of the sudden our probability distribution would change and there would be no guarantee that the CLT would apply.

You might ask if you could cite either of the two examples of dependence here as examples of non-identical distributions. After all, in each case the probability distribution is changing between rolls. However, that change is due to dependence on previous effects – in a sense, the definition of dependence is “changing the probability distribution between rolls based on prior outcomes.” So dependence is a more specific subset of this category.

If either of those things occur, then we can’t be sure that the CLT is valid for our situation. Luckily, none of that applies to our dice-rolling example, so we can properly apply the CLT to estimate the error in our set of 100 rolls.

Keep Rollin’ Rollin’ Rollin’ Rollin’

So now that we’ve talked a lot about deep probability theory, let’s actually do that. The standard error of our 100-roll sample is,

$${\rm SE}_{\mu} = \sigma_{\rm sample}/\sqrt{N} = 2.52/\sqrt{100} = 0.252$$

To get our 95% confidence interval (CI), we’d want to look at values between $\mu_{\rm sample}-2{\rm SE}_{\mu}$ and $\mu_{\rm sample}+2{\rm SE}_{\mu}$, or $7.40 \pm 0.504$. And sure enough, the actual value of the population mean (7.00) falls within that confidence interval. Though note that it didn’t have to – there was still a 5% chance it wouldn’t!

We could improve the estimate by increasing the number of dice rolls. For example, what if we rolled 1000 dice instead? That might look something like this:

The outcome of 1000 rolls of two six-sided dice.

We see that our new sample mean is $\mu_{\rm sample}=6.95$ and our sample standard deviation is $\sigma_{\rm sample}=2.41$. But now $N=1000$, so our standard error is much smaller:

$${\rm SE}_{\mu} = \sigma_{\rm sample}/\sqrt{N} = 2.41/\sqrt{1000} = 0.0762$$

As before, we’re 95% confident that our sample mean is within $\pm 2{\rm SE} = 0.1524$ of the population mean in one direction or the other, and sure enough it is.

Of course, we could keep going. Here’s what 10000 rolls looks like:

The outcome of 10000 rolls of two six-sided dice.

And if we calculate our standard error for this distribution, we get:

$${\rm SE}_{\mu} = \sigma_{\rm sample}/\sqrt{N} = 2.43/\sqrt{10000} = 0.0243$$

So now we’re pretty sure that the value of 7.01 is correct to within $\pm 0.0486$, again with 95% confidence. Like before, there’s no guarantee that it will be – there’s still that 5% chance it falls outside that range. But we can solve that by increasing our confidence interval (say, looking at $\pm 3{\rm SE}_{\mu}$) or by repeating the experiment a few times and thinking about the results. If we repeat it 100 times, we’d expect about 95 of them to cluster within $\pm 2{\rm SE}_{\mu}$ of 7.00.

You may have noticed that while the confidence interval is shrinking, it’s not doing so as fast as it did going from 100 to 1000. That’s because we’re dividing by the square root of $N$, which means that to improve the standard error by a factor of $a$, we need to run $a^2$ times as many simulations. So if we want to increase our accuracy by a whole decimal place (a factor of 10), we need to make 100 times as many rolls. This is important stuff to know if you’re designing an experiment, because you don’t want your graduate thesis to rely on making five trillion dice rolls. Trust me.

You probably also noticed that the more rolls we make, the more the sample probability distribution resembles the ideal “triangular” case we arrived at theoretically. That’s to be expected – the more rolls we make, the better the sample approximates the real distribution. This is related to another law (the amusingly-named law of large numbers) that’s important for the CLT, but I don’t have time to go into that here. But it was worth mentioning just because “law of large numbers” is probably the best name for a mathematical law ever.

Finally, I mentioned that our “triangular” distribution for two dice looks vaguely normal, and that this relates to the CLT somehow. Here’s how. Each die is essentially its own random variable with a “flat” or “uniform” probability distribution (you have an equal chance to roll any number on the die). So when we take two of them and calculate the sum, we’re really performing two experiments and finding two sample means (with a sample size of 1 roll each). The sum of those two sample means, which is just twice the average of the sample means, is our result. This is exactly how we phrased our description of the CLT!

The reason we get a triangle rather than a nice Gaussian is that two dice is not “a sufficiently large number of iterates.” There is, unfortunately, no clean closed-form expression for this probability distribution for arbitrary numbers of $s$-sided dice (something called the binomial distribution works when $s=2$, i.e. for coin flips). But if we rolled 5 dice or 10 dice instead of two, and added all of those up, we’d start to get a distribution that looked very much like a normal distribution. And in fact, if you read either of the articles linked in this paragraph, you’ll see that they both become well-approximated by a normal distribution as you increase the number of experiments (die rolls).
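Even without a tidy closed form, the exact distribution for the sum of several dice is easy to build numerically by adding one die at a time (a repeated convolution). The Python sketch below does this and then measures the largest gap between the exact probabilities and a normal curve with the same mean and variance. The function names are invented for this example, and the “largest gap” is just one simple way to quantify the resemblance.

```python
import math


def sum_of_dice_pmf(n_dice, sides=6):
    """Exact distribution {total: probability} for the sum of n fair dice."""
    pmf = {0: 1.0}  # before rolling anything, the total is 0 with certainty
    for _ in range(n_dice):
        nxt = {}
        for total, p in pmf.items():
            for face in range(1, sides + 1):
                # Adding one more die spreads each total over `sides` new totals.
                nxt[total + face] = nxt.get(total + face, 0.0) + p / sides
        pmf = nxt
    return pmf


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and sd sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def max_gap_to_normal(n_dice, sides=6):
    """Largest absolute difference between the exact pmf and the normal curve."""
    pmf = sum_of_dice_pmf(n_dice, sides)
    mu = n_dice * (sides + 1) / 2
    sigma = math.sqrt(n_dice * (sides ** 2 - 1) / 12)
    return max(abs(p - normal_pdf(x, mu, sigma)) for x, p in pmf.items())


for n in (2, 5, 10):
    print(f"{n:2d} dice: largest gap to the normal curve = {max_gap_to_normal(n):.4f}")
```

The gap shrinks as dice are added, which is the CLT at work on the sum itself.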

World of Stat-craft?

Now that you’ve read through 4000 words on probability theory, you may ask where the damn World of Warcraft content is. The short answer: next blog post. But as a teaser, let’s consider a graph that shows up in your Simulationcraft output:

A DPS distribution generated by Simulationcraft.

When you simulate a character in SimC, you run some number of iterations. Each iteration gives you an average DPS result, which is essentially one result of a random variable. In other words, each iteration is comparable to a single roll of the dice in our example experiment. If we run a simulation for 1000 iterations, that gives us 1000 different data points, from which we can calculate a sample mean (367.7k in this case), a sample standard deviation, and a standard error value.

And all of the same statistics apply here. This plot gives us the “DPS distribution function,” which is equivalent to the triangular distribution in our experiment. The DPS distribution looks Gaussian/normal, but be aware that there’s no reason it has to be.  It generally will look close to normal just because each iteration is the result of a large number of “RNG rolls,” many of which are independent. But some of those RNG rolls are not independent (for example, they may be contingent on the previous die roll succeeding and granting you a specific proc, like Grand Crusader). With certain character setups you can definitely generate DPS distributions that deviate significantly from a normal distribution (skewed heavily to one side, for example).

But again, because of the Central Limit Theorem, we don’t care that much what this DPS distribution function looks like. As long as each iteration is independent, we can use the normal distribution to estimate the accuracy of the sample mean. So we can calculate the standard error and report that as a way of telling the user how confident they should be in the average DPS value of 367.7k DPS.

At the very beginning of this post, I said I was looking into a strange deviation from the expected error. What I was finding was that my observed errors were larger than what Simulationcraft was reporting. Next time, we’ll look a little more closely into how Simulationcraft reports error, and discuss the specifics of that effect – why it was happening, and how we fixed it.

## 5.4.2 Rotation Analysis

In December, I talked about the code I’ve written to automate the testing of Simcraft profiles. In that post, I tackled the two easiest simulations to write: glyphs and talents. In both of those cases, we’re just editing a single line of the .simc file, so it was a fairly simple job of tweaking that line and repeating. Of course, there was the entire superstructure of code surrounding that idea, which is what took far longer than the (relatively) simple logic required to swap out talents and glyphs.

Today I present the results of the other end of the spectrum – one of the most difficult sims to write. Because today we’re going to look at rotations.

If you haven’t read the previous post, I recommend you go back and do so now.  Or at least re-read the “Automating Simcraft” portion of it. I’ll refresh your memory about certain points, but I’m going to assume that you’re familiar with the basics of how this code operates. In short, if you don’t remember that we piece together a .simc file from discrete components (i.e. a player, a gear set, a rotation, a set of glyphs, a set of talents, etc.), then you should probably go re-read that section.

Note that I’ve taken to calling each of these components “blocks” in the rest of this post. That’s what I tend to call them in my head, and it’s faster than typing “component” over and over. Plus, I think it gives a nice visual – sort of like building the .simc file out of a bunch of different distinct Lego pieces.

Rotations Schmotations

You might ask what makes the rotation sim significantly harder than, say, a glyph sim. The short (and woefully incomplete) answer is that it involves changing more than one line of the .simc file we feed to the executable.

I say “woefully incomplete” because that statement encompasses a lot more than just swapping out a single component.  For example, in the glyph simulation, we kept the same player block, gear block, rotation block, and so on, and just swapped out the glyph block. We did that by pre-generating a glyph block for all of the different glyph combinations we were interested in and cycling through them.

On its face, it seems like that same logic could apply to the rotation simulation. We could just generate 100 different rotation blocks that describe the different rotations we’re interested in, and then swap them in and out one by one to get the results. Right?

Wrong. Oh, so wrong…

That might work fine for a really simple rotation simulation where we only consider combinations of basic abilities. For example, we limit ourselves to Crusader Strike, Judgment, Avenger’s Shield, Holy Wrath, Hammer of Wrath, and Consecration. That would be enough to figure out the basic gist of the rotation, for sure.

But it should be obvious that this list is missing a few important abilities. What if we want to include Sacred Shield, or one of our level 90 talents? All of those have to go into the rotation somewhere. And the sim won’t use them unless we’ve talented them. So, first of all, that means we need to swap the talent block out at the same time as the rotation block. And not just that, but we need some way to know which talent block to use when – it’s no good if we use a talent block with Light’s Hammer when we’re testing Execution Sentence rotations. That seems like an obvious and trivial problem to solve, but it’s still an extra moving part we need to consider in a sim that’s already going to be pretty complicated.

Because it’s not just talents we need to worry about, either. Let’s say we want to look at execute-range rotations in particular. We might want to know if Holy Wrath changes priority when Final Wrath is glyphed. But to do that, we need to enable that glyph, or else use it by default. But there may be cases where we don’t want it on, either. So we need to be able to swap glyphs too.

Further, we need to be able to specify conditionals in the action priority list (APL). So that, for example, we can compare

actions+=/holy_wrath

with

actions+=/holy_wrath,if=glyph.final_wrath.enabled&target.health.pct<20

Now, of course, that’s not really a problem in theory, because we could just write each block by hand and take care of all of that. But we might have hundreds of rotations, and the risk of making a small, unnoticed but relevant error in one of them is pretty high when you’re talking about writing that many by hand. Also, if you really expected me to write hundreds of rotation files by hand, you’re kidding yourself.

We’ll still need a good shorthand for identifying rotations in tables anyway, and if you’re going to write a robust shorthand, then you may as well automatically generate the rotation blocks from that shorthand. That gives us the consistency we want (because there will never be an error in “HW” in one file that doesn’t exist everywhere else) and makes tables easy to read. But it adds another complication: now we need to write a translator that goes between shorthand and full SimC file, complete with all of the options and conditionals we might want to use.

You can already see why this snowballed into one of the more complicated sims to write. And it’s not even necessarily the hardest – the AoE one may be more annoying still depending on what exactly we want to calculate!

The Nitty Gritty Details

So, in short, this is how the simulation works. I’ve divided the rotations we care about up into groups (which, in a sad turn of events, I’ve called “blocks” in the code… oops? I’ll be consistent about calling them “groups” here though). Each group has a defined set of talents and glyphs, because for the most part those vary on a group level. So there’s a “Basic” group, an “Execute” group that focuses on Hammer of Wrath and Final Wrath, a “Defensive” group that’s primarily for testing Sacred Shield, and a “Level 90” group that tests all the level 90 talents.

In addition, I have the ability to enable custom talents per rotation. So for example, within the Level 90 group, it will automatically check each rotation to see which level 90 talent it uses and tweak the talent block to enable that talent. It also does this for the Sacred Shield rotations in the Defensive group. I signify this by adding “+custom” to the end of the talent block, which is the flag the code looks for to decide whether it needs to perform this check.
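As a rough illustration of the “+custom” check, here is a minimal Python sketch (not the actual MATLAB code). The group names, talent strings, and glyph lists are the ones printed with each results table in this post. The function name, the dictionary layout, and the digit assigned to each level 90 talent are assumptions made for illustration; the Sacred Shield digit is inferred from the 313232 string reported for the Defensive group.

```python
# Hypothetical group table; values copied from the results output in this post.
GROUPS = {
    "Basic":     {"talents": "312232", "glyphs": "focused_shield/word_of_glory"},
    "Execute":   {"talents": "312232", "glyphs": "focused_shield/word_of_glory/final_wrath"},
    "Defensive": {"talents": "313232", "glyphs": "focused_shield/word_of_glory/final_wrath"},
    "Level 90":  {"talents": "312232+custom", "glyphs": "focused_shield/word_of_glory/final_wrath"},
}

# Assumed mapping: ability shorthand -> (tier index in the talent string, digit).
# The level 90 digits are an assumption, not taken from the post.
CUSTOM_TALENTS = {
    "SS":  (2, "3"),
    "HPr": (5, "1"),
    "LH":  (5, "2"),
    "ES":  (5, "3"),
}

CUSTOM_FLAG = "+custom"


def talents_for(rotation, talent_block):
    """Return the talent string to use for one rotation.

    Without the "+custom" flag the block is returned unchanged. With it,
    each ability in the rotation that requires a talent overrides the
    corresponding digit of the talent string.
    """
    if not talent_block.endswith(CUSTOM_FLAG):
        return talent_block
    digits = list(talent_block[: -len(CUSTOM_FLAG)])
    for step in rotation.split(">"):
        ability = step.split("+")[0]  # drop any conditional suffixes
        if ability in CUSTOM_TALENTS:
            tier, digit = CUSTOM_TALENTS[ability]
            digits[tier] = digit
    return "".join(digits)
```

For example, `talents_for("CSw>J>AS>HW>ES>HoW>Cons", GROUPS["Level 90"]["talents"])` swaps the last digit of the default string to the one assumed for Execution Sentence, while a block without the flag passes through untouched.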

In theory I could do the same thing with glyphs, I suppose, but I found that I didn’t really need to. It wouldn’t be difficult to modify the code to do that in the future if we decide it’s necessary.

The rest of the difficulty was coming up with the abbreviation scheme for abilities and their conditionals. Thinking ahead, I wanted this to be extendable to other classes, so I set it up such that each class can have its own definitions. For a paladin, CS will always mean Crusader Strike, but if we’re simming another class it could translate to something different.

The abilities were fairly easy, since I’ve been using a standard notation for them in the old MATLAB code for years. They are:

Ability Shorthands
Shorthand Ability
CSw Crusader Strike followed by a /wait (see below)
HotR Hammer of the Righteous
J Judgment
AS Avenger's Shield
HW Holy Wrath
HoW Hammer of Wrath
Cons Consecration
SS Sacred Shield
ES Execution Sentence
LH Light's Hammer
HPr Holy Prism
SotR Shield of the Righteous
EF Eternal Flame
WoG Word of Glory

In the earlier code, we used a bracketing technique for options, which was very powerful, but led to really long rotation names. This time around, I’m trying to keep the names fairly compact for display purposes, so I went with a slightly different method. Each option has a shorthand and gets appended to the ability shorthand with a plus sign (‘+’). The options I have enabled at this point are:

Conditional Shorthands
Shorthand Conditional
W# add a /wait after the ability if the cooldown is less than or equal to # seconds
DP buff.divine_purpose.react
DPHP# (buff.divine_purpose.react|holy_power>=#)
ex target.health.pct<=20
FW glyph.final_wrath.enabled&target.health.pct<=20
HP# holy_power>=#
nt !ticking
nF target.debuff.flying.down
SW talent.sanctified_wrath.enabled&buff.avenging_wrath.react
T# active_enemies>=#
R# buff.(ability_string).remains<#

So for example, AS+GC would translate into

actions+=/avengers_shield,if=buff.grand_crusader.up

Not all of these are in use in the data I’ll present today, but they’re all coded and potentially usable. I expect that we’ll add a bunch of action priority lists to the simulation after we’ve analyzed the results in this post. For example, it might be interesting to see if “Cons+nt” has any effect, but it wasn’t high on my list of priorities when I was putting this together so I didn’t include it.
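To make the translation step concrete, here is a minimal Python sketch of a shorthand-to-APL translator. It is an illustration under stated assumptions, not the code in the repository: the function names are made up, the SimC action names are assumed to follow the usual lowercase-with-underscores convention, multiple conditionals are assumed to be joined with “&”, and the output uses the actions+=/ form seen in the finisher lines in this post. The wait option (W#) is left out because it expands into two lines.

```python
import re

# Ability shorthand -> SimC action name (names assumed from SimC's convention).
ABILITIES = {
    "CS": "crusader_strike", "HotR": "hammer_of_the_righteous",
    "J": "judgment", "AS": "avengers_shield", "HW": "holy_wrath",
    "HoW": "hammer_of_wrath", "Cons": "consecration", "SS": "sacred_shield",
    "ES": "execution_sentence", "LH": "lights_hammer", "HPr": "holy_prism",
    "SotR": "shield_of_the_righteous", "EF": "eternal_flame",
    "WoG": "word_of_glory",
}

# Conditionals with no numeric argument, as listed in the table above
# (GC is taken from the AS+GC example).
FIXED = {
    "DP": "buff.divine_purpose.react",
    "ex": "target.health.pct<=20",
    "FW": "glyph.final_wrath.enabled&target.health.pct<=20",
    "nt": "!ticking",
    "nF": "target.debuff.flying.down",
    "SW": "talent.sanctified_wrath.enabled&buff.avenging_wrath.react",
    "GC": "buff.grand_crusader.up",
}

# Conditionals that carry a number (the "#" in the table).
NUMBER = r"(\d+(?:\.\d+)?)"
PARAMETERIZED = [
    (re.compile("DPHP" + NUMBER), "(buff.divine_purpose.react|holy_power>={n})"),
    (re.compile("HP" + NUMBER), "holy_power>={n}"),
    (re.compile("T" + NUMBER), "active_enemies>={n}"),
    (re.compile("R" + NUMBER), "buff.{ability}.remains<{n}"),
]


def translate_condition(code, ability):
    """Turn one conditional shorthand into a SimC expression."""
    if code in FIXED:
        return FIXED[code]
    for pattern, template in PARAMETERIZED:
        match = pattern.fullmatch(code)
        if match:
            return template.format(n=match.group(1), ability=ability)
    raise ValueError(f"unknown conditional shorthand: {code!r}")


def translate_step(step):
    """Turn one step such as 'AS+GC' into a single APL line."""
    short, *codes = step.split("+")
    ability = ABILITIES[short]
    line = "actions+=/" + ability
    if codes:
        line += ",if=" + "&".join(translate_condition(c, ability) for c in codes)
    return line


def translate_rotation(rotation):
    """Turn 'J>AS+GC>HW' into a list of APL lines, in priority order."""
    return [translate_step(step) for step in rotation.split(">")]
```

Keeping the definitions in plain dictionaries is what makes the scheme extendable: another class would only need its own `ABILITIES` table, and an unknown shorthand fails loudly instead of silently producing a broken rotation.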

There’s one special case I want to mention. The “wait” conditional works something like this: CS+W0.35 translates to:

actions+=/crusader_strike
actions+=/wait,sec=cooldown.crusader_strike.remains,if=cooldown.crusader_strike.remains>0&cooldown.crusader_strike.remains<=0.35

As you might expect from the default APL for protection, this almost always nets an increase in holy power generation because it prevents us from doing silly things like CS-X-X-X-CS. That can otherwise happen in situations where one or more of the X’s were spells, so the GCD ends a little before CS becomes available. As a result, we’ll almost always want to follow CS with a wait. Since that comes up a lot, and I didn’t want to type CS+W0.35 all the time in the interest of keeping the rotation abbreviations short and readable, I’ve defined the shorthand “CSw” to implicitly mean “CS+W0.35”.
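The wait expansion could be sketched in Python roughly as follows. The helper names and the alias table are hypothetical, the action name for Hammer of the Righteous is assumed from SimC's naming convention, and only the two-line output mirrors the example above (written in the actions+=/ form used by the finisher lines in this post).

```python
# Shorthand aliases; "CSw" is the one defined in the text.
ALIASES = {"CSw": "CS+W0.35"}

# Only the abilities that are paired with a wait in this post.
ABILITIES = {
    "CS": "crusader_strike",
    "HotR": "hammer_of_the_righteous",  # assumed SimC name
}


def expand_wait(ability, threshold):
    """Return the ability line plus a /wait line that only fires when the
    ability's cooldown has at most `threshold` seconds remaining."""
    cooldown = f"cooldown.{ability}.remains"
    return [
        f"actions+=/{ability}",
        f"actions+=/wait,sec={cooldown},if={cooldown}>0&{cooldown}<={threshold}",
    ]


def expand_step(step):
    """Expand 'CS', 'CS+W0.35' or the alias 'CSw' into APL lines."""
    step = ALIASES.get(step, step)
    short, has_wait, threshold = step.partition("+W")
    ability = ABILITIES[short]
    if not has_wait:
        return [f"actions+=/{ability}"]
    return expand_wait(ability, threshold)
```

Resolving the alias first means “CSw” and “CS+W0.35” always produce identical output, which is the consistency the shorthand is meant to provide.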

As a final note, I want to mention that this simulation is limited to GCD-based abilities. In other words, I’m using the same precombat actions and the same finishers in each rotation. I’m basically bolting the rotations below together with the precombat actions and the following default finisher definitions:

actions+=/eternal_flame,if=talent.eternal_flame.enabled&(buff.eternal_flame.remains<2&buff.bastion_of_glory.react>2&(holy_power>=3|buff.divine_purpose.react))
actions+=/shield_of_the_righteous,if=holy_power>=5|buff.divine_purpose.react|incoming_damage_1500ms>=health.max*0.3

This ensures that the changes we see are purely due to any change in holy power generation or dead time in the rotations themselves. And in any event, since our active mitigation is decoupled from the GCD, it’s not really part of our “rotation” in a strict sense. It’s stuff we use when necessary and available based on the resources, not based on whether they’re more or less important than e.g. CS.  We’ll analyze the finisher options specifically in a later sim in much the same way we do here for the rotation. Luckily, that sim will be a lot easier to write!

As usual, all of the code can be found in the matlabadin repository. This sim uses a lot of files, but the master one that controls it all is:

All of the results can be found in the /io/ directory, along with the results of the glyph and talent simulations. The sims are labeled appropriately with “>” replaced by “_”.

Results

We’ll go through each of the rotation groups one at a time, briefly discussing what makes them unique and why we’ve made the choices we have. They all use the default T16N profile gear set (which includes 4T16) and are pitted against the T16N25 TMI calibration boss. The default talents include Unbreakable Spirit, Eternal Flame, and Divine Purpose unless otherwise specified. Everything else should be provided in the details below.

I’ll note that for all of these simulations, I’ve set the number of iterations to 250k. Yes, that’s a lot, but it’s necessary to get the degree of accuracy we want.

The “DPS Error” that Simulationcraft reports is really the half-width of the 95% confidence interval (CI). In other words, it is 1.96 times the standard error of the mean. To put that another way, we feel that there’s a 95% chance that the actual mean DPS of a particular simulation is within +/- DPS_Error of the mean reported by that simulation. There are some caveats to this statement, insofar as it makes some reasonably good but not air-tight assumptions about the data, but it’s pretty good.
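For reference, the textbook version of that calculation looks like this in Python. It is a sketch of the standard formula (1.96 times the sample standard deviation divided by the square root of the number of iterations), not a copy of SimC's implementation, and the function names are made up.

```python
import math
import statistics


def dps_error(samples, z=1.96):
    """Half-width of the confidence interval for the mean of `samples`.

    z=1.96 corresponds to a 95% interval under the usual normal
    approximation for the sample mean.
    """
    standard_error = statistics.stdev(samples) / math.sqrt(len(samples))
    return z * standard_error


def confidence_interval(samples, z=1.96):
    """Return (low, high) bounds around the sample mean."""
    mean = statistics.mean(samples)
    error = dps_error(samples, z)
    return mean - error, mean + error
```

Because the standard error shrinks with the square root of the iteration count, quadrupling the iterations only halves the reported error, which is why the iteration counts here are so large.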

I’m actually doing a little statistical analysis on SimC results right now to investigate some deviations from this prediction, but that’s enough material for another blog post, so I won’t go into more detail yet. What it means for us, though, is that in practice I’ve found that when you run the sim for a large number of iterations (i.e. 50k or more) the reported confidence interval tends to be a little narrower than the observed confidence interval you get by calculating it from the data.

So for example, at 250k iterations we regularly get a DPS Error of approximately 40. In theory that means we feel pretty confident that the DPS we found is within +/-40 of the true value. In practice, it might be closer to +/- 100 or so.

Why does that matter for us? Well, we want to know if one rotation is better than another in a statistically significant sense. Based on the theoretical estimate, this means that as long as they’re farther apart than 80 DPS, we can trust that the higher-DPS rotation is better. In practice, I think we should expand that bound a bit, at least to 100 DPS, and probably to 200 DPS if we’re going to be generous and assume that there could be other sources of systematic error that we don’t know about. I’ve seen the same rotation differ by up to 300 DPS between two separate runs, so I’m inclined to be a little more generous in my error estimate than SimC is.

And keep in mind that we’re looking at a mean value of almost 400k DPS in these sims. 400 DPS is a change of 0.1%, which is minuscule, and not likely to swing an encounter one way or another. Even if our sims are accurate to that level, that’s right around the point where you prioritize mental bandwidth over DPS gain and choose the rotation that’s simpler to execute. So I’d probably be hesitant to ascribe any real significance to differences that are smaller than 1000 DPS, which is still less than a 1% change.

Basic Rotation Group

This group of rotations is focused on determining the order of operations for our basic abilities, excluding talents and execute range. From this, we determine our “ideal” base rotation, which we then go about tweaking in the other groups.

In this set, we use just two glyphs: Focused Shield and Word of Glory. We could have included Divine Protection, but we want to be able to compare the survivability results to those obtained in later groups which use all three glyph slots on DPS glyphs. Plus, there’s really not a lot to learn from glyphing Divine Protection here. It’s our only feasible survivability glyph and it’s so highly situational that there’s no guarantee we’re using it for a given boss.

In addition to the table, the sim spits out the maximum DPS Error measurement of the group (each rotation is fairly similar in that regard, so it didn’t make sense to include it on the table) and the talents and glyphs used:

Max DPS Error: 41
Talents: 312232
Glyphs: focused_shield/word_of_glory

Basic Rotations
Rotation DPS HPS DTPS TMI Var SotR Wait
CS>J>AS>Cons>HW 373603 160013 160353 6212 2062 71.0% 14.5%
CS>J>AS>HW>Cons 379608 159521 159854 4287 971 71.4% 13.2%
CS+W0.3>J>AS>HW>Cons 373814 157738 158054 533 117 73.2% 12.7%
CSw>J>AS>Cons>HW 368204 157862 158182 460 47 73.1% 13.8%
CSw>J>AS>HW>Cons 373591 157666 157983 427 55 73.2% 12.7%
CSw>J>HW>AS>Cons 372798 157616 157932 410 77 73.3% 12.5%
CSw>HW>J>AS>Cons 359765 161552 161890 626 61 69.6% 15.6%
HW>CSw>J>AS>Cons 363466 161565 161905 806 122 69.6% 14.9%
CSw>AS>J>HW>Cons 373952 158483 158804 451 38 72.5% 12.8%
J>CSw>AS>HW>Cons 368396 162575 162942 90576 83031 68.5% 17.4%
J>AS>CSw>HW>Cons 372886 163157 163529 61525 35811 67.9% 17.0%
AS>J>CSw>HW>Cons 372490 163965 164342 174459 57861 67.2% 17.4%
AS>CSw>J>HW>Cons 378485 159759 160092 1971 365 71.2% 13.0%
HotR+W0.35>J>AS>HW>Cons 371877 159558 159894 6714 5015 71.4% 13.2%
AS+GC>CSw>J>AS>HW>Cons 374633 157958 158289 727 145 72.9% 12.7%
CSw>AS+GC>J>AS>HW>Cons 373734 157767 158086 409 52 73.1% 12.7%
CSw>AS+GC>J>HW>AS>Cons 373243 157700 158021 391 43 73.2% 12.5%
CSw>AS+GC>J>HW>Cons>AS 372838 158084 158405 429 79 72.8% 12.3%

Note that you can sort the table by a particular column by simply clicking on that column’s header. The “Var” column simply reports the measurement of “TMI Error,” which is really more of an uncertainty or variance measure due to the nature of the TMI distribution. Basically, treat that column as the +/- on the measured TMI value. The “Wait” column tells us how much time the sim spends waiting while the GCD is available, either because there’s nothing to cast or because we’re hitting the /wait action.

Before sorting, it’s clear that waiting for CS’s cooldown to come up is a significant survivability gain. The more subtle thing to notice is that it’s actually a slight DPS loss, mostly because CS hits like a limp noodle. There are a number of reasons for that, but the primary one is that CS’s damage increases far more slowly with attack power than the rest of our abilities do. So the higher Vengeance gets, the worse CS is compared to just about everything else we could cast.

A lot of the features here are expected. Dropping CSw below anything else in priority gives you a large survivability loss. It’s worth noting that the “CSw>AS+GC>J>*” rotations near the bottom produce some very low TMI results, but I’m still a bit skeptical of these. The SotR uptime isn’t any higher than the default (CSw>J>AS>HW>Cons), nor are the TMI values lower in a statistically significant sense.

If we sort by DPS, we see that the top rotation is actually the one where we don’t wait for CS’s cooldown, again because CS is such a weak ability at this point. But after that one, we have a bunch of rotations that emphasize AS in various ways. This can be summarized with a pretty simple rule of thumb: “if you don’t care about survivability and need max DPS right now, prioritize AS.”

There are a bunch of rotations where I push Holy Wrath up ahead of CS/J/AS. These aren’t interesting from a survivability point of view, because they uniformly increase our TMI. They also seem to uniformly reduce DPS compared to the standard CSw>J>AS>HW>Cons. We’ll have to revisit these in the execute range group where we have Final Wrath glyphed, which is where we might expect a high HW prioritization to bear fruit.

The HotR rotation I threw in has the same wait as CSw, so it’s directly comparable to a CSw rotation. This is really only relevant in cases where you want to know how much single-target damage you’re sacrificing to cleave to adds now that Weakened Blows is applied by both abilities. Nonetheless, we see it’s about a 1700 DPS loss to use HotR instead of CS. That’s not really a big deal in the grand scheme of things, since we’re talking about less than a 1% difference. CS and HotR both hit so weakly it’s almost irrelevant which you use.

I also want to call attention to the TMI and Var columns again quickly. If you sort by either of these, you’ll see that as TMI goes up, so does the variance. This is one significant drawback of the current TMI formula – because it’s an exponential metric, the variance tends to be rather large when TMI is large. Increasing the number of iterations doesn’t end up helping it much, because it’s just not anything resembling a Gaussian distribution.

The two take-home messages I want to get across here are:

• Unless two TMI values differ by more than the sum of their Var columns, it’s not 100% clear that they’re different in a statistically significant sense. So TMIs of 400 and 500 are roughly identical if their Vars are 100 or more, but you could safely say that a TMI of 400 is better than e.g. a TMI of 1000. We’re looking for order-of-magnitude effects in TMI, because that’s how the metric was constructed.
• This will be fixed in TMI v2.0, which I’m working on currently. More on that soon, maybe next week if I have time to write.
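The first rule of thumb above can be written as a one-line check. This is a minimal Python sketch with a made-up function name; it treats the Var column as a plain +/- on each TMI value, as suggested earlier, and is not a formal significance test.

```python
def clearly_different(tmi_a, var_a, tmi_b, var_b):
    """True if two TMI values differ by more than the sum of their Vars."""
    return abs(tmi_a - tmi_b) > var_a + var_b


# The examples from the text, assuming a Var of 100 on each value.
same_ballpark = clearly_different(400, 100, 500, 100)   # gap 100 vs. 200 allowed
real_gap = clearly_different(400, 100, 1000, 100)       # gap 600 vs. 200 allowed

# Two rows from the Basic Rotations table: TMI 427 (Var 55) vs. TMI 391 (Var 43).
basic_rows = clearly_different(427, 55, 391, 43)        # gap 36 vs. 98 allowed
```

Applied to the table rows, the check agrees with the earlier skepticism about the “CSw>AS+GC>J>*” rotations: their lower TMI sits well inside the combined uncertainty.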

Next, let’s look at the execute rotations.

Execute Rotation Group

In this case, we want to find out how we vary the basic CSw>J>AS>HW>Cons rotation in execute range. That means we need to know where to slot in Hammer of Wrath and what (if anything) to do about Holy Wrath when Final Wrath is glyphed.

Since we can already look at the table above to figure out what happens when Final Wrath isn’t glyphed, this group includes it by default along with Focused Shield and Word of Glory.

Max DPS Error: 41
Talents: 312232
Glyphs: focused_shield/word_of_glory/final_wrath

Execute Rotations
Rotation DPS HPS DTPS TMI Var SotR Wait
CSw>J>AS>HW>Cons>HoW 383714 157727 158045 379 30 73.2% 11.2%
CSw>J>AS>HW>HoW>Cons 384536 157678 157999 436 111 73.2% 11.0%
CSw>J>AS>HoW>HW>Cons 383834 157566 157879 431 121 73.3% 10.9%
CSw>J>HoW>AS>HW>Cons 383380 157868 158196 529 135 73.0% 10.9%
CSw>HoW>J>AS>HW>Cons 383612 157968 158297 519 123 72.9% 10.9%
HoW>CSw>J>AS>HW>Cons 383963 158370 158738 2348 761 72.5% 11.0%
CSw>J>HW+FW>AS>HW>HoW>Cons 384673 157751 158072 397 41 73.2% 11.0%
CSw>J>AS+GC>HW+FW>AS>HW>HoW>Cons 384381 157701 158020 416 59 73.2% 11.0%
CSw>HW+FW>J>AS>HW>HoW>Cons 384846 158089 158426 458 65 72.8% 11.0%
HW+FW>CSw>J>AS>HW>HoW>Cons 385184 158096 158435 632 229 72.8% 11.0%

We can clearly see that Hammer of Wrath should slot in ahead of Consecration but behind Holy Wrath. TMI values vary somewhat depending on how far ahead of other abilities you put it, but note that HoW>J and J>HoW don’t differ much because both are 6-second cooldowns, so they don’t generally clash all that often.

However, if we push HoW ahead of CSw we get a significant TMI increase without realizing any sort of DPS gain compared to slotting it behind Holy Wrath. This is a little different than the results we got with the old MATLAB sims in 5.2, which suggested HoW was a DPS increase at the top of the priority queue. My guess is that the change is due to two factors: switching our L45 talent from SS to EF and losing Grand Crusader procs.

In 5.2, we had fewer empty GCDs because we’d be refreshing Sacred Shield every 30 seconds and using up more Grand Crusader procs, which ended up leaving less room for Hammer of Wrath and other fillers. Now, we have a larger number of empty GCDs to work with, so using Hammer of Wrath doesn’t necessarily push another filler back multiple cycles. And since we have those extra GCDs more regularly, it’s not worth pushing it ahead of the basic CS-J cycle; it’s just more efficient to slot it back in wherever it fits without delaying heavy-hitters like AS and Final-Wrath-Glyphed Holy Wrath (can we just call it “Final Wrath” in execute range?).

Speaking of Final Wrath, it looks like that does hit hard enough to be a DPS increase at the front of the queue, for relatively little cost in TMI. The CSw>J>AS+GC>HW+FW>AS>HW>HoW>Cons rotation is particularly interesting in that it gives you a small (~300) DPS boost without sacrificing any holy power generation. But at the same time that difference is right at (or below) our error threshold, so it’s not clear that’s a realizable gain. By the time we’re looking at 0.1% DPS increases, we’re splitting more hairs than we probably should.

So the conclusion here seems to be that the filler order ought to be HW>HoW>Cons, and during execute range you can prioritize “Final Wrath” as high as you want for a DPS gain, realizing that you’re sacrificing a little survivability if you use it instead of a holy power generator.

Next up: Defensive rotations.

Defensive Rotation Group

While I called this the “Defensive” category, it should really just be called the “Sacred Shield” category, since that’s the only defensive spell in here. And with EF being so strong in Siege of Orgrimmar, it’s also mostly irrelevant. But I’m including it for completeness, and to highlight how strong EF really works out to be.

One oversight here is that this group doesn’t take advantage of the T16 4-piece. The default finisher block has lines for SotR usage and Eternal Flame maintenance, but there’s nothing in there for Word of Glory. As a result, we expect to see a drop in SotR uptime corresponding to losing the 4-piece bonus, as well as an increase in TMI. In the future, I’ll be adding a line like this:

actions+=/word_of_glory,if=talent.eternal_flame.disabled&talent.divine_purpose.enabled&buff.bastion_of_power.react>=3

which should appropriately use WoG whenever we have 3 stacks of the 4T16 buff to fish for extra Divine Purpose procs. For now, just keep in mind that the results in this group aren’t strictly comparable to the ones with EF for survivability purposes. However, they should still be accurate for comparing the SS rotations against one another if you’re hell-bent on running Sacred Shield.

Max DPS Error: 41
Talents: 313232
Glyphs: focused_shield/word_of_glory/final_wrath

Sacred Shield Rotations
Rotation DPS HPS DTPS TMI Var SotR Wait
CSw>J>AS>HW>HoW>Cons>SS 367667 111879 120831 230953 15250 68.8% 3.3%
CSw>J>AS>HW>HoW>SS+R1>Cons 370778 113784 122153 151427 5737 69.9% 8.9%
CSw>J>AS>HW>HoW>SS+R1>Cons>SS 367419 111758 118767 116705 3947 68.8% 3.3%
CSw>J>AS>HW>SS+R1>HoW>Cons>SS 367205 111699 118253 99323 3247 68.8% 3.3%
CSw>J>AS>SS+R1>HW>HoW>Cons 368861 113136 119949 100877 3655 70.0% 9.1%
CSw>J>AS>SS+R1>HW>HoW>Cons>SS 366914 111684 118101 98289 3051 68.8% 3.3%
CSw>J>AS+GC>SS+R1>AS>HW>HoW>Cons>SS 366800 111667 118048 98425 4289 68.8% 3.3%
CSw>J>AS>SS+R2>HW>HoW>Cons>SS 367064 111661 118041 95695 2790 68.8% 3.3%
CSw>J>AS>SS+R3>HW>HoW>Cons>SS 366863 111692 118097 98717 2810 68.8% 3.3%
CSw>J>AS>SS+R4>HW>HoW>Cons>SS 366868 111676 118065 96511 2942 68.8% 3.3%
CSw>J>AS>SS+R5>HW>HoW>Cons>SS 366886 111665 118051 96597 3131 68.8% 3.3%
CSw>J>AS+GC>SS+R1>AS>HW>HoW>Cons 368770 112856 119307 87275 2789 70.0% 9.1%
CSw>J>SS+R1>AS>HW>HoW>Cons 368547 112834 119190 87898 2947 70.0% 9.1%
CSw>SS+R1>J>AS>HW>HoW>Cons 368234 112800 119149 96260 5332 69.7% 9.2%
SS+R1>CSw>J>AS>HW>HoW>Cons 368466 112689 119040 95784 7286 69.7% 9.1%

First, notice that all of the TMI values on this table are in the 100k range, compared to ~400 when we use Eternal Flame. Some of that is the utter dominance of EF over SS at high AP/Vengeance, some of it is because the Shield of the Righteous uptime is lower by a few percent because we’re no longer leveraging the 4-piece bonus. Note that our SotR uptime is a little higher here than the ~64% range we saw in the 4T16 post; we’re averaging around 69% instead.

You might wonder why that is – after all, in that earlier post we said the 4T16 benefit is about 10% SotR uptime, and we’re not taking advantage of the 4-piece in this group of sims. However, when we talent Sacred Shield we also don’t have to maintain Eternal Flame, which means we can spend that holy power on SotR instead, making up about half of the difference. If we were fishing for extra DP procs with Word of Glory, SotR uptime should actually catch up to what we get with Eternal Flame.

In any event, there’s not a lot to say here. TMI obviously improves as we increase the priority of refreshing SS (“SS+R1” means “refresh SS if it’s got less than 1 second left”), but there’s no advantage to putting it ahead of CS or J. I added the CSw>J>AS+GC>SS+R1>AS>HW>HoW>Cons option at the last minute on a hunch, as I suspected that would truly be the low-TMI option after looking at the rest of the results, and it paid off. I’m not entirely sure why this performs better than the identical rotation with an extra “>SS” tacked onto the end though. It’s clear that it’s causing some kind of holy power generation loss based on the SotR uptimes, but I don’t really see how. Something to investigate for later, I guess.

I also want to draw attention to the fact that refreshing it 2 seconds early seems to be the sweet spot. One second puts it off long enough that sometimes you get short gaps due to the GCD. Three seconds or longer tends to be no more effective than two seconds. I don’t know offhand why the SS+R3 version scored so poorly, but again, it could just be RNG given the Var column is nearly 3000.

That’s enough about Sacred Shield, let’s move on to the level 90 talents.

Talent Rotations Group

This is the fun group, where we make use of our “+custom” talent flag. Basically, we’re just swapping the L90 talent appropriately so that we have the ability the rotation calls for.

There are two things we’re checking for in this sim. First, what’s the “default” place to slot each talent into the rotation, ignoring what section of the encounter we’re in? Then, we want to try to fine-tune that by specifying execute rotations to see if there’s an advantage to increasing the priority during execute. We might care about that because once Hammer of Wrath becomes available, we don’t have that many empty GCDs to work with, so we could inadvertently ignore a L90 talent (or at least delay it for a long time) if we slot it behind Hammer of Wrath.

I’ve decided to split this group up into three tables for ease of filtering/sorting.

Max DPS Error: 41
Talents: 312232+custom
Glyphs: focused_shield/word_of_glory/final_wrath

Execution Sentence Rotations
Rotation DPS HPS DTPS TMI Var SotR Wait
CSw>J>AS>HW>HoW>Cons>ES 404748 157849 158167 591 332 73.1% 9.6%
CSw>J>AS>HW>HoW>ES>Cons 406988 157828 158147 454 110 73.1% 9.7%
CSw>J>AS>HW>ES>HoW>Cons 407136 157867 158187 406 46 73.1% 9.7%
CSw>J>AS>ES>HW>HoW>Cons 406419 157757 158077 517 235 73.2% 9.9%
CSw>J>ES>AS>HW>HoW>Cons 406392 157809 158126 433 48 73.1% 9.9%
CSw>ES>J>AS>HW>HoW>Cons 405022 157857 158175 539 117 73.0% 10.0%
ES>CSw>J>AS>HW>HoW>Cons 405441 158113 158437 665 134 72.7% 10.0%
CSw>J>AS>ES+ex>HW>ES>HoW>Cons 407045 157824 158144 432 91 73.1% 9.7%
CSw>J>AS+GC>HW>AS>ES>HoW>Cons 405890 157756 158075 452 125 73.1% 9.7%
CSw>J>AS+GC>HW+FW>AS>HW>ES>HoW>Cons 406777 157838 158157 385 35 73.1% 9.8%

Light's Hammer Rotations
Rotation DPS HPS DTPS TMI Var SotR Wait
CSw>J>AS>HW>HoW>Cons>LH 393512 157962 158280 308 22 73.1% 9.6%
CSw>J>AS>HW>HoW>LH>Cons 393944 158002 158310 329 51 73.1% 9.8%
CSw>J>AS>HW>LH>HoW>Cons 394156 158013 158320 327 80 73.1% 9.8%
CSw>J>AS>LH>HW>HoW>Cons 393962 157918 158221 316 38 73.2% 10.0%
CSw>J>LH>AS>HW>HoW>Cons 394201 158002 158305 324 59 73.1% 10.0%
CSw>LH>J>AS>HW>HoW>Cons 394363 158040 158344 326 50 73.0% 10.0%
LH>CSw>J>AS>HW>HoW>Cons 394678 158286 158595 421 89 72.8% 10.0%
CSw>J>AS>LH+ex>HW>LH>HoW>Cons 393949 158018 158323 279 19 73.1% 9.8%
CSw>J>AS+GC>HW>AS>LH>HoW>Cons 392669 157944 158250 314 34 73.1% 9.7%
CSw>J>AS+GC>HW+FW>AS>HW>LH>HoW>Cons 393873 158029 158335 299 33 73.0% 9.8%

Holy Prism Rotations
Rotation DPS HPS DTPS TMI Var SotR Wait
CSw>J>AS>HW>HoW>Cons>HPr 396882 158186 158505 339 27 72.9% 7.6%
CSw>J>AS>HW>HoW>HPr>Cons 396093 158345 158655 336 37 72.8% 7.8%
CSw>J>AS>HW>HPr>HoW>Cons 395777 158348 158657 363 63 72.7% 7.9%
CSw>J>AS>HPr>HW>HoW>Cons 394603 158170 158476 300 23 72.9% 8.0%
CSw>J>HPr>AS>HW>HoW>Cons 394422 158173 158479 383 65 72.9% 8.0%
CSw>HPr>J>AS>HW>HoW>Cons 393947 158426 158732 343 35 72.7% 8.1%
HPr>CSw>J>AS>HW>HoW>Cons 395775 159554 159877 521 74 71.6% 8.2%
CSw>J>AS>HPr+ex>HW>HPr>HoW>Cons 395674 158362 158669 369 73 72.8% 7.9%
CSw>J>AS+GC>HW>AS>HPr>HoW>Cons 395192 158255 158564 463 115 72.9% 7.7%
CSw>J>AS+GC>HW+FW>AS>HW>HPr>HoW>Cons 395805 158399 158707 388 57 72.7% 7.9%

First, it’s clear that Execution Sentence is our damage option, with Holy Prism trailing it slightly and Light’s Hammer coming in at a close third place.

Execution Sentence seems to be a toss-up with Hammer of Wrath initially, with them neck and neck at around 407k DPS. The ES>HoW version is far enough ahead that I’m willing to believe it’s a little better, though again, we’re talking about differences that are right on the boundary of our error level. Still, level 90 talents are more fun than Hammer of Wrath, and when two rotations come this close in DPS that’s as good a criterion as any. This basically boils down to “ES>HoW” during execute range, since outside of execute the two rotations are identical. In other words, the ES rotation should be:

CSw>J>AS>HW>ES>HoW>Cons

None of the tweaked versions that prioritize things differently in execute range give us a significant improvement over that rotation, so we can rule them out.

Curiously, with LH we would be led to believe that prioritizing it above AS, J, or even CS is a DPS increase. That doesn’t make a lot of sense though: ES hits harder, yet those rotations didn’t exhibit this same behavior. At this point, I’m inclined to believe that something fishy is going on here. I’d call it an outlier, even though it’s several hundred DPS ahead of some of the other options, but it’s not just one rotation. All three of the rotations with LH near the top show the same effect. I’m not sure why that’s happening yet.

That said, if we ignore those three, perhaps on the grounds that it’s a HPG loss, then the same rotation that maximizes Execution Sentence is the best choice here as well. The CSw>J>AS>HW>LH>HoW>Cons rotation is the strongest performer when we’re not putting LH above holy power generators. Though again, the difference between that rotation and LH>HW>HoW or HW>HoW>LH is so small that any of them would be fine.

Holy Prism is an odd duck. It seems to enjoy – no, relish even – hanging out in the last spot. Moving it anywhere higher in the queue is a loss of 800 DPS or more, a large enough gap that we can feel pretty certain it’s statistically significant. Even playing some execute-range tricks with it doesn’t help.

This is actually pretty easy to explain. Consider the following three charts of damage per execute time (DPET) for the rotations CSw>J>AS>HW>HoW>Cons>L90:

DPET for CSw>J>AS>HW>HoW>Cons>ES

DPET for CSw>J>AS>HW>HoW>Cons>LH

DPET for CSw>J>AS>HW>HoW>Cons>HPr

Note that the DPET on Execution Sentence is far higher than any of our other spells, which is why it’s worth prioritizing ahead of HoW and Cons. The only reason it isn’t worth pushing higher in the queue is that we have enough gaps in our rotation that it’s better to use high-damage, low-cooldown spells like Holy Wrath first to minimize empty GCDs.

The DPET on Light’s Hammer is lower than ES, but still above everything else, so most of the same logic applies. Again, with the weird unexplained exceptions that we talked about earlier, which I’m inclined to chalk up to error (either in the LH results or in the ES results – I’m not really sure which!).

But the DPET on Holy Prism is only on par with Hammer of Wrath and Consecration. This is mostly because it doesn’t scale as well with attack power as the other Level 90 options. Or to state that more precisely, the spell’s attack power coefficient is similar to that of Consecration and Hammer of Wrath (all around 0.7ish, I believe), so in the high-Vengeance regime they all do about the same damage. Light’s Hammer and Execution Sentence have significantly larger attack power coefficients, and thus do a lot more damage in that regime.

Now, it’s worth noting that this doesn’t mean Holy Prism is badly balanced. The cooldown of Holy Prism is only 20 seconds, compared to 60 seconds for ES and LH. In theory, you could get 3 casts of Holy Prism off in the same time that you cast one of either of the other level 90 talents. And those three Holy Prisms would total more damage than a single LH cast, though less than a single ES.

But those three Holy Prisms also cost three GCDs to the single GCD used by LH or ES. And that hurts Holy Prism in the “rotation priority” department, because it means we’re far more likely to be pushing something else back, effectively extending the cooldown of another spell and cutting into the DPS gain.

And if you have three spells that do very similar amounts of damage, one with a sub-6-second cooldown (HoW), one with a sub-9-second cooldown (Cons), and one with a 20-second cooldown (HPr), which one do you use first? Generally speaking, the ones with the shortest cooldown, because you usually lose less DPS by pushing the long-cooldown spell back than you do by pushing the short-cooldown spells back. See Wrath-era Retribution theorycrafting for another example of this, where Crusader Strike was prioritized over harder-hitting spells simply because its cooldown was much shorter.

Another reason for the discrepancy in DPET (and DPS) in our L90 talents is that LH and Holy Prism have some utility that they’re being balanced around. Both spells do a good bit of healing. Light’s Hammer works a lot like a raid cooldown, while Holy Prism does less of it but does it up-front. Holy Prism also has availability going for it, in that you can use it more frequently – something that anyone who’s tried to pick up a group of loose adds will recognize as a life saver.

In any event, that was a slight tangent; the take-home message of this last table is that Holy Prism gets to bring up the rear in our priority list.

Conclusions

This was an incredibly long post, and I didn’t even begin to go over the results in the sort of detail that I could given more time. But I’m pretty sure I hit most of the more important things. Still, it’s worth summarizing what we learned, or at least reinforced.

From this data, we’d ideally want to follow this rotation:

CSw>J>AS>HW>ES>LH>HoW>Cons>HPr

with the caveat that I’m obviously assuming you’re taking Eternal Flame instead of Sacred Shield, that you’re not doing any fancy Holy Wrath prioritization during execute range, that you’re glyphing Focused Shield and Glyph of Word of Glory, and that you’re ignoring whichever two talents you don’t currently have chosen.

We know that not waiting up to about a third of a second for CS to come off of cooldown is a notable DPS gain, as is prioritizing Avenger’s Shield. In both of those cases we suffer a noticeable decrease in survivability, however.

We also know that it’s a small DPS gain to push Holy Wrath higher in the queue during execute range if we’re using the Final Wrath glyph, but that this comes with a small survivability loss as well.

We know that if we’re taking Sacred Shield, we want to slot it in somewhere in the filler section to refresh it when the duration is almost up. It should probably be a gain to tack it on to the end of the queue to fill an empty GCD as well, but the data is inconclusive here, so the jury’s still out on that one.

And of course the Level 90 talent results are already incorporated into the rotation given above.

It’s also worth noting what we didn’t check here and to be clear about the limitations of this data set. We haven’t attempted to try any additional L75 talent options, so all we have is Divine Purpose data. Holy Avenger shouldn’t vary things too much, insofar as most of HA’s effect is simply more off-GCD SotR spammage. But it could cause rotations that try to increase DPS by prioritizing something over a holy power generator to fail miserably because each holy power generator is basically adding 2/3 of a SotR in damage during HA. On the other hand, they’re also less effective outside of HA than they would be with Divine Purpose, so who knows! Note also that we’re tanking the boss full-time here, so the effective uptime of Holy Avenger isn’t being considered.

Likewise, we didn’t test how Sanctified Wrath affects things. We’re already fairly sure that it pushes J+SW up ahead of CSw in priority, but we don’t know if it changes filler priority at all. Those are all on the list of things to add for next time.

We’re also simming the most bland encounter possible: solo-tanking Patchwerk forever. There’s no movement, no sudden or predictable damage bursts from a boss special, no significant variation in damage patterns (i.e. it’s a steady stream of melee+DoT damage, not an oscillating pattern of heavy melee followed by heavy magic followed by heavy melee and so on…). Basically none of the things that make real encounters interesting.

So keep that in mind when interpreting the results. I may say “this data suggests X is better than Y,” but I’m always doing that within the context of this particular set of constraints. It’s reasonable to assume that it generalizes fairly well to other situations, but it won’t always, and it’s almost certainly not going to be iron-clad enough to be correct for every encounter.

As usual, a smart tank should be looking for those inconsistencies and adapting their play to the encounter rather than blindly relying on “but Theck said so!”

## 5.4.2 WeakAuras Strings

It’s been a little while since 5.4 was released, and I’ve still been tweaking my WeakAuras here and there as I go. I’ve finally made enough tweaks that I thought it was worth sharing with the class.

Again, the updated Paladin auras can all be grabbed at http://www.sacredduty.net/weakauras-strings/ along with auras for all of the other class/spec combinations I use regularly.

Weakened Blows

The first change is actually the removal of an aura that doesn’t have much meaning anymore. Now that Crusader Strike applies Weakened Blows, there’s no reason to be tracking its uptime. So I’ve removed that aura entirely and shifted the Eternal Flame & Sacred Shield icons over to fill the empty space.

Priority Row Shuffle

I’ve tweaked the order of the spells on the priority row a bit. I’ve been using Holy Prism more frequently lately – or to be more specific, I’ve been swapping that talent more frequently and using all three choices. Nowadays, my sims are telling me that all three of these choices fall above Consecration in priority. And more importantly, I found that I tended to forget about those spells since I had their icons so far off to the right.

So I’ve re-ordered the last few icons on the priority row. I’ve moved Execution Sentence, Light’s Hammer, and Holy Prism to the left and moved Consecration and Sacred Shield to the right.  Now the order looks like:

CS – J – AS – HW – HoW – (ES/LH/HPr) – Cons – SS

Swapping Consecration and Sacred Shield is a last-minute change that I made after recording the videos (and taking the screenshots) shown later in this post, so in those Consecrate appears to be way off to the right. It looks a little cleaner now with that last-minute change though (which is why I decided to make it!).

Tier 16 4-piece Indicator

The first new indicator is one that tells you the status of your Tier 16 4-piece bonus. When you reach 3 stacks of Bastion of Glory, you get a buff called Bastion of Power that makes your next Word of Glory or Eternal Flame free. It’s a very simple matter to track this buff in WeakAuras so you know when it’s available.

The indicator pulses if you have a 5-stack of Bastion of Glory to remind you that it’s at full strength. As per this comment on last week’s blog post, refreshing the buff immediately at 5 stacks of BoG tends to be an ideal strategy in the steady-state (i.e. at constant Vengeance). In practice you’d want to consider your current Vengeance level, of course.

The two new indicators I’ve added in 5.4.2, also showing the adjusted layout now that the Weakened Blows indicator is gone.

Eternal Flame Stoplight

To that end, I’ve added a new indicator. In this comment, Zil asked me if I could write an aura that would tell you whether refreshing Eternal Flame would give you a larger or smaller HoT. So I wrote up this “stoplight” indicator to give us that information.

Every time you cast EF, it calculates the strength of that EF and stores that value (much like the text indicators store the effective HP used, BoG level, haste, and AP you had at the time of the cast). It then calculates the value of a new EF given the current conditions and compares that to the stored value.

If the new value is at least 10% larger than the existing one, the indicator turns green. If it’s not, it stays red. Yes, this is the reverse of how I have the text indicators working, but if that bothers you the colors are easily tweaked on the display tab of each aura. I tend to think of green as “good”: in this case the stoplight means “It would be good to cast EF,” while the text auras are telling me about the status of the existing EF (“Your current EF is good, don’t mess with it”). I suspect most people will only use one or the other anyway.

There’s also a text indicator that shows exactly how much better the new EF will be. It’s literally just showing you (new EF value)/(old EF value) as a percentage, so if it reads 115% (as it does in the image above) it means recasting EF at this point will be a 15% increase in healing throughput. Note that this percentage can get very large in cases where you have a one-HP Eternal Flame active and you’re sitting on 5 stacks of Bastion of Glory.
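As a rough illustration of the comparison the stoplight makes, here is a minimal Python sketch. It is not the actual WeakAuras Lua code; the names `Snapshot` and `stoplight` and the healing values are hypothetical, chosen only to reproduce the 115% example above.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Value of the currently active Eternal Flame, stored at cast time."""
    heal_value: float


def stoplight(stored: Snapshot, new_value: float, threshold: float = 1.10):
    """Return (color, percent) for a prospective recast.

    percent is (new EF value) / (old EF value) as a percentage; the light
    is green only if the new value is at least `threshold` times the old one.
    """
    ratio = new_value / stored.heal_value
    color = "green" if ratio >= threshold else "red"
    return color, round(ratio * 100)


active = Snapshot(heal_value=20000.0)   # made-up number for illustration
print(stoplight(active, 23000.0))       # ('green', 115)
print(stoplight(active, 21000.0))       # ('red', 105)
```

The 10% threshold is a parameter here so that the cutoff is easy to change when experimenting.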

Note that this indicator takes into account everything, as far as I know. It should accurately reflect changes in mastery, haste, Bastion of Glory stacks, spellpower, holy power, crit, and even Avenging Wrath. The only things I’ve omitted are constant factors like the 50% increase from self-casting (which you should always have) and the 5% Seal of Insight healing bonus (which you should probably also always have, since I don’t think many players are switching to Seal of Truth/Righteousness).

The video below shows the indicator during development at 4x the final size to make it easier to see how it works:

It didn’t occur to me to write a Sacred Shield stoplight until writing this post, but I’ll probably put one together in the next few weeks. I’ll toss it on pastebin, update the WeakAuras page with a link to it, and probably tweet about it, but probably won’t give it its own blog post.
Jan 15 2014 edit: SS stoplight added, bundled with the EF stoplight. Also added crit scaling to the EF stoplight.

Aura Group Re-organization

Finally, I’ve had to re-organize the aura groups a little bit. Adding the code to support the EF Stoplight aura caused one of the other groups to get too large for Ace Serializer, which in turn broke importing. So I had to split them up. They’re now organized a little differently. The three big groups haven’t changed:

Those aura groups are all independent and work perfectly well all by themselves. You can import any combination of those and they should work seamlessly.

All of the auras that give you specific information about Vengeance, Sacred Shield, and Eternal Flame now have a dependency: the “Vengeance/SS/EF Helper Auras” group. This group contains the code that saves a snapshot of your stats when you cast EF or SS, which is why it’s required for the other aura sets to work. They all perform calculations on that information to determine what to display, so without it, they don’t work.

Vengeance/SS/EF Helper Auras (required for the three sets below)

And finally, the auras that display EF and SS information; again, none of these work without the aura group linked above:

(Technically speaking, the simple Vengeance text indicator doesn’t require the helper auras, so if you’re a non-paladin tank class and want that, you can just grab the “Vengeance/SS/EF text indicators” group and delete everything with EF/SS in the title and that Vengeance indicator should still work).

Final Product

And after all of that, here’s how it looks in practice on a target dummy:

If you want to see the Vengeance Bar indicators at work, check out the old 5.4 video on the WeakAuras Strings page. And as always, that page contains the auras for all of the other classes I play or have played. If you have a question about what addon I’m using to create a certain UI element, check out the UI Construction and Key Bindings post, which should still be mostly accurate. If that doesn’t answer your question, feel free to ask in the comments.

Other Stuff

A handful of quick comments regarding Simulationcraft stuff before I go:

• There’s a bug with Execution Sentence in Simcraft at the moment (in 542-1 at least, and several earlier builds). For some reason it’s not ramping up the damage of each tick appropriately for protection, even though it works perfectly for retribution. This is a non-trivial issue caused by some piece of code deep in the core, and at this point I couldn’t even give you a completely satisfactory answer for why it happens. But top men (meaning: people more competent than I) are on it. Top men, I say. Hopefully should be fixed for next build.
• There’s also a small bug in the last couple revisions that causes Eternal Flame to be slightly undervalued. I put some code in there to handle the hotfix applied in September (when EF’s self-healing bonus was nerfed from 100% to 50%) and forgot to take it out once the spell database was updated to include that information. So EF was only getting a 25% bonus from being self-cast rather than a 50% bonus for a while, thus being undervalued by about 17%. Oops. Bad Theck. This will be fixed in version 542-2.
• I’m almost done with the MATLAB automation code that runs the rotation simulation. This turned out to be a much larger and more annoying project than I expected (and I already expected it to be large and annoying). It’s arguably the most complicated of all of the sims because I had to allow the possibility of using different glyphs and talents for each rotation. Luckily, I should be able to finish it this week and have it ready for a blog post next week. Also luckily, the rest of the automation sims should be far easier to code than this one was.
• Once those are done, I have the fun job of deciding how to make them translatable to other classes (if at all). I have to see if any of this code runs in Octave or FreeMat (unlikely, I use a lot of fancy structure and cell stuff), and if not, decide whether to translate all of this to another language so that other theorycrafters/players can contribute and use the code. I could also entertain the possibility of integrating some of these features into SimC itself in the long run (ex: a talent simulation would be pretty simple, I think), but that’s something I’ve yet to discuss with the other SimC devs.
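
For the curious, the “about 17%” figure in the Eternal Flame bullet above is just the ratio of the two self-cast multipliers. A quick sketch of that arithmetic (the function name is mine, not anything in SimC):

```python
def undervaluation(applied_bonus: float, intended_bonus: float) -> float:
    """Fraction by which a heal is undervalued when the wrong bonus is applied."""
    return 1.0 - (1.0 + applied_bonus) / (1.0 + intended_bonus)


# The sim applied a 25% self-cast bonus where 50% was intended.
print(f"EF undervalued by {undervaluation(0.25, 0.50):.1%}")   # EF undervalued by 16.7%
```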

That’s it for today.

## Itemization Value of 4T16

Last week, Fouton from Icy Veins asked me whether I had tried to determine an “ilvl value” for the tier 16 4-piece set bonus. Stated another way, the problem he was trying to solve was twofold:

1. Is it worth using lower-ilvl tier pieces instead of non-set pieces just for the 4-set bonus?
2. If so, how much lower? Is it worth using LFR tier instead of heroic warforged non-set?

Unfortunately, I didn’t have an answer for him. I knew the 4-piece was powerful, of course. There was no question that using tier pieces over warforged loot from the same difficulty level was a survivability gain. But I had never really looked into whether it would make sense to use much lower-ilvl tier instead of warforged gear.

I was unconsciously assuming that if you had access to warforged loot from e.g. heroic, then you also had access to off-set gear from that same difficulty mode, so the most you would care about is a 6-ilvl difference. But especially for guilds progressing through normal, or guilds at the mercy of the personal loot system of LFR/Flex modes, that’s not necessarily a good assumption. Surely there are cases where a player has an LFR tier chest or helm and warforged normal-mode off-set from a different boss, and wants to know what to wear?

So I threw together a few quick profiles in Simulationcraft to test this.

Setup

As a control group, we’ll just use the T16 normal-mode protection paladin profile. This uses four pieces of normal-mode T16 (head, shoulders, chest, and gloves) with non-warforged Legplates of Unthinking Strife as the one off-set piece. Note that none of the gear in this profile has valor upgrades applied. The stat breakdown is given below:

T16N Stats

| Stat | Amount |
|---|---|
| Strength | 19540 |
| Stamina | 47990 |
| Expertise Rating | 5107 |
| Hit Rating | 2607 |
| Crit Rating | 1112 |
| Haste Rating | 15677 |
| Mastery Rating | 7602 |
| Armor | 60112 |
| Dodge Rating | 180 |
| Parry Rating | 1526 |

The rest of the setup is pretty much what you’d expect. Talents are Eternal Flame, Unbreakable Spirit, Divine Purpose, and Light’s Hammer, glyphs are Focused Shield, Alabaster Shield, and Divine Protection.

I then worked up four different variant gear sets to compare. The first is a set where we downgrade two of our tier pieces to LFR level. We choose the chest and the shoulders for this, since the tier helm and gloves both have haste on them.  Since both chest and shoulders are expertise/mastery pieces with expertise reforged into haste, we lose a chunk of those secondary stats as well as some strength, stamina, and armor.

Since we don’t really want to deal with the hassle of reforging each gear set to cap expertise, we cheat a little bit by adding a shirt to the gear set that will put us over the cap. While this adds a little ambiguity to our results, it should be a larger boon to the non-set arrangements than the tier sets.

After doing all of that, our second gear set looks like this:

T16N-LFR Stats

| Stat | Amount | Diff |
|---|---|---|
| Strength | 18786 | -754 |
| Stamina | 46506 | -1484 |
| Expertise Rating | 6345 | N/A |
| Hit Rating | 2607 | 0 |
| Crit Rating | 1112 | 0 |
| Haste Rating | 15503 | -174 |
| Mastery Rating | 7065 | -537 |
| Armor | 59329 | -783 |
| Dodge Rating | 180 | 0 |
| Parry Rating | 1526 | 0 |
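
The “Diff” column here, and in the tables that follow, is simply each stat minus the corresponding T16N control value. A small sketch of that bookkeeping, using a handful of stats copied from the two tables above (the dictionary layout and the `diff` helper are just for illustration):

```python
# Stat values copied from the T16N and T16N-LFR tables
T16N = {"Strength": 19540, "Stamina": 47990, "Haste Rating": 15677,
        "Mastery Rating": 7602, "Armor": 60112}
T16N_LFR = {"Strength": 18786, "Stamina": 46506, "Haste Rating": 15503,
            "Mastery Rating": 7065, "Armor": 59329}


def diff(variant: dict, control: dict) -> dict:
    """Per-stat difference of a variant gear set relative to the control set."""
    return {stat: variant[stat] - control[stat] for stat in control}


for stat, delta in diff(T16N_LFR, T16N).items():
    print(f"{stat:15s} {delta:+d}")
```

Expertise (and hit, in the later sets) is left out on purpose: the fake shirt makes those differences meaningless, which is why the tables list them as N/A.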

For the next set, we replace the chest and shoulders with normal-mode off-set pieces. In each case we’ve gone for maximizing haste, so we’ve chosen Chestplate of Congealed Corrosion and Darkfallen Shoulderplates. In both cases we’ve used the warforged (ilvl 559) version and applied two valor upgrades for a net ilvl of 567. Since we’re using a hacked shirt with 2500 expertise on it, we’ve chosen not to reforge the shoulders and have used a crit->mastery reforge on the chest. This gives us the maximum bang for our buck since none of that extra itemization has to go into expertise.

The stats for that gear set look like this (note that “Diff” is still in reference to T16N):

T16N-WF Stats

| Stat | Amount | Diff |
|---|---|---|
| Strength | 20045 | 505 |
| Stamina | 48630 | 640 |
| Expertise Rating | 5896 | N/A |
| Hit Rating | 2607 | 0 |
| Crit Rating | 1827 | 715 |
| Haste Rating | 18053 | 2376 |
| Mastery Rating | 6821 | -781 |
| Armor | 60551 | 439 |
| Dodge Rating | 180 | 0 |
| Parry Rating | 1526 | 0 |

The next set takes the previous one to the extreme and uses the heroic warforged versions of both chest and shoulders.

T16N-HWF Stats

| Stat | Amount | Diff |
|---|---|---|
| Strength | 20576 | 1036 |
| Stamina | 49676 | 1686 |
| Expertise Rating | 5896 | N/A |
| Hit Rating | 2607 | 0 |
| Crit Rating | 1928 | 816 |
| Haste Rating | 18420 | 2743 |
| Mastery Rating | 7044 | -558 |
| Armor | 60958 | 846 |
| Dodge Rating | 180 | 0 |
| Parry Rating | 1526 | 0 |

In our final two gear sets, we go to the other extreme: what if we force the player to use four or all five LFR tier pieces, including the severely sub-optimal dodge/mastery legs? We’ll be kind and reforge the dodge on those legs to haste, and continue to compensate for expertise and hit caps by using a fake shirt.

T16N-4LFR Stats

| Stat | Amount | Diff |
|---|---|---|
| Strength | 18032 | -1508 |
| Stamina | 45021 | -2969 |
| Expertise Rating | 6182 | N/A |
| Hit Rating | 3992 | N/A |
| Crit Rating | 1112 | 0 |
| Haste Rating | 14971 | -706 |
| Mastery Rating | 7065 | -537 |
| Armor | 58686 | -1426 |
| Dodge Rating | 180 | 0 |
| Parry Rating | 1353 | -173 |

T16N-5LFR Stats

| Stat | Amount | Diff |
|---|---|---|
| Strength | 17680 | -1860 |
| Stamina | 44406 | -3584 |
| Expertise Rating | 6182 | N/A |
| Hit Rating | 3500 | N/A |
| Crit Rating | 1112 | 0 |
| Haste Rating | 13789 | -1888 |
| Mastery Rating | 7262 | -340 |
| Armor | 58294 | -1818 |
| Dodge Rating | 821 | 641 |
| Parry Rating | 1353 | -173 |

We take all six of these gear sets and run them through a 50k-iteration simulation against the T16N25 TMI boss. Anything not explicitly mentioned is identical to the defaults in the T16N profile.

Results

Here’s what we get out the other side:

And summarizing the important bits in table format:

Results

| Gear | TMI | SotR Uptime | DPS | DTPS | HPS |
|---|---|---|---|---|---|
| T16N | 230.5 | 73.25% | 380k | 149834 | 149540 |
| T16N-LFR | 967.0 | 72.82% | 377k | 153704 | 153381 |
| T16N-WF | 4125.1 | 64.32% | 386k | 160163 | 159818 |
| T16N-HWF | 1705.0 | 64.90% | 390k | 157680 | 157362 |
| T16N-4LFR | 1627.3 | 71.93% | 371k | 156588 | 156240 |
| T16N-5LFR | 3457.7 | 70.25% | 363k | 157307 | 156937 |

It should be immediately apparent from the table that the T16N gear set performs the best for survivability. It has the lowest TMI by a large margin and the highest SotR uptime.

Using normal warforged off-set pieces (T16N-WF) may be a gain of 2376 haste, but you actually lose about 9% SotR uptime, which means losing the 4-piece is costing you over 10% SotR uptime all by itself. And of course, smoothness (as measured by TMI) suffers greatly; the TMI is about 20 times higher, which means the spikes are roughly 27% larger on average.

Upgrading those off-set pieces to heroic warforged (T16N-HWF) pieces cuts your losses somewhat, but still gives significantly worse results than the control set. It’s not a large increase in haste or SotR uptime over the normal warforged configuration, but the extra stamina drops the TMI to around 1700, still about 18% larger spikes than T16N.

The T16N-LFR gear set, on the other hand, outperforms both of the off-set configurations. The TMI is only about 4 times worse than T16N, corresponding to a 13% increase in spike size, but the SotR uptime isn’t that much lower. So there’s no question that using 2 pieces of LFR tier (chest and shoulders) to get the 4-piece bonus gives superior survivability to using two well-itemized heroic warforged items in those slots to get extra haste.

If you instead force the use of four or five LFR tier pieces, the situation gets worse. That’s a significant loss of haste and stamina, so the TMI is predictably much higher. 4LFR is roughly equivalent to the T16N-HWF set in TMI, making up for the significant stamina reduction with the higher SotR uptime of the 4-piece bonus. It’s solidly ahead of the T16N-WF gear set in both categories.

5LFR is still better than the WF set that uses normal-mode warforged off-set, both in terms of TMI and SotR uptime. 5LFR gives higher SotR uptime than the HWF set, but it trails in TMI thanks to the extra stamina and secondary stats (~5k haste) of the heroic warforged gear. That said, I don’t think this situation will be very common – players that have access to heroic warforged off-set should rarely need to resort to LFR pieces to complete their tier set.
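
If you want to reproduce the spike-size percentages quoted above, they are consistent with a logarithmic TMI scale in which every factor of 3 in TMI corresponds to spikes about 10% larger. That scaling isn’t spelled out in this post, so treat it as an assumption in the sketch below; the TMI and uptime values are copied from the results table, and the helper name is made up.

```python
import math

# (TMI, SotR uptime in %) copied from the results table
RESULTS = {
    "T16N":      (230.5, 73.25),
    "T16N-LFR":  (967.0, 72.82),
    "T16N-WF":   (4125.1, 64.32),
    "T16N-HWF":  (1705.0, 64.90),
    "T16N-4LFR": (1627.3, 71.93),
    "T16N-5LFR": (3457.7, 70.25),
}


def spike_increase_pct(tmi: float, control_tmi: float) -> float:
    """Assumed scaling: each factor of 3 in TMI means spikes ~10% larger."""
    return 10.0 * math.log(tmi / control_tmi, 3)


control_tmi, control_uptime = RESULTS["T16N"]
for name, (tmi, uptime) in RESULTS.items():
    print(f"{name:10s} TMI x{tmi / control_tmi:5.1f}  "
          f"spikes +{spike_increase_pct(tmi, control_tmi):4.1f}%  "
          f"SotR uptime {uptime - control_uptime:+.2f}")
```

Under that assumption the LFR and HWF sets come out at roughly 13% and 18%, matching the text. The exact table ratio for T16N-WF is closer to 18x than 20x, which gives about 26%; rounding the ratio up to 20x yields the 27% quoted above.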

There are two other things I want to point out about this data. Note that the higher-ilvl sets also convey slightly higher DPS, which is something to consider. The difference isn’t large (less than 3%), but on a serious DPS check that might be worthwhile.

Also note that all of these results assume you’re using Eternal Flame and Divine Purpose. If you’re talenting Sacred Shield, then you can still game this effect with free Word of Glory casts to fish for more Divine Purpose procs, but the benefit will be reduced somewhat. And of course, if you’re not using Divine Purpose then the 4-piece bonus won’t help your SotR uptime at all, though it will still make you more survivable by virtue of removing the opportunity cost of having to heal yourself with Word of Glory.

Conclusions

I’m hesitant to assign an equivalent ilvl value to the 4-piece bonus for a few reasons. The first of which is something most people don’t think about: not all ilvls are created equal. The head, chest, and leg slots give you more stats per ilvl than the shoulder and glove slots do, so the exact ilvl value will depend on the particular slots in which you’re making the sacrifice. In addition, it will depend a bit on which off-set gear you have; we’ve only looked at two specific choices (shoulders/chest), so we’d get a different answer if we considered the head, glove, or leg slots.

However, it’s clear that under the right conditions the tier bonus is stronger than trading up 52 ilvls in two slots (the difference between T16N-LFR and T16N-HWF). We also know that it’s roughly equivalent to trading up 52 ilvls in shoulder/chest and gaining 25 ilvls in head/gloves (though in this case, with equivalent tier rather than off-set).

Beyond that we’d have to guess a little, or run more sims where we compare the tier sets to other sets that use only off-set pieces of much higher quality. That introduces the benefit of the 2-piece bonus as well, though that’s probably a relatively small effect. It’s clear that four heroic warforged off-set pieces would beat out four or more LFR tier pieces based on the data we already have. It seems unlikely that a set with four heroic warforged off-set would be able to compete with the T16N set though.

The take-home message here is that the 4-piece can be really, really strong if used properly, and it’s worth resisting the temptation of even significantly-higher-ilvl gear to keep it. In all but the most extreme cases, such as trading multiple LFR tier pieces for multiple heroic warforged off-set pieces, keeping the 4-piece is going to be the better call.

Again, that comes with some caveats: it assumes you’re talenting Eternal Flame and Divine Purpose. If you swap from Divine Purpose to Holy Avenger for an encounter, then the benefit is reduced (though not eliminated – it still makes EF easier to maintain); if you don’t use either DP or EF, then the benefit is smaller still, and depends on how often and effectively you use WoG as an emergency heal.

## A Letter to Celesty Claus

Every Winter Veil, children of both factions write letters to Greatfather Winter and ask for toys and games. In the meantime, their parents are writing letters and saying prayers to a completely different deity: Celesty Claus, the great celestial dragon that maintains the cosmic (class) balance. Legends say that he flies through the sky on Winter Veil Eve showering the world with nerfs and buffs, and the occasional meteor by accident (one of the inherent downsides of automated shooting star delivery systems).

A rare picture of the elusive Celesty Claus.

Classes that were good that year are happy to wake up on Winter Veil morning to find buffs in their stockings. However, classes that were bad check their stockings with trepidation, because they know they’re only likely to receive nerfs.

The rest of the year is usually spent bickering about who got the best loot from Celesty Claus and why everyone else needs to be nerfed because they’re clearly overpowered in PvP. And asking for ponies.

Of course, he never brings ponies, because he’s heartless. I don’t mean that in a derogatory way, but in a literal, anatomical way. Dude’s made of stars, he’s powered by fusion reactions, he doesn’t have a need for meat and sinew. How many ponies do you know that have survived the heat – not to mention radiation exposure – of a body made of stars? So wishing for a pony is pretty stupid unless you want char-broiled irradiated pony.

This is my letter to Celesty Claus for this year, specifically for protection paladins.

Dear Celesty Claus,

I know you’re a busy man… dragon… spectral titan construct… thing. So I’ll dispense with the milk and cookies and get right to the point. Which is asking you for stuff.

1. Please bring me a version of Holy Wrath that doesn’t have the damage-splitting effect. I get the original goal – a long time ago in a continent far away it was a neat way to give Retribution AoE damage that wasn’t “free” without adding another spell to their arsenal. But this is 2014, Retribution doesn’t even have Holy Wrath anymore. It’s ours now, and really should be designed around our needs.

And right now, we need snap aggro. We’re already strong on up to three targets thanks to Avenger’s Shield.  And our sustained aggro on large groups of mobs is also fine thanks to Consecrate. But the difficulty is picking up aggro on groups of 5+ mobs so that our sustained aggro can do its thing. On large groups, Holy Wrath hits weakly enough that it can’t compete with things like Dizzying Haze and Thunder Clap.

I realize that removing the damage splitting effect is a buff to Holy Wrath, and a buff to our sustained AoE DPS/aggro as well. I’m happy to accept a nerf to Consecration to balance out sustained DPS to make Holy Wrath a more useful spell.

2. Please bring us equitable talent choices on our level 45 tier. Eternal Flame is extremely strong even without our tier 16 four-piece set bonus. It really needs to be nerfed a little more in order for Sacred Shield to be a competitive option.

Likewise, Selfless Healer is in a pitiable state for protection. An instant Flash of Light, while nice, still costs a GCD, doesn’t heal for as much as a full-strength Word of Glory with 5 stacks of Bastion of Glory, and doesn’t come with the fringe benefits of Eternal Flame. Please give it some love so that somewhere, some protection paladin will feel like it’s worth taking.

If Selfless Healer could allow Flash of Light to be cast off of the GCD for protection, that would help a lot. But it also needs to heal for a lot more to make up for the fact that it doesn’t give you the long-term smoothness of Eternal Flame or Sacred Shield. Those two talents prevent spikes before they start by giving you predictable healing or absorption at regular intervals. For Selfless Healer to be able to compete with those two proactive talents, it has to be a very effective reactive choice.

It should really gain the full increase from Bastion of Glory so that the talent remains competitive as we stack mastery. Ideally, a Flash of Light cast with 3 stacks of Selfless Healer and 5 stacks of Bastion of Glory should heal for quite a bit more than a Word of Glory with 5 stacks of Bastion so that it’s your first go-to reactive tool. It’s trying to compete with two strong “over-time” effects, so it should condense the raw healing or absorption of those effects into a single huge shot. If it doesn’t heal for 80% of your health, it’s not really going to be competitive with an Eternal Flame that heals for 60%-70% and gives you several times your health over 30 seconds.

3. Please bring me a version of Consecrate that benefits from haste. As I’m sure you know, we paladins love haste, almost to the point of irrationality. And while Sanctity of Battle helpfully reduces Consecrate’s cooldown as we stack haste, it doesn’t change its tick interval. It ticks at fixed one-second intervals no matter what your haste level is.

The problem that arises here is that when we’re at high levels of haste, we can be in a position where we re-cast Consecrate before the previous one is done ticking. Since we can’t have two Consecrates on the ground, we end up clipping the earlier cast and losing ticks, reducing Consecrate’s damage per cast.
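
A toy model makes the clipping concrete. The sketch below assumes a 9-second base cooldown and a 9-second duration for Consecrate, a cooldown that scales as base/(1 + haste), one tick at the end of each full second, and a recast the instant the cooldown is up. Those numbers and the function names are assumptions for illustration, not values taken from the game data.

```python
import math

BASE_COOLDOWN = 9.0   # seconds; assumed base cooldown
DURATION = 9.0        # seconds; assumed duration of the ground effect
TICK_INTERVAL = 1.0   # seconds; fixed regardless of haste


def ticks_per_cast(haste: float) -> int:
    """Ticks one cast delivers before it expires or is replaced by a recast.

    haste is a fraction, e.g. 0.30 for 30% haste.
    """
    cooldown = BASE_COOLDOWN / (1.0 + haste)
    time_on_ground = min(DURATION, cooldown)
    return math.floor(time_on_ground / TICK_INTERVAL)


def ticks_per_second(haste: float) -> float:
    """Average tick rate when recasting on cooldown."""
    cooldown = BASE_COOLDOWN / (1.0 + haste)
    return ticks_per_cast(haste) / cooldown


for haste in (0.0, 0.15, 0.30, 0.50):
    print(f"haste {haste:4.0%}: {ticks_per_cast(haste)} ticks per cast, "
          f"{ticks_per_second(haste):.2f} ticks per second")
```

In this model the tick rate can never exceed one per second, so extra haste only shortens each cast’s time on the ground instead of adding damage.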

In a single-target situation, it’s fine for Consecrate to be our lower-DPS filler that remains a low priority. However, in AoE situations it is a much higher priority. That reduces the effect of haste on our many-target sustained threat. It’s almost like the spell suffers from diminishing returns with respect to haste.

More importantly, it makes it trickier to use properly for novice paladins. You lose DPS if you recast it early, but you lose more DPS by bumping it lower in priority during AoE situations. The default unit frame even shows a little timer for it, which could be misleading for a novice. I’d just like to see it work more seamlessly with Sanctity of Battle so that it feels less awkward.

4. Please make seals interesting again. I remember losing auras. It was sad to lose something iconic, but at the same time they had devolved into a “set it and forget it” mechanic that didn’t add a lot of fun game play. If it’s something I won’t change for hours and has a minimal effect on my experience, it’s probably not worth keeping.

The problem is that seals feel very much the same way when playing protection. Seal of Truth has been neutered to the point that the DPS increase is negligible. Seal of Righteousness is similarly weak. And most importantly, Seal of Insight is such a strong survivability component that it is almost never worth giving up for either of the other two. You could remove Seal of Truth and Seal of Righteousness from protection and most tankadins wouldn’t even notice.

But the idea behind the Warlords of Draenor talent “Seal of Faith” is interesting. We would trade a bunch of damage output for healing output. Of course, it doesn’t make a whole lot of sense right now, because we don’t have the supporting tools to make that useful. But if we had a more extensive toolkit of healing spells, I could imagine using that talent to help my raid survive heavy raid damage phases.

I don’t think I’ll ever take that talent, because having Holy Shield back is just too cool, but it’s the thought that counts.

And in this case, I’d love to see all seals work on this basic principle of having a more significant effect on your play. Seal of Insight could be the default “tanking” seal that gives you a big chunk of survivability by increasing armor and healing throughput. Seals of Truth and Righteousness could sacrifice a lot of that self-healing to grant other benefits that are primarily useful while not tanking, much like Seal of Faith sacrifices damage for (possibly) more raid-healing or off-healing capability.

The one fear is that being able to swap between highly disparate modes could cause tank imbalances. We’ve seen this before, where one tank was able to switch from high damage output to high survivability by toggling stances, and it caused plenty of problems. It’s really something that all tanks need to be able to do in similar capacities to be balanced.

But the alternative is to just redesign or eliminate seals. Seal twisting just isn’t very fun for the same reason most retribution paladins dislike Inquisition.  Spending resources now for a zero-damage GCD feels bad, even if the math says it’s an overall DPS increase. And for protection, the damage increase is rarely, if ever, worth the large survivability sacrifice of dropping Seal of Insight. If seals aren’t getting redesigned, I’d rather just see each spec get one seal: Seal of Insight for protection and holy, Seal of Truth for retribution.

If you really want to go a little radical with the redesign option, give us one “passive” seal and make the others active abilities that operate like cooldowns. Seal of Righteousness could replace the active seal, granting its usual effect for 15-20 seconds on a one-minute cooldown, and then automatically swap back to the “default” seal after the effect has ended. That would give us the ability to actually use Seal of Righteousness for a temporary AoE damage boost without costing us two GCDs.

5. Please bring us an end to the raid cooldown arms race. While it’s nice to be able to contribute something to the raid group, the sheer number of raid cooldowns being tossed around is getting absurd. Many encounters are being designed around rotating raid cooldowns to survive. While there’s certainly some level of coordination involved in that, I think it makes the game less fun for healers. It also leads to class stacking on encounters where those cooldowns are not equitable.

I feel that raid cooldowns should be limited to one role, and that role should probably be healers. In 20-man mythic, the number of healers should be more stable than it is in current 25-man heroics. While the number of tanks will be stable as well, the temptation to sacrifice a little DPS for another raid cooldown would be strong.  Sacrificing an entire player worth of DPS for a raid cooldown is much more punishing and also more strategic, since you would do that on a fight where you presumably want more healing to begin with.

Raid cooldowns should, in my mind, be a finite resource that you have to use intelligently and carefully. Precision tools to deal with only the most difficult situations. Rotating cooldowns to trivialize an entire 30-60 second period of an encounter just feels cheesy to me, as is having enough Devotion Auras to throw at every single instance of a boss’s raid-wide damage ability.

Yes, I will be sad to give up Devotion Aura. But I will be happier with raiding as a whole, so it’s a sacrifice I’m willing to make.

Sincerely,
Theck

P.S. Sorry about killing your cousin Elegon every week for the past six months or so. He was… um… corrupted by the Mogu or something, so it was justified. Each week. Really. On the bright side, he dropped this great mount that looks just like you!

Happy Holidays!

Happy Holidays from everyone here at Sacred Duty. See you next year!

## MATLAB Automation Code

So it’s finally time to unveil a project I’ve been working on intermittently for the last three months.  If you recall, in the past I maintained a suite of MATLAB DPS simulations that attempted to determine optimal rotations, stats, glyphs, talents, weapons, and so on.  When the Finite State Machine (FSM) rotation modeler screeched to a grinding halt due to a combination of haste effects and long cooldowns, that suite of simulations was put on hold while I searched for a new (and faster) way to run simulations.

And as I’ve mentioned before, Simulationcraft was the solution I settled on.  It had the speed and accuracy I needed, and I wouldn’t have to do all the work myself thanks to the extensive set of contributors.  It also held the promise of unifying my DPS and survivability simulations into one simulation package rather than maintaining separate code for each.  And it even has built-in stat weight generation, so I wouldn’t need to replicate that functionality of the old sims.

I spent a good chunk of the summer and early fall getting Simcraft’s paladin module up to date and inventing and programming a new tanking metric.  But that was only the first step.  I now had the simulation back-end to do the heavy lifting, but I still needed some way to do batch processing.  To re-create the glyph simulation, for example, I needed code that would run Simulationcraft over and over for a bunch of glyph configurations and analyze the results.

There are a number of ways I could go about doing that.  I could write simple batch files in DOS, or more realistically in another language like Perl (which I’d then have to learn, since I don’t actually know Perl).  But what I really want to put together are giant tables of data and graphs.  And there’s one language I know that’s exceptionally good at handling giant tables of data and graphs.  MATLAB.

There was a bunch of grunt work involved, like writing functions that handled all sorts of mundane tasks: writing and reading strings to and from text files, regular expressions to pull the data I wanted out of Simcraft’s text output files, code to do simple tasks like making sure the path information was correct, code to automatically handle the caching and regeneration of results when I update to a new version of Simcraft, and code to output data into a table format that I can copy/paste directly into the blog.  None of that was very interesting, even though it was probably 90% of the work involved.

The interesting part, which is what I’m going to write about today, is how the code works and the output it produces.

Automating Simcraft

Simcraft operates by reading in an input file containing all of the relevant information about a character.  What we want to do is basically tweak portions of that input file to see what changes in the result.

There are a few different ways we could go about that.  For example, to test glyphs we could just have a “default_glyph_simulation_character.simc” file and edit that file over and over to change the glyph setup.  That wouldn’t be terribly hard to code, but it has a few downsides.  The main one is that it can be an issue for caching, which I’ll discuss a little later.

Instead, I went with a very versatile setup where I modularize the input file.   In other words, I split the .simc file into component parts: a player section, a glyphs section, a talent section, a gear section, a rotation section, and a boss section.  To run a sim, I just stitch the component parts together.  For example, I can combine the default player, talent, gear, rotation, and boss sections with various different glyph sections to create my different glyph setups.  I can save each of these combinations as a new .simc file, and by labeling them appropriately I’ll have the input file for each individual simulation.

This is useful for debugging, of course, but it also means that if we want to write a different comparison, we can reuse a lot of code.  Instead of swapping in and out the glyph component, we might swap in and out the talent component to create a talent sim, using most of the same automation logic.
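As a rough illustration of the stitching idea (in Python rather than the MATLAB the actual suite is written in), the sketch below joins component texts into one input file and swaps a single section to make a variant. The function names, the section order, and the placeholder component contents in the usage note are all assumptions made for the example.

```python
from pathlib import Path

# Order in which the component sections are stitched together (assumed).
SECTION_ORDER = ["player", "talents", "glyphs", "gear", "rotation", "boss"]

def compose_simc(components):
    """Join component texts into the text of one complete .simc file."""
    missing = [name for name in SECTION_ORDER if name not in components]
    if missing:
        raise ValueError(f"missing components: {missing}")
    return "\n".join(components[name].strip() for name in SECTION_ORDER) + "\n"

def write_variant(defaults, section, variant_text, out_path):
    """Swap one section of the defaults and save the result as its own file."""
    components = dict(defaults)
    components[section] = variant_text
    out_path = Path(out_path)
    out_path.write_text(compose_simc(components))
    return out_path
```

A glyph sim would call something like `write_variant(defaults, "glyphs", "glyphs=alabaster_shield/avenging_wrath/devotion_aura", "glyph_AS_AW_DA.simc")` once per glyph setup; a talent sim would swap the `"talents"` section instead and reuse everything else.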

It’s also helpful for caching the results.  One of the downsides of running simulations is that it can take a lot of time.  So it’s helpful to keep and reuse results that shouldn’t have changed.  If I run a sim with 50k iterations, I want to store the results so I can just call up those results later on rather than having to re-run the whole 50k iteration sim again. This essentially replaces a several-minute simulation run with a millisecond data read operation.

But you have to be careful about how you do that.  For example, if the input .simc file changes, we’d obviously want to re-run the full 50k iteration sim to generate new results rather than call up old results that may or may not have any relevance to the current problem.  By keeping a separate .simc file for every individual sim in a comparison, I can do that sort of checking and easily call up saved results when they should still be relevant. And of course, it will re-run the sim if it looks like anything important has changed (i.e. any of the inputs or the Simulationcraft executable are newer than the output).
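The "is anything newer than the output?" check is essentially the same test `make` performs. A minimal Python sketch of it follows; `needs_rerun` is a hypothetical name, and comparing file modification times is just one way to implement the check described above.

```python
from pathlib import Path

def needs_rerun(output_path, dependency_paths):
    """Return True if the cached output is missing or older than any dependency.

    Dependencies would be the stitched .simc input and the SimC executable.
    """
    output_path = Path(output_path)
    if not output_path.exists():
        return True
    output_mtime = output_path.stat().st_mtime
    return any(Path(dep).stat().st_mtime > output_mtime for dep in dependency_paths)
```

If `needs_rerun` comes back `False`, the saved results can simply be read from disk instead of repeating the full run.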

So, to illustrate how this all works, let’s assume we’re writing a glyph simulation.  We start by defining defaults for the components we won’t vary.  In other words, we start with a default player:

paladin="Paladin_Protection_T16H"
level=90
race=blood_elf
role=tank
position=front
professions=Blacksmithing=600/Enchanting=600
spec=protection

and default talents:

talents=312232

and default gear:

#T16N Gear Set

neck=juggernauts_ignition_keys,id=103916,reforge=hit_haste
shoulders=shoulderguards_of_winged_triumph,id=99130,gems=320haste_320haste,enchant=180sta_80dodge,reforge=exp_haste
chest=chestguard_of_winged_triumph,id=99126,gems=160exp_160haste_160exp_160haste_160exp_160haste_270sta,enchant=300sta,reforge=exp_haste
wrists=bubbleburst_bracers,id=103738,enchant=170mastery,reforge=hit_mastery
hands=handguards_of_winged_triumph,id=99127,gems=320haste_320haste,enchant=170haste
waist=demolishers_reinforced_belt,id=103788,gems=320haste_320haste_320haste
legs=legplates_of_unthinking_strife,id=104311,gems=320haste_320haste_320haste,enchant=250sta_100dodge,reforge=mastery_hit
feet=wolfrider_spurs,id=103880,gems=320haste_60crit,enchant=175haste,reforge=crit_hit
finger1=asgorathian_blood_seal,id=103794,gems=160exp_160haste_60haste
finger2=seal_of_the_forgotten_kings,id=103796,gems=160exp_160haste,reforge=crit_mastery
trinket1=vial_of_living_corruption,id=102306
trinket2=thoks_tail_tip,id=102305
main_hand=siegecrafters_forge_hammer,id=103969,gems=320haste,enchant=windsong,reforge=mastery_hit
off_hand=bulwark_of_the_fallen_general,id=103872,gems=320haste,enchant=170parry,reforge=exp_haste

# Gear Summary
# gear_strength=19365
# gear_stamina=36396
# gear_expertise_rating=5107
# gear_hit_rating=2607
# gear_crit_rating=1112
# gear_haste_rating=15677
# gear_mastery_rating=7602
# gear_armor=60112
# gear_dodge_rating=180
# gear_parry_rating=1526
# meta_gem=indomitable_primal
# tier16_2pc_tank=1
# tier16_4pc_tank=1
# main_hand=siegecrafters_forge_hammer,weapon=mace_2.60speed_10257min_19051max,enchant=windsong

and a default action priority list:

actions.precombat=flask,type=earth
actions.precombat+=/food,type=chun_tian_spring_rolls
actions.precombat+=/blessing_of_kings,if=(!aura.str_agi_int.up)&(aura.mastery.up)
actions.precombat+=/blessing_of_might,if=!aura.mastery.up
actions.precombat+=/seal_of_insight
actions.precombat+=/sacred_shield,if=talent.sacred_shield.enabled
# Snapshot raid buffed stats before combat begins and pre-potting is done.
actions.precombat+=/snapshot_stats

actions=/auto_attack
actions+=/blood_fury
actions+=/berserking
actions+=/arcane_torrent
actions+=/avenging_wrath
actions+=/holy_avenger,if=talent.holy_avenger.enabled
actions+=/divine_protection
actions+=/guardian_of_ancient_kings
actions+=/eternal_flame,if=talent.eternal_flame.enabled&(buff.eternal_flame.remains<2&buff.bastion_of_glory.react>2&(holy_power>=3|buff.divine_purpose.react))
actions+=/shield_of_the_righteous,if=holy_power>=5|buff.divine_purpose.react|incoming_damage_1500ms>=health.max*0.3
actions+=/judgment,if=talent.sanctified_wrath.enabled&buff.avenging_wrath.react
actions+=/judgment
actions+=/avengers_shield
actions+=/sacred_shield,if=talent.sacred_shield.enabled&target.dot.sacred_shield.remains<5
actions+=/hammer_of_wrath
actions+=/execution_sentence,if=talent.execution_sentence.enabled
actions+=/lights_hammer,if=talent.lights_hammer.enabled
actions+=/holy_prism,if=talent.holy_prism.enabled
actions+=/holy_wrath
actions+=/consecration,if=target.debuff.flying.down&!ticking
actions+=/sacred_shield,if=talent.sacred_shield.enabled

We then come up with a list of all of the different glyph combinations we’re interested in and create .simc component files for those as well.  For example, there’s an “AS_AW_DA.simc” file that just contains:

glyphs=alabaster_shield/avenging_wrath/devotion_aura

and similar files for every other combination we care about.  We then piece together a complete .simc file from the default components and one of the glyph components, and run that sim to get our .html and .txt output files.  And then we do it again for a different glyph component file, and then again, and so on until we have results for all of them.
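The loop over glyph components can be sketched in Python as a list of jobs, one per component file. This only builds the file names and command lines; nothing is executed. The `glyph_X_Y_Z` naming follows the hosted result files, while `build_jobs`, the `simc` executable name, and the use of SimC's `html=` and `output=` options in this exact form are assumptions for the example, so check them against your SimC version.

```python
from pathlib import Path

def build_jobs(glyph_component_files, simc_exe="simc"):
    """Return one job description per glyph component file (nothing is run)."""
    jobs = []
    for component in glyph_component_files:
        tag = Path(component).stem  # "AS_AW_DA.simc" -> "AS_AW_DA"
        base = f"glyph_{tag}"
        jobs.append({
            "input": f"{base}.simc",
            "html": f"{base}.html",
            "text": f"{base}.txt",
            # Assumed invocation: the profile file plus html= and output= options.
            "command": [simc_exe, f"{base}.simc",
                        f"html={base}.html", f"output={base}.txt"],
        })
    return jobs

for job in build_jobs(["AS_AW_DA.simc", "DP_FS_IT.simc"]):
    print(" ".join(job["command"]))
```

Each `command` list could be handed to `subprocess.run` once the corresponding input file has been written.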

The last part is just collecting and displaying the data by reading those output files, searching for the relevant information, and arranging it in data tables or graphs.  That’s mostly done by filtering the text output files with regular expressions, and isn’t all that interesting.  However, the results it spits out are interesting.

Glyph Comparison

Below is the data from the first run of the completed glyph comparison.  The defaults being used are all shown above except for the boss component, which is just the TMI standard T16N25 boss.  This is a list of every possible glyph combination using the following glyphs:

AS – Alabaster Shield
AW – Avenging Wrath
BH – Battle Healer
DA – Devotion Aura
DP – Divine Protection
FW – Final Wrath
FS – Focused Shield
HW – Harsh Words
IT – Immediate Truth
WoG – Word of Glory

There are a few omissions here. Some glyphs are basically useless for simulation (Holy Wrath, for example), so they’ve been ignored.  Double Jeopardy is missing because it’s not programmed properly in Simcraft at this point – something I hope to remedy during the holidays.  I should also note that the Harsh Words glyph doesn’t do anything in the default setup since Eternal Flame is the chosen talent.  I can fix that in a variety of ways, the easiest of which is probably just to add an APL entry to offensively cast WoG if the glyph is present.

But otherwise, that list should cover all of the major glyphs that affect DPS and survivability.  I’ve ignored minor glyphs since none of them have a significant impact.
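For a sense of scale, the full list of configurations can be enumerated in a few lines of Python. This sketch assumes three major glyph slots, that any slot may be left empty ("E"), and that order within a configuration doesn't matter; padding with "E" at the front is purely a presentational choice made here.

```python
from itertools import combinations

GLYPHS = ["AS", "AW", "BH", "DA", "DP", "FW", "FS", "HW", "IT", "WoG"]
SLOTS = 3  # assumed number of major glyph slots

def glyph_configurations(glyphs=GLYPHS, slots=SLOTS):
    """Every unordered choice of 0..slots glyphs, padded with 'E' for empties."""
    configs = []
    for size in range(slots + 1):
        for combo in combinations(glyphs, size):
            configs.append(("E",) * (slots - size) + combo)
    return configs

configs = glyph_configurations()
print(len(configs))  # 1 + 10 + 45 + 120 configurations
```

That is 176 separate simulations for ten glyphs, which is why caching results between runs matters so much.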

Below is a sortable list of the data. Since it’s long, I’ve spoilered it so you can open and close it.  While I haven’t included error metrics on the table, the maximum DPS error in this data set is 88, which is only about 0.02% of the DPS values involved.  Note that “E” stands for an empty glyph slot.


Rather than dig through all of that data to come up with important conclusions, I’ve also programmed it to generate a table showing DPS for single-glyph configurations.  That table is shown below.  DPS error data is provided here, along with the DPS difference between that configuration and having no glyphs (“Delta”).  Delta is thus the DPS gain due to adding that glyph in isolation, to within +/- the error (“Err”).

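Single-Glyph Configurations

| Glyph | DPS | Err | Delta | HPS | DTPS | TMI | SotR |
|---|---|---|---|---|---|---|---|
| E | 367429 | 78 | 0 | 157433 | 157771 | 588.1 | 73.0% |
| AS | 372376 | 78 | 4947 | 157454 | 157791 | 594.9 | 73.0% |
| AW | 367355 | 79 | -74 | 157486 | 157823 | 836.9 | 73.0% |
| BH | 367471 | 78 | 42 | 156043 | 156789 | 18810.1 | 73.0% |
| DA | 367334 | 78 | -95 | 157443 | 157779 | 650.6 | 73.0% |
| DP | 367323 | 83 | -106 | 149574 | 149868 | 213.6 | 73.0% |
| FW | 370402 | 83 | 2973 | 157422 | 157756 | 737.7 | 73.0% |
| FS | 382948 | 78 | 15519 | 157464 | 157802 | 683.5 | 73.0% |
| HW | 367394 | 78 | -35 | 157451 | 157787 | 574.5 | 73.0% |
| IT | 367187 | 80 | -242 | 157468 | 157805 | 546.6 | 73.0% |
| WoG | 374192 | 78 | 6763 | 157464 | 157801 | 598.9 | 73.0% |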

This table basically shows us that Focused Shield is the largest DPS gain we can get against single targets by a large margin.  Coming in second place is the Glyph of Word of Glory, thanks to all the EF casts we’re using in this profile, followed by Alabaster Shield.  Final Wrath is a distant fourth, and pretty much nothing else has a significant effect on our DPS output.

Some of the deltas are a little bigger than the “Err” column even though they should have literally zero effect (ex: Immediate Truth given that we’re using Seal of Insight), which suggests that the error bounds SimC is reporting probably aren’t generous enough.  I don’t remember whether it’s reporting a 95% confidence interval or something else, so I’ll probably have to dig through the statistics module and figure out what I need to do on my end to get more realistic error bounds.
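One thing worth keeping in mind is that Delta is a difference of two noisy numbers, so its uncertainty is larger than either "Err" alone. The Python sketch below adds the two errors in quadrature, which is the usual rule for independent uncertainties. Treating the Err columns that way is an assumption on my part; it is not necessarily what SimC reports or what the table code does, and `delta_with_error` is a hypothetical helper.

```python
import math

def delta_with_error(dps, err, base_dps, base_err):
    """Return (delta, combined_err) for a configuration versus the baseline."""
    delta = dps - base_dps
    combined_err = math.hypot(err, base_err)  # errors added in quadrature
    return delta, combined_err

baseline = (367429, 78)  # the "E" (no glyphs) row
rows = {"AS": (372376, 78), "HW": (367394, 78), "IT": (367187, 80)}

for glyph, (dps, err) in rows.items():
    delta, combined = delta_with_error(dps, err, *baseline)
    print(f"{glyph}: delta={delta:+d} +/- {combined:.0f} "
          f"({abs(delta) / combined:.1f}x the combined error)")
```

Even with the combined error of about 112, Immediate Truth's delta of -242 is still roughly twice as large as it should be for a glyph with no effect, so the conclusion that the reported bounds are too tight stands.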

Anyway, we can also make two other useful tables out of this data.  The first would be to sort it in order of descending DPS to get the top 10 DPS combinations.  We should expect that FS+WoG+AS is on top, followed by FS+WoG+FW.  And indeed if we ask MATLAB to generate that table, we find:

Top 10 DPS Combinations

| G1 | G2 | G3 | DPS | Err | %Err | HPS | DTPS | TMI | SotR |
|---|---|---|---|---|---|---|---|---|---|
| AS | FS | WoG | 395388 | 85 | 0.00% | 157420 | 157757 | 819.0 | 73.0% |
| FW | FS | WoG | 393219 | 88 | 0.00% | 157469 | 157805 | 566.7 | 73.0% |
| AS | FW | FS | 391406 | 86 | 0.00% | 157456 | 157796 | 571.9 | 73.0% |
| AW | FS | WoG | 390205 | 85 | 0.00% | 157464 | 157796 | 632.7 | 73.0% |
| FS | IT | WoG | 390107 | 85 | 0.00% | 157439 | 157773 | 572.1 | 73.0% |
| E | FS | WoG | 390094 | 85 | 0.00% | 157433 | 157771 | 536.3 | 73.0% |
| BH | FS | WoG | 389977 | 86 | 0.00% | 156020 | 156758 | 18379.6 | 73.0% |
| FS | HW | WoG | 389911 | 85 | 0.00% | 157495 | 157830 | 2258.5 | 73.0% |
| DA | FS | WoG | 389881 | 85 | 0.00% | 157418 | 157752 | 533.8 | 73.0% |
| DP | FS | WoG | 389830 | 85 | 0.00% | 149573 | 149868 | 384.7 | 73.0% |

I wouldn’t trust the TMI results to better than +/-50% here because we’re clearly running into the “self-sufficiency” problem I discussed in an earlier blog post.  In other words, I doubt the TMI differences among the top six DPS combinations are at all significant; they’re probably just noise.  On the other hand, the significant jump we see when using Battle Healer is real.  I’m also not 100% sure what’s causing the higher TMI for the FS/HW/WoG combination – I’m guessing it’s a bug in how SimC handles Harsh Words and Eternal Flame (likely guess: it’s automatically casting WoG offensively when EF is cast, but still granting the player the HoT?).  Something to add to my holiday to-do list, I guess.

Finally, we could also make a “Best TMI combinations” list:

Lowest 10 TMI Combinations

| G1 | G2 | G3 | DPS | HPS | DTPS | TMI | Err | %Err | SotR |
|---|---|---|---|---|---|---|---|---|---|
| AS | DP | IT | 372263 | 149585 | 149879 | 186.2 | 10.80 | 5.80% | 73.0% |
| DP | FS | IT | 382505 | 149600 | 149895 | 194.3 | 13.40 | 6.90% | 73.0% |
| E | AS | DP | 372551 | 149533 | 149828 | 195.9 | 18.90 | 9.60% | 73.0% |
| AW | DP | WoG | 374264 | 149555 | 149852 | 196.7 | 14.00 | 7.10% | 73.0% |
| AW | DA | DP | 367330 | 149520 | 149816 | 197.1 | 18.70 | 9.50% | 73.0% |
| DA | DP | WoG | 374264 | 149524 | 149820 | 199.3 | 22.30 | 11.20% | 73.0% |
| DA | DP | IT | 367326 | 149549 | 149845 | 201.3 | 16.50 | 8.20% | 73.0% |
| E | DA | DP | 367376 | 149505 | 149798 | 204.7 | 36.00 | 17.60% | 73.0% |
| AW | DP | IT | 367226 | 149577 | 149873 | 205.3 | 31.50 | 15.30% | 73.0% |
| DA | DP | FW | 370381 | 149562 | 149856 | 205.5 | 19.20 | 9.30% | 73.0% |

I’m not sure there’s much to learn from this particular table.  DP is really the only big survivability glyph we have since Devotion Aura isn’t on the default APL.  So this list is essentially “10 random configurations that include DP.”
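Both ranked tables boil down to "sort by one column, keep the first ten." A minimal Python sketch of that step, using a few rows from the tables above as sample data (the `top_n` helper and the dictionary layout are made up for the example):

```python
def top_n(rows, key, n=10, descending=True):
    """Return the n rows with the largest (or smallest) value of `key`."""
    return sorted(rows, key=lambda row: row[key], reverse=descending)[:n]

results = [
    {"glyphs": "AS FS WoG", "dps": 395388, "tmi": 819.0},
    {"glyphs": "DP FS IT", "dps": 382505, "tmi": 194.3},
    {"glyphs": "AS DP IT", "dps": 372263, "tmi": 186.2},
    {"glyphs": "FW FS WoG", "dps": 393219, "tmi": 566.7},
]

best_dps = top_n(results, "dps", n=2)                    # highest DPS first
best_tmi = top_n(results, "tmi", n=2, descending=False)  # lowest TMI first
print([row["glyphs"] for row in best_dps])
print([row["glyphs"] for row in best_tmi])
```

The same helper covers any other column, so adding a "lowest DTPS" table would be a one-line change.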

For reference, all of the results of these simulations are hosted on the matlabadin project in the “trunk\simc\io\” folder.  So if you’re curious about any of the individual simulations, you can just look up the “glyph_X_Y_Z.html” file corresponding to that sim and see exactly what the setup and results were.

Talent Comparison

I’ve also written the talent comparison; it works basically the same way the glyph one does, but cycles through all the possible talent combinations.  I’ve only considered the ones that have an effect on combat (L45, L60, L75, L90).  The max DPS error on this table is 84, which is also only about 0.02% of the DPS values involved.

The default glyph configuration for these sims is

glyphs=focused_shield/alabaster_shield/divine_protection

though after looking at the results of the glyph comparison, maybe it should be FS/WoG/DP. Or FS/WoG/AS to try and cut down on the self-sufficiency problem, though that would also affect Unbreakable Spirit’s valuation significantly.

Abbreviations:
SH – Selfless Healer
EF – Eternal Flame
SS – Sacred Shield
PU – Hand of Purity
US – Unbreakable Spirit
CL – Clemency
HA – Holy Avenger
SW – Sanctified Wrath
DP – Divine Purpose
ES – Execution Sentence
LH – Light’s Hammer
HP – Holy Prism
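Cycling through the talent combinations is a small Cartesian product. The Python sketch below generates the talent strings and decodes them into the abbreviations above. The digit-to-talent mapping and the fixed "31" prefix for the L15 and L30 tiers are inferred from the talent strings that appear in this post's tables, and the helper names are hypothetical.

```python
from itertools import product

FIXED_PREFIX = "31"  # L15 and L30 choices, held constant (inferred)
TIERS = {
    "L45": {"1": "SH", "2": "EF", "3": "SS"},
    "L60": {"1": "PU", "2": "US", "3": "CL"},
    "L75": {"1": "HA", "2": "SW", "3": "DP"},
    "L90": {"1": "ES", "2": "LH", "3": "HP"},
}

def talent_strings():
    """All talent strings that vary only the L45, L60, L75, and L90 tiers."""
    return [FIXED_PREFIX + "".join(digits) for digits in product("123", repeat=4)]

def decode(talents):
    """Translate the last four digits of a talent string into abbreviations."""
    return [names[digit] for names, digit in zip(TIERS.values(), talents[2:])]

print(len(talent_strings()))  # 3 choices in each of 4 tiers
print(decode("312232"))       # the default talent string from earlier
```

With three choices in each of four tiers, that is 81 simulations per run.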


In this case, rather than picking out “single-talent” combinations (since those really don’t exist), I’ve picked a handful of relevant ones for a shortlist.

Talent Short List

| Talents | L45 | L60 | L75 | L90 | DPS | HPS | DTPS | TMI | SotR |
|---|---|---|---|---|---|---|---|---|---|
| 311212 | SH | US | HA | LH | 390997 | 121134 | 153348 | 753032.4 | 71.0% |
| 311222 | SH | US | SW | LH | 388631 | 124118 | 161326 | 1080163.3 | 63.0% |
| 311232 | SH | US | DP | LH | 384737 | 122424 | 152032 | 500356.3 | 70.0% |
| 312212 | EF | US | HA | LH | 391302 | 153414 | 153716 | 423.1 | 71.0% |
| 312222 | EF | US | SW | LH | 388657 | 161300 | 161688 | 563.9 | 63.0% |
| 312232 | EF | US | DP | LH | 388086 | 149540 | 149832 | 216.3 | 73.0% |
| 313212 | SS | US | HA | LH | 382266 | 107749 | 109451 | 25367.7 | 69.0% |
| 313222 | SS | US | SW | LH | 379472 | 114575 | 117620 | 37985.6 | 62.0% |
| 313232 | SS | US | DP | LH | 376220 | 106990 | 108622 | 13449.7 | 69.0% |

Here we see that Eternal Flame consistently beats Sacred Shield by a large margin for survivability (TMI in the hundreds vs. TMI in the tens of thousands).  The slight DPS gain of EF over SS is due to GCD clashes (remember, Glyph of Word of Glory isn’t chosen in the defaults).  Unbreakable Spirit is basically a no-brainer thanks to Divine Protection, so there’s no reason to vary that.  Within a group, Divine Purpose consistently gives lower TMI than the other two L75 talents.  I stuck with Light’s Hammer across the board so that we could compare the L45 and L75 talents more directly, though I should probably add a few more combinations to this list so it highlights the difference in the L90 talents.  Luckily we get some of that from the next two tables: Top 10 DPS and Lowest 10 TMI.

Top 10 DPS Specs

| Talents | L45 | L60 | L75 | L90 | DPS | Err | %Err | HPS | DTPS | TMI | SotR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 312313 | EF | CL | HA | HP | 403360 | 76 | 0.00% | 159434 | 159823 | 3014.5 | 71.0% |
| 312113 | EF | PU | HA | HP | 403339 | 76 | 0.00% | 159400 | 159787 | 3775.0 | 71.0% |
| 312213 | EF | US | HA | HP | 403333 | 77 | 0.00% | 153305 | 153621 | 520.2 | 71.0% |
| 311113 | SH | PU | HA | HP | 403248 | 76 | 0.00% | 111387 | 160509 | 8468304.7 | 71.0% |
| 311213 | SH | US | HA | HP | 403211 | 76 | 0.00% | 112976 | 153377 | 1180576.5 | 71.0% |
| 311313 | SH | CL | HA | HP | 403172 | 76 | 0.00% | 111397 | 160498 | 8471353.9 | 71.0% |
| 311323 | SH | CL | SW | HP | 401415 | 69 | 0.00% | 114830 | 168578 | 9243176.7 | 63.0% |
| 312323 | EF | CL | SW | HP | 401393 | 69 | 0.00% | 167421 | 167950 | 3624.0 | 63.0% |
| 311123 | SH | PU | SW | HP | 401375 | 69 | 0.00% | 114837 | 168569 | 9262060.2 | 63.0% |
| 312223 | EF | US | SW | HP | 401307 | 70 | 0.00% | 161101 | 161520 | 772.7 | 63.0% |

This list suggests that Holy Avenger and Holy Prism are the two dominant DPS talent choices this tier.  The L60 talent is irrelevant, and the L45 talents are similarly irrelevant in this table because of the lack of Glyph of Word of Glory.  Since nothing in the APL utilizes Selfless Healer yet, that’s basically an “empty” talent choice.  Combinations with EF and SH are interchangeable with regard to DPS because neither costs us any GCDs; SS combinations don’t show up because it does cost GCDs and pushes back DPS abilities.

From this list, it looks like in T16 normal gear, HA>SW>DP for DPS in the L75 slot.  This is a bit surprising, as I expected Divine Purpose to have a better showing here.  I haven’t quite figured out the rationale for why it’s performing so poorly for DPS in these sims.

Holy Prism rising to the top is also a bit of a surprise, but it makes sense.  We can cast three Holy Prisms every minute compared to a single Light’s Hammer or Execution Sentence.  Three Holy Prisms have always been more damage than either of those two alternatives, even early in the expansion.  However, those three Holy Prisms cost three GCDs.  When we were using Sacred Shield as our go-to L45 talent and getting Grand Crusader procs from attacking, we simply didn’t have those spare GCDs.  Switching to Eternal Flame and losing some Grand Crusader procs opened up enough GCDs that we can fit Holy Prism in very seamlessly.

I should also note that the default APL may not be properly optimized for Holy Prism yet.  That’s another thing we’ll have to refine once I have time to write the rotation comparison.  But that should only make Holy Prism better, not worse.

As far as the last two L90 talents, we can get a little bit of information from the next table.

Lowest 10 TMI Specs

| Talents | L45 | L60 | L75 | L90 | DPS | HPS | DTPS | TMI | Err | %Err | SotR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 312232 | EF | US | DP | LH | 388086 | 149540 | 149832 | 216.3 | 31.30 | 14.50% | 73.0% |
| 312231 | EF | US | DP | ES | 388627 | 149833 | 150132 | 272.4 | 54.20 | 19.90% | 73.0% |
| 312233 | EF | US | DP | HP | 399801 | 149415 | 149725 | 329.6 | 65.80 | 20.00% | 73.0% |
| 312212 | EF | US | HA | LH | 391302 | 153414 | 153716 | 423.1 | 101.50 | 24.00% | 71.0% |
| 312211 | EF | US | HA | ES | 391824 | 153606 | 153907 | 442.7 | 93.20 | 21.00% | 70.0% |
| 312213 | EF | US | HA | HP | 403333 | 153305 | 153621 | 520.2 | 49.40 | 9.50% | 71.0% |
| 312222 | EF | US | SW | LH | 388657 | 161300 | 161688 | 563.9 | 63.50 | 11.30% | 63.0% |
| 312221 | EF | US | SW | ES | 388194 | 161475 | 161867 | 735.9 | 291.20 | 39.60% | 63.0% |
| 312223 | EF | US | SW | HP | 401307 | 161101 | 161520 | 772.7 | 155.80 | 20.20% | 63.0% |
| 312332 | EF | CL | DP | LH | 388118 | 155544 | 155885 | 914.6 | 84.60 | 9.30% | 73.0% |

Now, this is the table for lowest TMI, but the first thing I want to point out is about L90 talents and DPS.  The first three rows of this table are identical except for the L90 talent, and it’s clear from those rows that Holy Prism has a significant lead in DPS (around 11k DPS).  Execution Sentence comes in second and Light’s Hammer a close third, separated by only about 550 DPS.  I’m hesitant to put too much stock in the survivability value of the three talents given that both Execution Sentence and Holy Prism are always being cast offensively here (though Holy Prism still heals you via the secondary effect, obviously).  The TMI spread is also fairly small, so I hesitate to trust the order anyway.

However, turning our attention back to survivability, the dominance of EF+US+DP here is pretty clear, sweeping the top three spots.  Swapping DP for HA results in a small decrease in survivability, mostly through a loss of SotR uptime.  Swapping HA for SW is another clear loss, and losing US for Clemency in the last row is another clear loss.  Note that the default APL doesn’t use Hand of Purity, otherwise I suspect that EF+PU+DP+LH would have taken that last spot.  Add another thing to my holiday “to-do” list.

Realistically, I need to fix the “self-sufficiency” problem before I can rely on these TMI lists.  I may use some trickery to /cancelaura Vengeance periodically to try and reduce the problem.  We’ll see though – if the beta for Warlords of Draenor comes out any time soon, I may just start focusing on that since we’re basically done with content for this expansion anyway.

Summary

In short, I now have the ability to automate comparisons using SimC, much in the same way I used to do with my old MATLAB DPS simulations.  I’ve gotten the first two done (glyphs and talents), and they mostly confirm things we already knew.

The glyph simulation reinforces that Divine Protection is the only glyph that has a large survivability benefit (and again, that’s situational – on a fight with a big magical burst, you still wouldn’t use it).  It also confirmed that our best DPS glyphs are Glyph of Focused Shield, Glyph of Word of Glory, Glyph of the Alabaster Shield, and Glyph of Final Wrath, in that order.

The talent simulation reiterated that Eternal Flame is stronger than Sacred Shield for raw smoothness, and that Unbreakable Spirit is strong when you’re using Divine Protection on cooldown.  In the L75 talent category, it showed us that for DPS, Holy Avenger > Sanctified Wrath > Divine Purpose, but for survivability Divine Purpose > Holy Avenger > Sanctified Wrath.  And finally, in the L90 category it suggests that Holy Prism is significantly better than Execution Sentence or Light’s Hammer for DPS if you’re using Eternal Flame.  It didn’t tell us much about their survivability value, though.

My next task is probably to get more of the simulations online.  The  weapon simulation should be relatively easy, the rotation simulations less so (but arguably more interesting to write).  I also want to make some refinements to the settings for these two existing simulations based on some of the things I’ve noticed while writing this blog post, and probably based on things that other people notice and post in the comments.  I’m happy to entertain feedback on what I can improve here, since these sims are clearly still in a fairly rough stage of development.

In parting, I want to leave you with an interesting thought though.  Nothing about the code is all that paladin-specific. I’m bolting together .simc files that all contain paladin “stuff,” but the bolting together part is mostly class-agnostic.  Why is that important?

Well, consider: what if someone were to write default component .simc files for, say, a Frost mage? With a few minor tweaks, this automation suite could then run nearly identical simulations for a Frost mage.  Or a Protection warrior, or Blood DK, or… you get the idea.

## Post-Blizz-Con Wrap-Up, Part 2

In the last post, I ranted about time travel and lore. This time, I’m going to talk about some of the mechanical changes that were announced at BlizzCon.

Stats, Reforging, and Gear

There were a lot of different gear-related changes that I’m lumping together in this one category because they’re all somewhat related.  It’s hard to say which is the “biggest” or “most important” of these changes, because several of them are (literally) game-changing. So we’ll go through them in no particular order.

First, gear will no longer have a specific primary stat.  If a piece of plate drops, and you’re in Holy spec, it’ll have intellect and stamina as its primary stats.  If you switch to Retribution or Protection, the item will suddenly have strength and stamina as primaries.  This is a pretty huge change, because it basically makes the big three primary stats irrelevant on the bulk of gear.  Every piece of plate, leather, and mail will always have stamina and whatever primary stat your spec uses.  In some sense, it consolidates strength, agility, and intellect into one flexible primary stat.

I don’t think many players will argue that this is a bad thing. You’ll automatically have up-to-date gear for all of your off-specs, so hybrid classes aren’t punished as much for wanting to be fluent in more than one spec.  The gear may still not be ideal because necks, rings, cloaks, and trinkets will only have secondary stats, some of which are only relevant to certain specs. But it’ll be a large improvement over always using last-tier’s gear for your off-spec, especially since you’ll have current-tier set bonuses.

We’re also getting a few new secondary stats, with the major three being Amplify, Readiness, and Multistrike.  These should work much the same way the Siege of Orgrimmar trinket procs do.  Readiness is just the cooldown reduction effect we’ve seen on trinkets, and will apply to a select few abilities based on your spec.  Adding X% multistrike will give you two chances (X/2% each) to do an additional 30% damage (or healing) with each attack (or heal).  And X% amplify increases your crit and multistrike multipliers as well as giving you X% more haste, mastery, spirit, readiness, and armor from gear.
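To see what that multistrike description means on average, here is a back-of-the-envelope Python sketch. It takes the description at face value: two independent chances at X/2% each, with every success adding 30% of the original hit. Those mechanics may well change before release, and `expected_multistrike_multiplier` is a hypothetical helper.

```python
def expected_multistrike_multiplier(multistrike_pct, chances=2, bonus=0.30):
    """Average damage (or healing) multiplier from multistrike.

    Assumes each of the `chances` rolls succeeds with probability
    (multistrike_pct / chances) percent and adds `bonus` times the original hit.
    """
    per_chance = (multistrike_pct / chances) / 100.0
    # By linearity of expectation, the extra damage is chances * p * bonus.
    return 1.0 + chances * per_chance * bonus

for pct in (5, 10, 20):
    print(pct, expected_multistrike_multiplier(pct))
```

Under those assumptions the two half-chances simply add up, so X% multistrike works out to an average increase of 0.3 × X% in damage or healing.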

It’s worth noting that all three of these stats can be considered “tanking” stats.  Readiness gives you more frequent access to cooldowns like Guardian of the Ancient Kings, Divine Protection, and Ardent Defender.  Multistrike works on healing as well as damage, so while the details are still a little vague, it’s likely that it will work on effects like Seal of Insight and Eternal Flame.  Sacred Shield is a little dicier, but it could be made to work by simply having a chance to apply multiple absorb bubbles; it’s just not clear whether it will or not.  Amplify is obviously a tank stat because it gives you more of everything: haste, mastery, armor, readiness, as well as larger crits (for Eternal Flame) and larger multistrikes.

Armor is also making a return as a secondary stat on specific items (namely necks, rings, and other non-plate gear), so we’ll have another secondary stat to throw into the mix.  I didn’t lump armor in with the “major” three simply because armor isn’t really new.  It’s still nice to have it back, though; armor was always a powerful stat even though it’s passive.

Having four new “tanking” secondary stats is good, because the other bombshell piece of news is that four secondary stats are being removed entirely.  Hit and expertise are gone, making juggling the hit and expertise caps a thing of the past. I predicted we’d see a change to these stats, but I didn’t anticipate both of them disappearing because it would reduce the number of possible stats on gear too much. But the addition of three new stats more than makes up for that.  Also note that while bosses will still have a chance to parry attacks from the front (so that melee DPS still have to stand behind them), tanks will have a passive that bypasses that effect. So as a nice little side effect, the “tank expertise” penalty is going away as well.

I didn’t expect dodge and parry to be completely removed for similar reasons, though I did expect a change. But again, given four new secondary stats to play with, we really won’t end up missing these two.  It’s worth noting that the dodge and parry mechanics aren’t going to be completely gone – we will still dodge and parry attacks passively, we just won’t have the ability to stack them via secondary stats.  It’s likely that we will still build up dodge and parry over the course of an expansion through our primary stats just like we do today.  So strength will essentially be our avoidance stat, and we won’t have to worry about choosing it since it comes on gear by default.

Of less concern to tanks, they’re changing the way that DoT snapshotting works.  In short, DoTs won’t snapshot anymore; they will dynamically update the tick amounts based on your current stats.  This will mean that specs like Affliction Warlocks won’t be quite as skill-dependent, because your DPS won’t drop as much if you accidentally re-apply DoTs a little too early after buffs wear off.  That’s good and bad – good if you think the skill differential between an average Affliction Warlock and an expert one was too big, bad if you didn’t think it was large enough.  Since I don’t get enough time on my Warlock to stay in practice anymore, it’s arguably a buff for me, so I’m not too worried. But I can see how some Warlock mains might be peeved.

Again, while it’s not of that much relevance to us, it’s worth discussing how the new mechanic will work.  The tentative model I overheard during BlizzCon discussions is that every DoT/HoT will have its usual fixed duration, and we’ll just get partial ticks at the end.  So for example, let’s consider Eternal Flame, a 30-second HoT that ticks in 3-second intervals. If we have 20% haste, those ticks will occur at 2.5-second intervals (3/1.20), so we’ll get 12 ticks instead of 10.  If we increase that to 25% haste, the ticks will be 2.4 seconds long (3/1.25=2.4), so the first 12 ticks will take 28.8 seconds.  Then we’ll get a partial tick at 30.0 seconds that will be half-strength (because it will be a 1.2-second long tick rather than a 2.4-second long tick, and 1.2/2.4=0.5).  Presumably Sacred Shield will work in a similar fashion.
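The partial-tick arithmetic is easy to sketch in Python. This is only my reading of the tentative model described above (fixed duration, haste shortens the interval, leftover time becomes one scaled tick); the function name and the rounding tolerance are my own choices, not anything Blizzard has specified.

```python
def hot_ticks(duration, base_interval, haste_pct):
    """Return (full_ticks, partial_tick_strength) for a fixed-duration HoT/DoT.

    Tentative model: the duration is fixed, haste shortens the tick interval,
    and any time left over at the end yields one partial tick whose strength
    is the fraction of a full interval that it covers.
    """
    interval = base_interval / (1 + haste_pct / 100.0)
    ratio = duration / interval
    full_ticks = int(ratio + 1e-9)   # small tolerance for float rounding
    partial = ratio - full_ticks
    if partial < 1e-9:               # treat rounding noise as "no partial tick"
        partial = 0.0
    return full_ticks, partial


if __name__ == "__main__":
    # Eternal Flame: 30-second duration, 3-second base tick interval
    for haste in (0, 20, 25):
        full, partial = hot_ticks(30, 3, haste)
        print(f"{haste}% haste: {full} full ticks, partial tick strength {partial:.2f}")
```

For the Eternal Flame numbers above, this gives 12 full ticks and no partial tick at 20% haste, and 12 full ticks plus a half-strength partial tick at 25% haste.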

With the changes to hit, expertise, dodge, and parry, they’ve also decided that reforging isn’t necessary, and have removed that in 6.0.  This sparked mixed reactions from the players I spoke with.  Sure, we don’t need it to maintain hit and expertise cap anymore, or to balance our dodge and parry ratings.  And the changes to DoT/HoT snapshotting will get rid of most (but maybe not all) relevant haste caps in the game.  But reforging still narrowed the gap between a well-itemized piece and a poorly-itemized piece of equal ilvl.  That has its advantages, especially when it comes to allocating loot in smaller raids.  I’m not sure reforging absolutely had to go in this environment.  But it seems the decision is that keeping reforging just isn’t worth the hassle when its impact is so marginal.  It’s not a decision I’ll argue against, since I don’t have strong feelings about reforging either way.

They also talked about having fewer gem slots on gear and paring down enchants to cover fewer slots, though with more options for each slot.  That means the level of customization we have on gear will be going down a little bit.  Whereas now, we can stuff every socket full of haste gems and use haste enchants to rack up an extra 8% haste or so, we probably won’t be able to do the same thing in Warlords of Draenor.

Tanking Mechanics

One of the most significant announcements is something that wasn’t actually said outright, but merely implied. You see, one of my predictions was that all tanks were going to move to a “DPS tanking” model similar to what Monks, Druids, and now Paladins use.  And while I don’t remember them explicitly addressing that topic (maybe they did in a panel that I’ve forgotten), they almost didn’t have to. The removal of dodge and parry from gear itself was enough to guarantee that such a transition was happening.  The fact that all of the new secondary stats have a clear impact on survivability as well as damage output just further reinforces it.  So we can expect to see big changes to Warrior and Death Knight mechanics once 6.0 goes into beta to embrace haste and crit as true survivability stats.

It’s not clear yet whether every stat will have to have a tanking impact.  For example, right now Paladins don’t benefit much from critical strike rating unless they take Eternal Flame, and even then the impact is fairly small.  It may be that crit rating will still be our dump stat in Warlords of Draenor.  But it wouldn’t take much to make it at least a contender.  If Seal of Insight were able to crit, that would give crit rating some baseline value.  We could also get a secondary mechanic to help bolster it – something like a small HoT effect when certain spells crit, for example.  We’ll just have to wait and see what Blizzard decides on that front, I guess.

Vengeance = VICTORY!

There is, however, no question as to my favorite change.  While it wasn’t announced outright during the convention (again, maybe it was during the Q&A and I just missed it, but it’s doubtful), it came out during discussions with developers at the after-parties.  While I got a chance to talk with a few devs in various degrees of detail at BlizzCon, I wasn’t the only one, so I don’t feel bad about sharing it.

-Vengeance changed to increase tanking abilities, rather than pure AP. He wants Tank DPS to be roughly 75% of a DPS’ output.

Is it tacky to declare victory? Because we’ve suggested exactly this solution several times before.

In all seriousness, this is a huge change for a number of reasons. Mel and I have been blogging about vengeance for a long, long, long, long time.  Many of the more blatant problems have been cleaned up by hotfixes along the way, but some of the core problems remain.  One of those is that our DPS as tanks depends sensitively on taking damage.  That makes our damage drop off during off-tanking periods unless we play awkward taunting games to keep Vengeance high, and more importantly it makes playing through solo content infuriating because we do so little damage.

When 80% of your damage comes from having a raid boss nearby, dailies become an infuriating exercise.  I no longer even think about doing dailies as prot, because for an entire expansion now, I’ve had to switch to Ret to be even remotely efficient with my time.  And as I’ve mentioned in earlier blog posts, the feeling of loss of control over your own DPS potential is somewhat demoralizing, because it takes control away from a role that is obsessed with having control in the first place.

This change reverses all of that.  If our default output is nearly 75% of a regular DPS class, we’ll actually be able to perform solo content in a sane amount of time.  The only concern I have is that we may be too strong in PvP situations, but maybe that’s intentional.  Players have been bemoaning the inability to PvP as a protection spec, so maybe this will bring that back.  And anyway, it’s not like balanced world PvP exists anymore.

I’m ecstatic about this change for another reason: I’ll finally be able to evaluate my performance easily with logging sites again.  It’s incredibly annoying to realize that you have absolutely no idea what DPS you should expect to be able to do on an encounter.  You can compare to other guilds’ logs, but there are so many variables involved that the comparison is nearly meaningless.  Your DPS swings drastically with a number of different factors, including your guild’s strategy and which of your tanks happens to be tanking first.

Un-linking Vengeance from DPS fixes a lot of that, which means I can finally make more useful comparisons between myself, my co-tank, and other tanks.  It will also make tank DPS balance a little easier to achieve on Blizzard’s end, because the range of AP values over which the five tanking classes need to be roughly equivalent just became a lot smaller.

Looking Forward

There’s really no connecting thread that links all of these different ideas, so it’s hard to come up with a conclusion for this post.  The best I can do is to say that there are a lot of different exciting and awesome changes coming in Warlords of Draenor, and you should be as excited about it as I am!

Even though I think the story is sort of hackneyed, the mechanics changes are great and foreshadow what will likely be the best expansion for tanks yet. We’re getting many of the significant changes that we’ve asked for during MoP: a less frustrating and more functional version of Vengeance, consistency between the stats we want and the stats that show up on our gear, the removal of boring stats like dodge and parry, the elimination of the tank expertise penalty, and much more.

That’s not to say there aren’t changes we can still hope to see.  I plan on vigorously campaigning for Holy Wrath to lose/modify its meteor effect so that we once again have a functional many-target snap aggro tool.  And Meloree will tell you that the game still lacks a good mechanic to tie DPS to tanks, completing the DPS-Tank-Healer trinity.  That role used to be filled by threat, but I think that ship has long since sailed.  But it’s hard to look at the wealth of other quality of life and toolkit improvements we’re receiving and not be very pleased with the direction Warlords of Draenor is taking.

## Post-Blizz-Con Wrap-Up, Part 1

I had a great time at BlizzCon, and met a lot of great people.  It’s really fun to meet someone in person that you’ve only ever interacted with online, either via Twitter, in blog comments, or in-game.

Unfortunately, this year I didn’t take great notes on where I was and who I met every day, so I’m not going to try and quickly write up a recap.  While I have a lot of great stories, unless I sit down and carefully re-trace my steps I’m guaranteed to forget someone or something, and then people will just feel disappointed that I forgot them.  If I have time later on, maybe I’ll try and put together that sort of recap.  But probably not – the further we get from BlizzCon, the less I’ll remember and the less relevant it will be.  Plus, there are more than a few people who would probably prefer I didn’t share all of the stories I have from when they were drunk.  You know who you are.

So I’ll just say that I had a blast meeting everyone.  I had a lot of great conversations, met a lot of people I already knew and respected, and met a lot of people that I barely knew at all beforehand.  But I was very happy to meet all of you, whether you’re e-famous, or a theorycrafter, or a player that’s read the blog once or twice, or just an avid player that recognized my name.

Instead, this blog post is going to focus on the game-related news from BlizzCon, and my reactions to that information.  And today, we’re going to start with something that you probably never expected to see on Sacred Duty: A lore discussion!

No, not that type of lore!

Time Is On My Side…..

There are very few things I disliked about the new announcements.  For the most part, I’m on board with all of the changes they’re making in Warlords of Draenor.  But the story is one of those things that will never completely sit well with me.

Now, to be fair, I wasn’t that enthusiastic about Mists of Pandaria at first either.  For too many years, I’d assumed Pandaren were a joke race that would never see the light of day in WoW, and it was hard to break that mental stigma.  Plus, really, Pokemon in my WoW? And yet, here we are two years later, and they totally pulled it off. Mists has arguably been one of the stronger expansions, both in terms of gameplay and in terms of story and lore.  Even though I’ve never participated in a single pet battle, I know plenty of people who have done so and enjoyed it.  Even though I thought Pandaren were a joke, they turned out to be a well-fleshed-out race complete with culture and history that made them pretty bad-ass.  So it’s clear that first impressions can be misleading.

However, my objection to the Warlords of Draenor story is more fundamental.  Buckle up, it’s physics time.

You see, I absolutely hate time travel in video games.  As a physicist that specialized in things like quantum teleportation and superluminal pulse propagation, I’m intimately familiar with the important and fundamental role causality plays in physics.  And that’s made me fairly intolerant of any sort of “faster than light” or “time travel” suggestions, whether it’s in the media or in a game’s story.  When my own research was featured on slashdot, I was of course ecstatic, but still a little dismayed at the sensationalist presentation.  I have a self-created mental block on the entire idea of time travel.

So the “we’re going back in time” nature of the Caverns of Time has always sort of bothered me. While I love getting to see the stories from the earlier Warcraft games recreated in WoW, it’s always been a struggle for me to reconcile the time travel elements with the rest of WoW’s story.  And don’t even get me started on the whole Rhonin / Dragon Soul storyline – it was around that time that I threw my hands up in despair and completely gave up on WoW’s story ever making sense again.

Because frankly, time travel in games and movies has rarely been done well in my experience.  Generally, the amount of hand-waving that has to be done to justify time travel just creates new paradoxes that make the whole thing feel silly to me.  Once you allow the possibility of time travel, I feel like a narrative loses a lot of its motivation.  Striving to kill <current Big Bad> becomes much less suspenseful if you know that some intrepid time-traveler will just come fix it if you screw up.  Not to mention the inherent paradox there: if they’re coming back to fix it, shouldn’t they be here now? Why aren’t they?

We Are The Worlds

WoW’s take on time travel is a fairly standard one.  As far as I can tell, it mimics the “many worlds” interpretation of Quantum Mechanics.  In layman’s terms, every time a choice is made, the world splits into two or more different timelines, one for each possible outcome of the decision.  So for example, there’s a timeline where Garrosh dies at the end of Siege of Orgrimmar, and a timeline where he’s spared and imprisoned by Taran Zhu.

When we go back in time to the Black Morass or The Battle for Mt. Hyjal, we’re sticking within our own timeline and trying to prevent the Infinite Dragonflight from altering the events of our timeline.  Which again, raises the silly paradox of one-upsmanship: if we succeed, why don’t they just go back in time again to re-alter it?

On the other hand, in the End Time instance, we’re traveling to a future timeline where Deathwing wasn’t defeated. And in the Well of Eternity dungeon, we’re going back in time to take the Dragon Soul so that Thrall can use it in the future.  But if we take the Dragon Soul out of the past, doesn’t that change how the rest of history unfolds? Why do we return to a present that seems basically unchanged from how we left it?

The only rational explanation for this… well, ok, let me step back a moment. There is no rational explanation for this, because the whole pile of time travel nonsense is inherently irrational.  But the least irrational way to rationalize this is to assume that actions in one timeline don’t necessarily affect the others.  So maybe we’re going back to alternate timelines that are eerily similar, but not the same as, our “own” timeline.  So yeah, we stole the Dragon Soul from another timeline, which then caused all sorts of chaos in that timeline, probably resulting in the deaths of millions of people or something.  But it’s okay, because those people all sucked anyway, since they weren’t from our timeline.

I shouldn’t have to point out the multitude of problems with that interpretation.  What makes our timeline (or us) special? Is the Thrall in our timeline the same as the Thrall in other timelines? Am I the same person in this timeline that I am in another? How do I know which one is the “real” one, or are we all real?  And what does that say about individuality or free will if I’m just one of an infinite number of Theck clones in different timelines, all of whom have the audacity to jump into each others’ timelines and fuck around with them?

All aboard the Crazy Train, I guess!

The story of Warlords of Draenor, as it’s been explained to us, takes this concept to the next level by linking different timelines more strongly.  It’s depicted graphically below, though I can’t take any credit for the diagram; I stumbled across it in a post by Klaudandus on Maintankadin.  Garrosh manages to travel back to old Draenor and create a new split in the timeline.  The “Alpha” timeline on the diagram is the one we know and have played through, where Garrosh never went back in time.  The “Beta” timeline is the one in which Garrosh alters the events in that timeline to create the Iron Horde.  He then somehow opens a new portal that links the Beta timeline to a point in the future of the Alpha timeline.

Mock-up of Warlords of Draenor’s time-travel story. Note that I didn’t come up with this graphic, I got it from the forum post linked in the text. If you know/are the creator, please contact me so I can give you the appropriate credit!

The reason I say it “takes this to the next level” is that instead of allowing a few individuals to hop around on the Timeline Superhighway, it’s essentially creating an on-ramp linking Interstate (“Intertime?”) Beta Draenor to Future Route Alpha Azeroth, so that anybody can hop back and forth between the two and cause chaos.  And instead of connecting two instants in time (whatever the hell an “instant in time” means in this hackneyed excuse for a coordinate system), it’s a continuous link, such that there’s a one-to-one correspondence between the time and date in Beta Draenor and the time and date in Alpha Azeroth, just with a pretty hefty offset.  I’m not exactly sure what that offset is – years? tens of years? hundreds of years? Does time even have any meaning if we’re going down this time-traveling rabbit hole?

What Did That Cat Ever Do To You?

So yeah, clearly I’m not a fan of the time traveling crap.  It’s just too problematic.  I know most people can just gloss over it and enjoy the ride, but I’m not one of those people. To me it just smacks of lazy storytelling in the same way that the many worlds interpretation smacks of lazy science.  Even calling it “science” is being generous, in fact. A large proportion of the scientific community (probably the majority of it) doesn’t consider the many worlds interpretation to be science at all, because it is inherently not falsifiable through any means we can conceive of.  In that sense, it’s no better than a religion, because we can’t test it. Scientists have long since accepted that there are other, slightly less insane ways to rationalize quantum mechanical behavior.  Though, for the reader’s amusement, I’ll note that to do so we say that we give up on “reality” to keep causality, which sounds even more off-the-wall.  But it actually does make sense once you rigorously define what we mean by “reality.”

In short, it’s the “Schrödinger’s cat” explanation you’ve probably heard about but never really understood.  You put a cat in a sealed box with some sort of random mechanism to kill the cat.  Traditionally it’s a vial of poison triggered by a chunk of nuclear material, but anything that is both random and fatal will work, so it could just as well be a pistol triggered by sunspots or a high-voltage arc triggered by seismic activity or a time traveler. Also, clearly physicists are awful human beings for doing this to a cat. Poor cat.

I’ve been waiting to use this in some context for *years*

But the point of this gruesome example is that once the box is sealed, we don’t know whether the cat is dead or alive.  We could assign a probability to each (i.e. the cat has a 50% chance of being alive), but we don’t know for sure until we open the box.  In quantum mechanical terms, until we open the box the cat is both dead and alive! Or more precisely, it’s in a “superposition state” where it is simultaneously dead and alive.  It’s only when we open the box that the cat “decides” which it is for sure.

Now that may seem ludicrous, and to be fair it is ludicrous for a cat for a few reasons.  But it’s not ludicrous for quantum-mechanical systems, which is what the principle really applies to.  A quantum-mechanical particle has a number of properties (spin, momentum, position, energy, etc.) that aren’t completely decided until it interacts with something in such a way that the property needs to have a fixed value.  For another analogy, assume the particle could be one of two colors: red or blue.  Until it interacts with another object in such a way that its color matters (for example, someone observing it), it isn’t one color or the other, it’s in a superposition state of being both red and blue.  Note that this isn’t the same as being purple!

This is what we mean by giving up “reality.”  We have this inherent notion that objects have fixed, well-defined properties – a tennis ball is yellow, my car is blue, and at any given point in time those two objects have a particular position and velocity.  We call that concept “reality” because we assume that each object has “real” properties – i.e. that the tennis ball really is a tennis ball and won’t suddenly become a baseball.  But on the quantum-mechanical level, some of that goes out the window.  If the ball could be a tennis ball or a baseball, it’s both until we make a measurement.  And when we measure it, we have a random chance of discovering that it’s either (i.e. it isn’t just that we don’t know which type of ball it is until we measure, it’s that it literally isn’t one or the other until we measure, at which point it randomly decides which one it is, as if it were flipping a coin).

The fun part of all of this is that giving up “reality” is a choice we make.  To be able to explain experimental results, specifically with regards to entanglement, we have to give up either reality or causality.  That means that, technically speaking, we could give up causality if we wanted to preserve the fixed nature of things.  Most scientists have decided that reality is the one we need to give up, though.  There’s far more evidence that causality is preserved (both in quantum mechanics and other branches of physics) than there is for the alternative, and giving up reality only means accepting that our intuition based on macroscopic objects simply doesn’t apply at the quantum level.

Mother May I?

In addition to all of the paradoxes and concerns I’ve raised already, perhaps the biggest issue I have with time travel is the notion of free will.  If you assume that time travel exists, and that people can do so willy-nilly, then eventually you need to accept that the timeline you’ve experienced has already been altered in every conceivable way possible by every person that cared to interfere with it from all times in the future.  At which point, what’s the use in caring about anything? It’s hard to make it feel like your actions matter when there’s the ever-present threat of a time traveler erasing everything you did.  And if what you’re experiencing is already the result of those efforts, did you really have a say in how things turned out? Or are you just dancing to the tune of some time-traveling puppeteer?

Some of that can be explained away by making time travel difficult, expensive, or limiting it in some other way. If only a few select people can pull off such a feat, it’s a little more palatable.  Or so the reasoning goes, I guess? I don’t really buy it, because those arguments always assume that future technological advances will never reduce that cost. The concept of nearly-instant global communication between any two people would have been unfathomable to society even 100 years ago.  Yet today we have cell phones that let us do exactly that.  And when in the future, someone discovers the time-travel equivalent of cell phones, then what? Better yet, why haven’t they brought that technology back to us already?

I should also point out that the whole “limited” point of view falls sort of flat for WoW. We have a giant portal connecting two timelines now. And for years we’ve had countless adventurers skipping back and forth through the Caverns of Time as if they were on a day cruise to Lets-Take-A-Shit-All-Over-Continuity’s-ville.

Really, the only “good” variation of time travel I’ve ever seen in a video game is in the Assassin’s Creed series.  I’m sure that biologists and geneticists reel at the entire pseudo-scientific “genetic memory” concept that the game invokes.  But the mechanic works well for a lowly physicist like me.  By retrieving “memories” encoded deep in DNA, a modern-day protagonist can go back and re-live the experiences of one of their ancient ancestors through a virtual interface.  In other words, you’re playing a video game in which your character… plays a video game about their ancestors!

Yeah, it’s sort of like that.

Really though, the animus mechanic solves all of the major problems with time-travel in games, because it’s distinctly not time travel.  It puts strict causal constraints on the problem, because you can go back and re-live the environment and world, but you can’t do anything that alters the course of history.  While it’s more constricting from a story-telling point of view, it’s also a lot more sane.

Or like that.

Well, ok, mostly sane.  Just don’t take that guy’s word for it.

In Part 2, we’ll talk about some of the actual mechanics changes that were talked about at BlizzCon, and what I think about those.  Which is probably far more relevant given the nature of this blog.  Bet you never thought you’d see a 2000-word rant on game lore at Sacred Duty!  Maybe I should have saved this post for April Fools Day?


## BlizzCon Wishlist/Predictions

With BlizzCon coming later this week, we’ll soon be inundated with information about the next expansion.  Or at least, I think most of us assume that Friday’s “World of Warcraft: What’s Next?” presentation will reveal the next expansion.  I guess it’s possible that they’ll do something completely different, like announce that the entire WoW dev team is being transferred to project Titan and that WoW will go free-to-play with user-generated raid content.  But that seems pretty unlikely.

So in the spirit of being able to look back at old posts and say “told you so,” it seemed like an appropriate time to make some predictions about what we’ll be seeing on Friday. I’m not going to speculate on the name or theme of the expansion, because frankly, lore isn’t exactly my wheelhouse.  Instead, I want to focus on mechanics.

I’ve split each main idea up into two sections: a Prediction and a Wish List.  The prediction is the general thing I think we’ll see, whereas the wish list is more of a “this is what I’d do if I were trying to address the issue.”

Hit & Expertise Changes

Prediction – I think we’ll see a change to either hit or expertise to alleviate the awkwardness of reaching both caps, and especially to try and minimize the need to use an optimizer like AskMrRobot to reconfigure your entire gear set every time you get an upgrade. It’s also just sort of weird from a systems point of view.  The best description of the problem I’ve seen can be attributed to Hamlet of EJ/Druid fame. To summarize his point: there are two systems in the game (gemming and reforging) whose primary job is to ensure that players always hit their target. In other words, these two mechanics are both trying to subvert/fix a third mechanic, which is the hit cap.

Blizzard sources have been offering up comments about this problem all expansion, so we know it’s on their minds.  It’s not a huge stretch in logic to guess that they’ll try to fix it in 6.0.

I don’t think it’s likely that both hit and expertise will go away, though.  There’s something to be said for the planning/preparation aspect of having to dance around hit and expertise cap.  But it’s far more painful than it has to be due to the way itemization is allocated.  Each item can have an essentially random amount of hit or expertise rating.  Sure, it’s constrained by a formula, but only mildly, and you can have almost any amount of hit on an item by adjusting the other stats to compensate.  That’s the reason it’s a problem – it’s not trivial to figure out exactly what needs to change on other gear when you equip an upgrade.  As anyone who’s worked on optimization problems in the past will tell you, once you have multiple parameters and several caps to consider, the math gets really ugly.

Wish List - My solution to this problem would be to make two significant changes.  The first is to remove the expertise-capping problem.  Rather than requiring 7.5% hit and 7.5% expertise, I would just require 15% hit for melee.  The first 7.5% removes misses, the second 7.5% removes dodges, just like expertise works currently with dodges and parries.  This removes the “two-cap” part of the problem and vastly simplifies the solution space.
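As a quick illustration of that single-cap idea, here is a small Python sketch. It is purely hypothetical: the constants are just the current 7.5% melee miss and dodge figures, and the function name and the "misses first, then dodges" ordering are my own framing of the wish list above.

```python
MISS_CHANCE = 7.5   # percent, the current melee hit cap
DODGE_CHANCE = 7.5  # percent, the current expertise (dodge) cap


def remaining_avoidance(hit_pct):
    """Return (miss %, dodge %) the boss still has against a given hit %.

    Hypothetical single-stat model: hit removes misses first, and any hit
    beyond the miss cap then removes dodges.
    """
    miss = max(0.0, MISS_CHANCE - hit_pct)
    overflow = max(0.0, hit_pct - MISS_CHANCE)
    dodge = max(0.0, DODGE_CHANCE - overflow)
    return miss, dodge


if __name__ == "__main__":
    for hit in (0, 7.5, 10, 15):
        miss, dodge = remaining_avoidance(hit)
        print(f"{hit}% hit: {miss}% miss, {dodge}% dodge remaining")
```

With only one stat feeding both effects, there is a single number to track (15%) rather than two caps that have to be balanced against each other.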

I wouldn’t remove expertise as a stat, though.  I would convert it into another combat benefit.  Potentially something like The Secret World’s ‘Crit Power,’ which increases the damage you do with a critical strike.  Instead of crits automatically doing double damage, they could do 150% damage baseline, and each point of expertise increases that by 1% (i.e. to 151%).  They’ve already laid the groundwork for this sort of effect with the crit amplification trinkets in Siege of Orgrimmar, which might be an indication that they’re testing the waters for this idea.

Finally, I’d make one slightly more radical change to the way hit is itemized – I would quantize it.  In other words, let’s say that we need 10,000 hit rating to cap (choosing a nice round number).  Items would never contain a random amount of hit rating.  There would be nothing in the game that gives 2751 hit rating, like there could be currently.  Instead, hit rating only ever shows up in multiples of 1000.  A piece of armor could have 1000 hit rating, 2000 hit rating, 3000 hit rating, and so on.  The stats on the piece would still be bound by the itemization formula, so the second stat would just soak up the rest of the item budget.  In other words, a pair of hit/crit legs with 3000 hit would have more crit than a pair that had 4000 hit.

Similarly, gems would have even multiples, maybe 1000 each.  Getting a new upgrade would then generally mean you need to swap a few hit gems around, but the math would be very easy – just replace N gems, where N*1000 is the amount of hit you need.  Reforging could work as-is, though it would probably be easier if the amount of reforge could be increased to 50% just so that you could reforge hit to other things without getting odd amounts of hit.  While you could reforge a non-hit piece (i.e. crit/mastery) into hit and get an odd number, in practice this shouldn’t be terribly necessary since you could just gem for it.  There could always be edge cases where it’s slightly more optimal to do so, but for the most part they’ll be insignificant enough that all but the very hardcore could ignore them.
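As a rough illustration of how easy the gem math becomes, here is a small sketch using the round numbers above (a 10,000 rating cap and 1000-rating steps).  The function name and the rule that every hit source is a multiple of 1000 are assumptions taken from this wish list, not real game values.

```python
HIT_CAP = 10_000   # the "nice round number" cap from the example above
HIT_STEP = 1_000   # every item and gem carries a multiple of this


def hit_gems_to_swap(current_hit, old_item_hit, new_item_hit):
    """Return how many 1000-hit gems to add (positive) or remove (negative)
    so that total hit lands exactly on the cap after an item swap."""
    for value in (current_hit, old_item_hit, new_item_hit):
        if value % HIT_STEP != 0:
            # Under quantized itemization this should never happen.
            raise ValueError(f"{value} is not a multiple of {HIT_STEP}")
    hit_after_swap = current_hit - old_item_hit + new_item_hit
    return (HIT_CAP - hit_after_swap) // HIT_STEP


# Capped at 10,000; replacing 3000-hit legs with 1000-hit legs.
print(hit_gems_to_swap(10_000, 3_000, 1_000))   # add two hit gems
# Capped at 10,000; replacing 3000-hit legs with 4000-hit legs.
print(hit_gems_to_swap(10_000, 3_000, 4_000))   # remove one hit gem
```

Compare that with today's situation, where the same swap can leave you a few hundred rating over or under the cap and force a full re-reforge.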

Tanking Mechanics

Prediction – I think that we’ll see Warriors and Death Knights get the full “DPS tank” treatment that Monks and Druids, and to a lesser extent Paladins, have gotten.  It’s clear from Mists that tanks love this new model, especially when paired with active mitigation.  They opened a Pandora’s Box when they gave Paladins Sanctity of Battle, and rather than close the box they decided to embrace the new paradigm, even putting haste on our gear.  The success they’ve had with Paladins, Monks, and Druids will lead them to drive the remaining two “old-school” tanking classes in the same direction, fully embracing DPS stats like crit and haste as true tanking stats.

Wish List – There are lots of ways they can pull this off, so it’s not worth trying to go into too much detail.  But it’s clear that Riposte is just a band-aid for tank DPS, not a true active mitigation tie-in.  I would probably do something like tie critical strike rating into crit block chance, which indirectly ties it into rage generation, and a similar crit->resource conversion for Death Knights (maybe triggering Death Runes?).  But the sky’s the limit here, just because the problem is so open-ended.  The end goal is clear though – make all tanks value crit/haste/mastery gear so that DPS fluctuations over a tier or expansion are less significant.

As for Paladins, I think we’re pretty well-off already.  At most I might give us a more direct tie-in with crit.  Right now, it has some value via Eternal Flame, but it’s not a large amount.  If Seal of Insight and Sacred Shield could also crit, then it might have a reasonable (if not substantial) value to a tankadin.

Which leads us into our next section….

Dodge & Parry Changes

Prediction – I think we’ll see Dodge and Parry reduced in significance for tanks.  Passive mitigation and avoidance have never been all that interesting or dynamic for the tank, and that’s nowhere better exemplified than by the envy our Warrior and Death Knight brethren have been communicating to us with dagger stares all expansion.  I think that next expansion will have a stronger emphasis on “active” avoidance triggers like Grand Crusader while also reducing the amount of Dodge and Parry showing up on gear.

Wish List – Those last two thoughts may seem inconsistent, but they’re really not.  To explain: I think that avoidance should show up less on gear than it does now (and more haste, mastery, and crit in its place), but the avoidance that does show up should also be more powerful.  One of the big problems with avoidance is that we have so much of it on gear that we’re already feeling stung by the diminishing returns curves.  If we had far less of it itemized on gear, each of those rating points could be more powerful, and thus make it a competitive stat.

How would I go about doing this? Well, for starters, I would nerf the Strength-to-Parry conversion rate (and similarly the Agility-to-Dodge rate for Druids and Monks).  Free avoidance from gear is what we’re trying to mitigate, so having that giant source of it is detrimental to our goal.

The other thing I’d do is make sure that dodge and parry were mutually exclusive on gear.  In other words, you would never get a dodge/parry combination item.  Parry and dodge would always be paired with something else – hit, expertise, crit, haste, or mastery.  This effectively halves the possible amount of avoidance on gear, which means each point can be almost twice as effective.  Thus, the rating conversion could be brought back down on-par with the other secondary stats.  That would let dodge and parry feel like important stats again, because you’d really feel the difference between a low-avoidance build and a high-avoidance build.

Item Squish

Prediction – It’s not really a huge surprise, since we know this is coming.  Hopefully we’ll learn more about the actual details at BlizzCon.  But in any event, I predict we’ll get that information on Friday.

Wish List – There’s a huge parameter space here, so it’s not really worth speculating on exact mathematical details.  For example, they could just slash the item level increases for everything pre-MoP so that we’re back to something in the 200 range.  Or they could just completely re-tune the formula used for itemization (though that has its own side-effects).  Or they could do some sort of time-warp-esque trick where stepping into a new expansion automatically reduced the ilvl of previous-expansion gear, such that next expansion starts at ilvl 100 again.  Or a million other possible solutions.

My guess, though, is that it’ll be something permanent rather than a phased effect.  In other words, each item will get a new item level that’s significantly lower than the current value, and the tuning of all content in the game will be squished according to the same formula so that DPS and healing expectations are more or less unaffected.

Everything Else

I have lots of other expectations too, but most of them are pretty banal.  For example, I don’t expect an overhaul of our basic mechanics (holy power, active mitigation, etc.) or talents.  We’ve gone through a lot of iteration over the past two expansions, but the last few patches have seen very minor adjustments in terms of core class mechanics, so I expect we’re pretty much where the devs want us by now.  I expect we’ll get some new glyphs, possibly a new spell at 95/100 (or whatever the new cap will be), but that’s pretty bog-standard by this point.

I suspect we’ll finally see new character models, which is interesting, but not really a mechanics thing.  I don’t really have any idea if we’ll get a new class or new races, but my guess would be that new races are more likely than a new class since Blizz seems to alternate.  If they do add new races, I hope they take that opportunity to re-balance racial bonuses to be less significant in combat.  And I’d guess that we’ll see some new feature that utilizes the level-scaling tech that’s been observed in bugged instances over the past month or so.

That’s all I have time for now, gotta pack for BlizzCon.  If you’re going, be sure to plan to attend Palapalooza! on Saturday at 4PM at the Meeting Stone, where you’ll be able to meet me, Anafielle, and Meloree, our friends Rhidach and Antigen (from the Righteous Defense blog), and a host of paladins that are a lot more famous than we are (Treck, Slootbag, Absallom, Towelliee, and Kerriodos have all said they’d be there)!

## Simulationcraft v540-5, Warcraft Logs, and BlizzCon

Sorry that I’ve been silent for a while; the last few weeks have been very busy.  Between family obligations and a number of other projects, I just haven’t had a lot of time to blog.  Also, I was pretty happy with that last blog post, and figured it could stand to have some more front page time.  But things are slowing down for me in real life, so it’s about time I got us caught up.

Simcraft

First of all, Simulationcraft version 540-5 has been released this week.  As always, you can get it at the download page.  It contains several notable improvements/changes for us:

Changelog

• Updated TMI bosses
    ◦ Damages fine-tuned to be closer to Garrosh damage
    ◦ DoT damage fixed at 5% of raw melee damage
    ◦ T17Q boss added for fun / playing with full heroic profiles
• HTML reports re-written to show healing (HPS) and absorption (APS) independently for clarity
    ◦ Charts still show HPS+APS like they did before
• Divine Protection bugfix (this was actually in 540-4, I just didn’t blog about it)
• Retribution APL tweaked for higher DPS, T16 profiles updated

To illustrate the reporting issue, here’s how it looks for a T16N profile that’s been tweaked to use Sacred Shield instead of Eternal Flame:

Simcraft 540-5 report showing absorbs.

The “Results, Spec, and Gear” section now breaks down HPS and APS independently in a more readable fashion.  So in this case, we generated 85892.2 healing per second, ignoring overhealing, and our Sacred Shield gave us 37299.5 absorption per second.  The title line reports this as total HPS+APS, which is 123192 HPS, followed by the absorption in parentheses to tell you how much of that total was due to absorbs.
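If you want to sanity-check the title line against the breakdown, the arithmetic is just a sum.  The sketch below uses the two numbers from the report above; the variable names are mine, and the absorb share is something I'm computing for illustration rather than a value the report prints.

```python
hps = 85892.2   # healing per second from the report, overhealing ignored
aps = 37299.5   # absorption per second from Sacred Shield

total = hps + aps            # the combined HPS+APS shown in the title line
absorb_share = aps / total   # fraction of the total that came from absorbs

print(f"total HPS+APS: {total:.0f}")
print(f"absorb share:  {absorb_share:.1%}")
```

The rounded total matches the 123192 figure in the title line, and roughly three tenths of it comes from absorbs.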

Scaling Problems

I’ll also note that the results you see with these bosses are a little weird.  For example, if you sim the T16H profile against the T16H25 TMI boss, you get a health timeline that looks something like this:

T16H profile is self-sufficient against the T16H25 boss.

In other words, we’re generating more self-healing than the boss is dishing out in DPS.  This surprised me at first, but as far as I can tell the simulation is fairly accurate.  It turns out that Eternal Flame and Seal of Insight scale very well with attack power, and thus with Vengeance.  At a certain Vengeance level (reached around 10N T16 levels in appropriate gear) we simply become self-sufficient against this boss thanks to the sheer AP scaling of those two skills.

This means that at the top end, the TMI results will be fairly insensitive to small stat changes, and thus if you generate stat weights you might get garbage.  And unfortunately, cranking up the boss doesn’t always fix the problem – in many cases the sheer Vengeance increase is large enough that it doesn’t matter.

I haven’t found a good solution to this yet, unfortunately.  Part of the issue is that the simulation is very rigid – the boss always attacks you every 1.5 seconds, so your Vengeance timeline looks something like this:

Vengeance timeline for T16H profile against T16H25 boss.

And note that this isn’t the averaged version – this is for a one-iteration sim.  Vengeance flatlines after the ~20-30 second ramp-up period, which makes you a self-sufficient healing dynamo.  In reality, bosses look like this:

AP for a real encounter, in this case mine for 25H Malkorok

This is total AP, but obviously most of that is Vengeance.  I don’t start tanking on this boss (25H Malkorok), so my AP is pretty flat until I taunt.  Then we have the slow ramp-up from ~200k to ~500k as I take damage.  Then the other tank taunts and my Vengeance begins to decay away.  And this cycle repeats itself over the course of the fight.

Simulationcraft has no concept of aggro, so we can’t do tank swaps (yet).  But in any event, it’s clear that at no point do we have a steady 500k Vengeance at our disposal.  Our average while tanking is probably closer to 300k-350k, so our survival tweaks should be aimed at optimizing around that value.  Which means we somehow have to artificially decrease the Vengeance we have in Simcraft.

My tentative plan at the moment is to introduce a damage source that doesn’t grant Vengeance, like a ground effect that your tank automatically stands in.  Then the TMI bosses would always include some amount of damage from this source to artificially reduce the Vengeance that boss grants.  I don’t think I’m going to bother doing this until the next expansion, which gives me some time to think about whether this is really the best implementation or not.  I’m open to alternatives if people want to suggest them.
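Here is a back-of-the-envelope sketch of how that split might be sized.  It assumes steady-state Vengeance scales linearly with the Vengeance-granting damage you take, which is a simplification on my part; the function name and the 400k total boss DPS in the example are purely hypothetical, and none of this is actual Simulationcraft code.

```python
def split_boss_damage(total_dps, peak_vengeance, target_vengeance):
    """Split a boss's damage into a Vengeance-granting part and a
    non-Vengeance "ground effect" part.

    Assumption: steady-state Vengeance is proportional to the
    Vengeance-granting damage taken, so hitting the target only requires
    scaling that portion by target / peak.  Total damage is unchanged.
    """
    if not 0 < target_vengeance <= peak_vengeance:
        raise ValueError("target must be positive and no larger than peak")
    vengeance_fraction = target_vengeance / peak_vengeance
    melee_dps = total_dps * vengeance_fraction   # still grants Vengeance
    ground_dps = total_dps - melee_dps           # grants no Vengeance
    return melee_dps, ground_dps


# Flatline of ~500k in the sim vs. a ~325k average while tanking in logs.
melee, ground = split_boss_damage(400_000, 500_000, 325_000)
print(f"Vengeance-granting: {melee:.0f} DPS, ground effect: {ground:.0f} DPS")
```

Under that assumption, bringing the sim down from ~500k to the 300k-350k range means somewhere around 30-40% of the boss's damage would have to come from the non-Vengeance source.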

Warcraft Logs

In case you were wondering where that sexy AP vs. time graph came from in the last section, it’s from Warcraft Logs.  This is a new competitor to World of Logs designed and run by ex-Tankadin (now Feral Druid) Kihra of Temerity.  I’ve been helping him bugtest it over the past month or two, though I really don’t deserve much credit since all I do is upload some logs and complain to him about things that are broken.  It’s just recently gone into its Friends & Family alpha phase, so the client is invite-only at this point.  But if you want to play around and see what sorts of things it can do, you can dig through my guild’s logs here: http://www.warcraftlogs.com/guilds/6/

There are a lot of features that are planned for this site that I’m not comfortable talking about without Kihra’s permission, but if they all pan out this will be the go-to site for combat log analysis.  I really can’t wait until it becomes publicly available. And of course, one of the features I will incessantly nag Kihra to implement until he does so is automatic calculation of TMI for an encounter based on your health resource log.

Anyway, I encourage you to peruse the site and see what it can do.  Note that I’m not directly involved with the site at all, just a friend of Kihra’s who offered to help test the site.  So if you’re interested in participating in the beta, you’re probably better off watching @Kihra and @WarcraftLogs on Twitter.

Hopefully I’ll have some time in the next month or two to do a full blog post on the site with a basic How-To guide and maybe highlight some of the more novel features.  We’ll see though, we’re only a week away from BlizzCon!

BlizzCon

Speaking of BlizzCon, I’ll be attending this year.  I haven’t figured out my complete schedule, but I do know I’ll be at the WoWInsider/Wowhead party on Thursday night at the Anabella.  Beyond that nothing’s certain, but I’ll probably be at the Hilton parties Friday and Saturday night.  I probably won’t have a custom badge or anything, but if you look for a short guy in jeans and a polo shirt wearing a “Hi, My Name Is Theck” sticker, you might spot me.  If you want to meet up or say hi, watch my Twitter – I’ll probably be tweeting where I’ll be during the con so that people can find me.