The Making of a Metric: Part 3

In our last installment, we nailed down the weight factors we’ll use for our smoothness metric.  Today we’re going to wrap it up by specifying normalization conditions for the histogram and formally defining the metric. To refresh your memory a bit (or if you’ve joined us mid-stream), to get to this point we’ve done the following:

1) Recorded a damage and healing (or “tank health change”) timeline during a simulation.  This is basically a list of every time you take damage or healing along with the timestamp at which that event occurred.

2) Calculated a moving sum of that timeline over 4 boss attacks, or equivalently over 6 seconds of real-time.  This gives us a new array representing all of the potential 4-attack damage spikes we could take, and is the source data we use for the smoothness analysis tables we’ve been using for the past 6 months or more.

3) Generated a histogram of that moving sum data.  Again, this is just like what we’ve presented in the smoothness analysis tables, just done graphically and with finer bins.

4) Developed a weight function that we can use with the histogram.  Multiplying the histogram by the weight function will preferentially value high-damage spikes and devalue weak spikes.

5) Roughly defined the metric as this multiplication of histogram and weight function.
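
As a rough illustration of steps 2 through 5, here is a minimal Python sketch (the real calculations were done in MATLAB). It assumes the timeline has already been reduced to one net health-change entry per boss attack; the function names `moving_sum`, `weight`, and `rough_metric` and the bin count are made up for this example.

```python
import numpy as np

def weight(x):
    """Weight function w(x) = exp(10 ln(3) (x - 1)), with x in units of player health."""
    return np.exp(10.0 * np.log(3.0) * (x - 1.0))

def moving_sum(events, n):
    """Sum of every run of n consecutive entries (step 2)."""
    return np.convolve(np.asarray(events, dtype=float), np.ones(n), mode="valid")

def rough_metric(events, player_health, n=4, bins=100):
    """Histogram the health-normalized moving sum and weight it (steps 3-5)."""
    spikes = moving_sum(events, n) / player_health
    counts, edges = np.histogram(spikes, bins=bins)   # step 3: histogram
    centers = 0.5 * (edges[:-1] + edges[1:])          # representative value per bin
    return float(np.sum(counts * weight(centers)))    # steps 4-5: weight and sum

# Two hypothetical timelines with the same total damage (in units of player health)
smooth = [0.25] * 8
spiky = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
print(rough_metric(smooth, 1.0), rough_metric(spiky, 1.0))
```

The spiky timeline scores far higher than the smooth one even though both take the same total damage, which is the behavior the weight function was designed to produce.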

Now we want to refine the metric by considering the appropriate normalization conditions.


There are a few reasons for normalizing the histogram before computing TMI.  The first and foremost is that it makes the number you get more consistent between different experimental setups. In the previous two posts, the data I provided was normalized only by player health (i.e., along the x-axis).  Everything else was left as-is for a 4-attack moving sum.  Note that I said sum, not average – I wasn’t even normalizing with respect to time.  That’s why we got numbers that looked like this:

|    Set |    TMI |
|   C/Ha |  18332 |
|   C/St |   7895 |
|   C/Sg |  16102 |
|  C/Shm |  22631 |
|   C/Ma |  41994 |
|   C/Av |  63949 |
|  C/Bal |  40468 |
|   C/HM |  23096 |
|     Ha |  49835 |
|  Avoid | 231586 |
| Av/Mas | 229068 |
| Mas/Av | 190308 |
|   Ha/h |  31126 |
|  Ha/he |  27795 |
|  C/Str |  66023 |

However, that was with 10k minutes of simulation.  If we had run for 20k minutes, they would be roughly double those values, and if we had run for 5k they would be half as large. It would be ideal if they all gave roughly the same ballpark TMI value within error since they’re all simming the same setup, just to different levels of precision.  So one additional variable we want to normalize with respect to is simulation length.

Similarly, if we decided to calculate TMI with a 5- or 6-attack moving average instead of a 4-attack moving average, it would be nice if the values came out relatively close.  As we’ll see, we can’t make them perfect, but we can get them in the right ballpark.  So that’s another variable we want to include in our normalization: the time window over which we perform our moving average.  In essence, this is really just saying that we want to perform a true moving average of the damage timeline rather than a moving sum.

The first of those two is very easy, so we’ll save it for later.  Let’s instead look at the time window normalization.  To illustrate the point a little more clearly, here are the histograms that you get if you perform a moving sum of the damage timeline from the repeatability data set I used in the last blog post for different numbers of attacks $N$ ranging from 2 to 7:


Histogram after only health normalization. Each panel shows the histogram for a different number of attacks being considered.

It should be obvious that the distribution is shifting upwards roughly linearly, because we’re adding successively more attacks together in our moving sum.  If we were to apply the weight function at this point, we’d get weighted histograms that look like this:


Weighted histogram after only health normalization. Each panel shows the histogram for a different number of attacks being considered.

Of course this skews the TMI values you get pretty heavily.  This is what the TMI looks like for those different plots:

| # attacks |   2 |    3 |     4 |      5 |      6 |       7 |
|       TMI | 468 | 2041 | 18071 | 109300 | 504459 | 4723791 |

Remember that this is the same data, just averaged differently.  Ideally we want this to be a little more stable.

The first step is the obvious one: use a moving average instead of a moving sum.  In other words, divide each moving sum by the appropriate $N$.  I’m going to add one wrinkle to that procedure: I’m also going to multiply by 4.  Why?  Because so far, we’ve been designing the metric around a 4-attack moving average, which nicely puts the bulk of the distribution’s value around the 100% of our health mark.  I wouldn’t need to do this, of course – I’m just multiplying by an arbitrary constant, so it won’t change the relative values of anything.  But it will make the plots look nicer and keep consistency with what we’ve done already.
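
A minimal Python sketch of that rescaling, assuming one entry per boss attack in the input array (the name `scaled_moving_average` and the `n_ref` parameter are invented for illustration):

```python
import numpy as np

def scaled_moving_average(attacks, n, n_ref=4):
    """N-attack moving sum multiplied by n_ref/N.

    Dividing by n turns the sum into an average; multiplying by n_ref=4
    keeps the values on the same scale as the 4-attack moving sum.
    """
    sums = np.convolve(np.asarray(attacks, dtype=float), np.ones(n), mode="valid")
    return sums * (n_ref / n)

# With a constant 10%-of-health hit, every window length gives the same value
constant_hits = np.full(20, 0.1)
for n in range(2, 8):
    print(n, scaled_moving_average(constant_hits, n)[0])
```

For `n=4` the factor is exactly 1, so the 4-attack results are unchanged by this step.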

So if we multiply each moving sum by $4/N$, we get unweighted histograms that look like this:

Raw histogram after health and time normalization. Each panel shows the histogram for a different number of attacks being considered.


That looks a lot better.  The distributions all have the same mean value now (a little less than 0.5, or 50% player health), so that should fix up our TMI weightings, right?  Well, not quite.  Here’s what you get for TMI in this case:

| # attacks |      2 |     3 |     4 |     5 |    6 |    7 |
|       TMI | 369918 | 28821 | 18071 | 10168 | 6349 | 5884 |

We now have the opposite problem: TMI is going down as $N$ goes up.  What’s going on here? The answer lies in the histogram plots above.  But as a hint, here are the associated weighted histograms.  See if you can figure out what’s wrong:


Weighted histogram after health and time normalization. Each panel shows the histogram for a different number of attacks being considered.

The problem we’re seeing is actually caused by two factors.  The first is that while the distribution may be centered at the same value, it’s not the same width.  A 7-attack moving average gives a much narrower distribution than a 2- or 3-attack moving average.  The second factor is our exponential weight function, which magnifies that difference.  Wider distributions include more values at higher percent-health values, which get weighted exponentially more.

If we wanted to model this exactly, we’d estimate the distribution as a Gaussian function of the form $e^{-a(x-1)^2}$ and then multiply by our weight function $w(x)=e^{10\ln(3)(x-1)}$.  Treating these as continuous functions and making the change of variables $y=x-1$, we get the following integral:

$$TMI \propto \int e^{-ay^2+by}dy$$

where I’ve used $b=10\ln(3)$ to make it simpler. By completing the square we can show that this expression evaluates to

$$TMI \propto \sqrt{\frac{\pi}{a}}e^{b^2/4a}$$
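
For anyone who wants the intermediate step, here is the completing-the-square calculation written out. It assumes $a>0$ and that the integral runs over the whole real line, which the expression above leaves implicit:

$$ -ay^2 + by = -a\left(y - \frac{b}{2a}\right)^2 + \frac{b^2}{4a} $$

$$ \int_{-\infty}^{\infty} e^{-ay^2+by}\,dy = e^{b^2/4a}\int_{-\infty}^{\infty} e^{-a\left(y-\frac{b}{2a}\right)^2}dy = \sqrt{\frac{\pi}{a}}\,e^{b^2/4a} $$

The last step uses the standard Gaussian integral $\int_{-\infty}^{\infty} e^{-au^2}du = \sqrt{\pi/a}$ with $u = y - b/2a$.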

Now, here’s the important part.  The constant $a$ is related to the width of the distribution – it’s actually inversely proportional to the square of that width.  And since the width seems to shrink as $N$ grows, that means $a$ grows with $N$.  Technically it’s proportional to some function of $N$, because we don’t know exactly how the two are related, but we can estimate it as a power-law effect: if the width goes as $N^{-k}$, then $a \propto N^{2k}$.  Substituting that into the expression above and throwing away all unnecessary constants, we have:

$$ TMI \propto \frac{e^{~c/N^{2k}}}{N^k} $$

where $c$ is a constant determined by the exact composition of $b$ and $a$.  Thus, if we want to normalize our data properly, we’d want to multiply our current TMI metric by the inverse of this, namely $N^k e^{-c/N^{2k}}$.  We could try to fit our data to this form and get a value for $c$ and $k$ (and I did), but in practice that’s not so useful.  First, because our histogram isn’t really Gaussian to begin with, especially for lower $N$ values.  Second, because the histogram shape changes from gear set to gear set, so even if we could nail down $c$ and $k$ for one gear set it may differ for another.

Instead, I’m going to take a less accurate but simpler approach.  The point of this normalization was not to make the numbers uniform across all moving average lengths, just to bring them closer together.  So we’ll drop the exponential factor and just try multiplying by $N^k$ while calculating TMI.  Fooling around a bit, $k=2$ seemed to be fairly effective; here’s what we get if we do that:

| # attacks |       2 |      3 |      4 |      5 |      6 |      7 |
|       TMI | 1479670 | 259389 | 289132 | 254188 | 228567 | 288322 |

Much better! Now they’re all within a moderately small range, from about 230k to 290k, except for the 2-attack moving average. The 2-attack moving average is too far gone to fix, to be honest.  That part of the curve is where the exponential factor we dropped makes a big difference, and it’s also the distribution that deviates most from Gaussian.  Since a 2-attack moving average isn’t something we ever worry about much anyway, it’s reasonable to exclude it as irrelevant and focus on making the 3- to 7-attack moving averages better.

There’s one more normalization step, which is the one I mentioned at the beginning: simulation length.  This one is easy though, because we just end up dividing by a constant value.  In this case, it’s the number of attacks we’ve received, which is 400k.  So we do that, which gives us:

| # attacks |       3 |       4 |       5 |       6 |       7 |
|       TMI | 0.64847 | 0.72283 | 0.63547 | 0.57142 | 0.72081 |

Pretty nice!

There’s one final step I want to include though.  While it doesn’t make any difference in the results, I want to multiply by a constant factor of 10000.  Why?  Most people will have more trouble remembering and interpreting a decimal like 0.7208 than they will a rough estimate like 7000.  Keep in mind that we expect to see smaller TMI values, and it will get unwieldy to try and describe TMIs of 0.02 vs. 0.03 vs. 0.04 when we could just be talking about 200, 300, and 400.  It also gives a clearer impression of the amount of change, because going from 0.02 to 0.04 doesn’t seem like a big difference, but 200 to 400 does.

That gives us values that look like this:

| # attacks |    3 |    4 |    5 |    6 |    7 |
|       TMI | 6485 | 7228 | 6355 | 5714 | 7208 |
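
The whole chain of scalings can be checked with a few lines of Python. This sketch starts from the rounded, time-normalized TMI values quoted earlier, so it reproduces the tables only to within rounding; the names `normalize`, `ATTACKS`, and `SCALE` are invented here.

```python
# Time-normalized TMI values from the table above, keyed by number of attacks N
time_normalized = {2: 369918, 3: 28821, 4: 18071, 5: 10168, 6: 6349, 7: 5884}

ATTACKS = 400_000   # boss attacks in the 10k-minute simulation
SCALE = 10_000      # cosmetic constant to avoid small decimals

def normalize(tmi, n, k=2):
    """Apply the N^k width correction, divide by simulation length, then rescale."""
    return SCALE * tmi * n**k / ATTACKS

for n in range(3, 8):
    print(n, round(normalize(time_normalized[n], n)))
# prints 6485, 7228, 6355, 5714, 7208 for N = 3..7
```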

We now have a complete definition of TMI. We’re not quite done yet though, as we can make a fairly significant simplification.

Cutting out the middle man

Up until now I’ve framed everything in terms of analyzing histograms because that’s what we do when we make our qualitative assessments.  But it’s not actually necessary for the numerical version – in fact, it decreases accuracy to use it in the process.

To illustrate why, here’s a simple example.  Let’s say we have the data set:

{ 2, 3, 4, 5, 6, 7, 8, 9 }

Let’s also assume that we use coarse bins for the histogram of this data, perhaps 3 units wide centered at 0, 3, 6, 9, 12, etc.  Our histogram would look like this:

0: 0
3: 3
6: 3
9: 2
12: 0

Now let’s say we perform the weighted average of the histogram, but for simplicity we use a flat weighting rather than an exponential one.  Averaging the histogram gives us:

$$ \frac{0*0 + 3*3 + 6*3 + 9*2 + 12*0}{3+3+2} = \frac{45}{8} = 5.625 $$

But if we just took the average of the source data, we’d get:

$$ \frac{2 + 3 + 4 + 5 + 6 + 7 + 8 + 9}{8} = \frac{44}{8} = 5.5 $$

Now of course, the result gets closer the more bins we use.  If we used a bin width of one centered at 0, 1, 2, 3, etc., we’d get identical results.  But at that point you’re not really accomplishing anything with the histogram at all, because every data point has its own bin.

The same is true in our case.  We have an array of moving average values, and while it’s convenient to bin them and show them as a histogram for plots, it’s not at all necessary for calculations.  Rather than calculating a weight factor based on the bin center and multiplying by the number of elements in the bin to get our weighted result, we could just calculate the weight function based on each data point itself and sum the result.
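
Here is the toy example above as a short Python sketch, comparing the binned calculation with the direct one. The helper names `binned_average` and `direct_average` are made up for illustration, and the function `f` stands in for whatever weighting is applied to each value (the identity in the flat example).

```python
import numpy as np

data = np.array([2, 3, 4, 5, 6, 7, 8, 9], dtype=float)

def binned_average(values, centers, width, f=lambda x: x):
    """Average of f evaluated at bin centers, weighted by bin counts."""
    centers = np.asarray(centers, dtype=float)
    edges = np.append(centers - width / 2, centers[-1] + width / 2)
    counts, _ = np.histogram(values, bins=edges)
    return np.sum(counts * f(centers)) / counts.sum()

def direct_average(values, f=lambda x: x):
    """Average of f evaluated on each data point, with no histogram involved."""
    return np.mean(f(values))

print(binned_average(data, [0, 3, 6, 9, 12], 3))   # 5.625 (coarse bins)
print(binned_average(data, np.arange(13), 1))      # 5.5   (one value per bin)
print(direct_average(data))                        # 5.5
```

Passing an exponential `f` instead of the identity makes the gap between coarse bins and the direct calculation larger, since the weight changes quickly across a wide bin.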

So we can completely cut out the histogram and go directly from the moving average array to the final TMI calculation.  With that simplification, we have the final process we’ll use to calculate TMI.  I’ll reiterate that entire process below so that we have it all in one place.

Formal definition of the Theck-Meloree Index

Note: While I’ve done everything so far in MATLAB, eventually we’ll want to do this all in Simcraft.  So even though I’ve used boss melee attacks as my default time window (i.e. a 4-attack moving average), we’re going to define the metric in terms of seconds here instead.

To calculate TMI from a damage (and healing) timeline $D$ with time bins of width $dt$, we perform the following operations:

1) Calculate the 6-second equivalent moving average array of damage over $T$ seconds for the entire simulation length $\tau$ (also expressed in seconds).  We will use $T_0=6$ to represent this standardized window size.  This step can then be formally expressed for the $i^{th}$ element of the resulting moving average array as:

$$ MA_i = \frac{T_0}{T}\sum_{j=1}^{T / dt} D_{i+j-1} $$

which produces an array $MA$ of length $M=(\tau - T)/dt$.  It is also acceptable to use an apodized moving average that produces an array of length $M=\tau/dt$.  Note that this step includes normalization for the time window we’re considering ($T_0/T$).

2)  Calculate the exponentially-weighted average of the moving average array as follows

$$ \Large {\rm TMI} = \frac{10000 T^2}{M} \sum_{i=1}^M e^{10\ln(3)(MA_i/PH-1)} $$

where $PH$ is the player’s health.  Note that this step normalizes for player health, fight duration (through $M$), and includes the normalization factor for moving average length ($T^2$).
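
The two steps translate fairly directly into code. What follows is a minimal Python sketch, not the MATLAB or Simcraft implementation. It assumes `D` holds the net health change in each time bin, that $T/dt$ rounds cleanly to an integer, and that the weight uses base 3 as in the weight function $w(x)$ introduced earlier. It also takes $M$ to be the actual length of the un-apodized moving average array, which is one element longer than $(\tau - T)/dt$.

```python
import numpy as np

def tmi(D, dt, player_health, T=6.0, T0=6.0):
    """Sketch of the TMI calculation from a binned damage timeline D."""
    D = np.asarray(D, dtype=float)
    window = int(round(T / dt))                 # number of bins per window
    if D.size < window:
        raise ValueError("timeline is shorter than one moving-average window")
    # Step 1: moving sum over T seconds, rescaled to the T0-second equivalent
    ma = (T0 / T) * np.convolve(D, np.ones(window), mode="valid")
    # Step 2: exponentially weighted average, normalized by health, length, and T^2
    weights = np.exp(10.0 * np.log(3.0) * (ma / player_health - 1.0))
    return 10000.0 * T**2 / ma.size * weights.sum()

# Hypothetical tank losing exactly 100% of its health every 6 seconds (1.5 s bins)
steady = np.full(400, 0.25)
print(tmi(steady, dt=1.5, player_health=1.0))
```

Passing a different window, for example `T=9.0`, evaluates the same expression with a 9-second moving average while still rescaling to $T_0=6$.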

And that’s it!  At some point in the future I’ll post a complete standardization reference that includes this information and more (including things like the standard boss settings), probably as a separate post.  But for now that should work for us.  Note that this is really a proto-definition; we’ve found numbers that work well in MATLAB with this particular standard boss, but when we implement this in Simcraft we may need to tweak the normalization factors slightly.  For example, I changed $N^2$ to $T^2$ when it should really be $(T/1.5)^2$, which adds a multiplicative factor of 2.25.  I’m not worrying about that level of detail just yet though, as we can just change 10000 to whatever we want to soak up those flat multiplicative variations.

The only thing I want to add at this point, because I’ll be using it in the next section, is a nomenclature detail.  The term TMI will properly refer to the metric as calculated using a 6-second (or 4-attack) moving average (i.e. $T=T_0 = 6$).  If we want to refer to the metric as calculated using a different $T$, we will make that clear by calling it TMI-T, such as TMI-9 for a number calculated using a 9-second moving average.  Note that it’s still normalized to $T_0=6$, just like we’ve done in the histogram figures above.  That also means that TMI-6 is the same thing as just saying TMI.

Comparing gear sets

Now that we’ve got the final form of TMI, let’s see how this works for the gear sets we investigated in Part 2.  Here’s the full TMI matrix for TMI-4.5 (3 attacks) through TMI-10.5 (7 attacks) for all of the gear sets:

pct=100.00, N=200, vary hdf

|    Set | TMI-4.5|   TMI | TMI-7.5| TMI-9 | TMI-10.5|
|   C/Ha |   6510 |  7333 |   6448 |  5782 |    7308 |
|   C/St |   2579 |  3158 |   3186 |  3133 |    4032 |
|   C/Sg |   5595 |  6441 |   5956 |  5487 |    7027 |
|  C/Shm |  12329 |  9053 |   9317 |  8583 |    8681 |
|   C/Ma |  35402 | 16798 |  14027 | 12395 |   12189 |
|   C/Av |  69139 | 25579 |  23279 | 22235 |   21745 |
|  C/Bal |  26551 | 16187 |  15309 | 13198 |   12505 |
|   C/HM |   9400 |  9238 |   7226 |  6235 |    7706 |
|     Ha |  23534 | 19934 |  13139 | 12983 |   13479 |
|  Avoid | 293086 | 92635 |  74288 | 60576 |   49226 |
| Av/Mas | 289307 | 91627 |  70596 | 53821 |   42384 |
| Mas/Av | 198467 | 76123 |  58722 | 43421 |   34818 |
|   Ha/h |  14302 | 12451 |   9569 |  9155 |    9841 |
|  Ha/he |  10460 | 11118 |   9084 |  7801 |    9395 |
|  C/Str |  65271 | 26409 |  25203 | 22577 |   20106 |

Not bad.  In the standard TMI column we no longer have crazy values in the 200 thousands, though the avoidance sets do get up near 100k.  But of the useful gear sets, the numbers are pretty reasonable.  C/Ha comes in a little over 7k, and it’s clear that C/St is a significant improvement at 3k while C/Sg is only a small improvement at a little over 6k.  Just like our qualitative assessments suggested.

Comparing bosses

Before we quit for the day, I want to demonstrate another quirk of TMI.  I mentioned a few paragraphs ago that I’ll be defining a “standard boss” in a future post.  I mentioned the reason why in passing in Parts 1 and 2, but now I want to formally explain why we need to do this.

First, consider what happens if we roughly halved the size of the boss’s melee attacks.  The entire histogram would shift to the left because each spike just became roughly half as large as it was before (if not smaller, thanks to absorb effects). That looks something like this, where we’ve reduced the boss’s melees from 350k (after mitigation) to 200k:



Health-normalized histogram for a boss that swings for 200k after mitigation.

And of course, if we then perform our weighted-average calculation on the histogram we get a much smaller number.  For example, here are the TMI values we get with the above histogram:

pct=100.00, N=200, vary hdf

|    Set |   TMI |
|   C/Ha | 113.2 |
|   C/St |  81.8 |
|   C/Sg | 109.9 |
|  C/Shm | 126.8 |
|   C/Ma | 176.4 |
|   C/Av | 217.0 |
|  C/Bal | 156.8 |
|   C/HM | 118.2 |
|     Ha | 154.8 |
|  Avoid | 301.1 |
| Av/Mas | 286.0 |
| Mas/Av | 280.2 |
|   Ha/h | 132.3 |
|  Ha/he | 129.3 |
|  C/Str | 153.3 |

The relative ordering is the same, of course – that’s courtesy of our smart choice of an exponential weight function.  But the values are much lower than they were in the first table.  Now consider what happens if you compare the value for Av/Mas from this table to C/Ha from the first table.  It looks like Av/Mas wins, doesn’t it? But that’s only because we cheated, and weren’t comparing apples to apples.

In some sense, we were measuring using different scales.  The lower table could be in feet and the upper table in centimeters, for example, and there’s no doubt that 3 cm is less than 1 foot.  But if you leave off the units, it looks like it’s just 3 vs. 1.

We could normalize for this if we had to, just like we normalized out other factors.  But this one is more complicated and less useful.  First of all, what do we normalize by?  Boss melee attack size?  That’s all well and good until we start introducing magic damage into the mix, at which point our normalization doesn’t work correctly anyway.

We could normalize based on raw boss DPS from all sources before mitigation.  That seems like it should work, but introduces another wrinkle.  What if player A calculates their TMI with a boss doing 1 million raw DPS with only melee attacks and player B calculates their TMI with a boss that does 1 million raw DPS with only spells?  Will it be valid to compare the two?  Well, no, not really, because physical and spell damage function differently.  So even this normalization doesn’t make it any easier to compare TMI values across different situations.  We’d still need to specify what the boss details are, so we may as well have not bothered with the normalization in the first place.

There’s another reason I prefer not to normalize by boss DPS.  If we compare the bosses doing 350k and 200k damage per swing, the differences between gear sets are much smaller.  While the relative ordering is the same, it’s clear that the impact of changing from a C/Ha gear set to Av/Mas is not that big.  And that’s actually useful information!  It’s telling you that for this boss, there isn’t a huge advantage to any of the gear sets in terms of survivability.  In other words, it suggests that you significantly overgear the boss, which is a hint that you can start shifting to DPS stats.

So that’s why we have to specify a standard boss.  Rather than doing that now, I’m going to wait until I can collect all of the relevant specifications into a single blog post (along with the SimC implementation, which I haven’t finished yet at the time of this blog post’s writing).  But it will likely be primarily physical damage with a sprinkling of magic damage via a DoT effect.


There aren’t really any “conclusions” to draw from today’s post.  We were mostly fine-tuning the details of the metric we’ve developed in the last two blog posts.  But we can briefly summarize what we’ve done.

First, we discussed and developed normalization conditions for the metric.  This doesn’t actually change the results any, it just scales them to be more convenient.  Rather than comparing numbers like 0.012 to 0.024, we can use normalization to turn those into more easily-interpretable numbers, like 1200 and 2400.  The normalizations we’ve applied attempt to keep the value semi-reasonable under a wide range of simulation variables.

We also made note of the fact that the histograms we’ve been showing were an unnecessary middle-man in the calculation process, and removed them from the process when we provided the formal definition of the TMI metric.

Then, we tested the normalized function with a bunch of gear sets just to make sure the results still made sense and agreed with our previous results.  Of course, since normalization factors don’t change the relative values, there’s no way anything we did could have changed the results unless we screwed something up (spoiler: we didn’t).

Finally, we briefly touched on the reason that we need to define a “standard boss” for use with the metric.  The metric will certainly work with any boss definition you like, but the values you get out will depend heavily on how that boss is configured.  So if you want to be able to make comparisons (like between two different classes, for example), having a standard is really useful.

The next post on this topic probably won’t be until next week.  It will discuss the Simulationcraft implementation of the metric and any modifications we’ve had to make to get it working well.  It will also formally define the metric, including details on the standard boss, and provide brief instructions on how to load your character in Simcraft and calculate your own scale factors.


55 Responses to The Making of a Metric: Part 3

  1. Rohan says:

    Interesting work. Only comment I have is that from an aesthetic point-of-view, it would be nicer if greater values of TMI were better than lower values.

    Larger being better is a lot more intuitive, especially for an advanced stat which is highly technical.

    • Theck says:

      I agree, I like “bigger is better” too. However, the only sure-fire way of doing that would be to invert the result (as in, $x$ becomes $1/x$). And that introduces more volatility. DTPS already uses “golf rules,” so I think tanks will manage to wrap their heads around this one.

      Alternative way to think about it: TMI measures how bad spikes are, thus lower means fewer spikes -> better.

      • Rohan says:

        Only thing I can think of is to take the upper bound of possible TMIs, say the TMI of a non-tank gear set. Then subtract your current TMI from that upper bound.

        Then instead of measuring how bad the spikes are, the metric would measure how much the spikes are reduced by.

        Essentially, instead of measuring “damage taken per second”, measure “damage avoided or mitigated per second”.

        • Theck says:

          But then you’re implicitly requiring two simulations (one in non-tank gear, one in tank gear) to get the same information that we were previously getting in one.

          There’s also the issue that “non-tank” gear isn’t well defined anymore. It would have to be gear that doesn’t have hit, expertise, haste, stamina, mastery, strength, dodge, or parry on it. Even intellect and crit have small effects if you cast WoG at all. So that leaves…. agility and spirit? :P

          • Rohan says:

            No, I’m requiring N+1 simulations, where N is the possible number of tank sets. Once you’ve calculated the “worst-case” scenario, you can just reuse that number.

            It doesn’t have to be strictly worst-case, too. Just needs to be a baseline number. After all, if you came up with an even worse set, you’d just end up with a negative number, showing that the set is really bad.

        • Thels says:

          I don’t think that would make things more intuitive. Say the worst case scenario is 100000 (I’m actually expecting it to be higher, since the avoidance sets are pretty close to that already).

          Right now C/Haste and C/Stamina are at 7000ish and 3000ish. That appears as a pretty big difference. People can tell that 3000 is a lot less than 7000.

          Now if we subtracted them from 100000, we’d get to 93000 and 97000. Suddenly, those differences seem pretty minor.

          If higher numbers should be better, we’d have to use 1/x (before the multiplier, of course), rather than 100000-x. I’m fine with lower numbers being better, though.

  2. Zaephod says:

    There’s a typo on this line, I assume. “Well, no, not really, because magic and spell damage function differently.”

    Thank you for your hard work Sacred Duty Team. :) We all greatly appreciate it.

    • Theck says:

      I could have sworn I replied to this earlier, but I guess I accidentally canceled it.

      Yup. Meant to say “physical and spell damage.” Fixed now, thanks!

  3. Qaajn says:

    I like this better now that I got the last part too. Especially happy about moving away from the histograms. Never liked those. However I’m not yet convinced that this is a good general approach that would produce comparable results at different gear levels with the same damage patterns and across different classes.

    For obvious reasons, you couldn’t expect to get the same index for different tanks fighting the same boss, because we aren’t identical and perfectly balanced. And that’s fine. But if both tanks end up in the same state after a certain number of attacks now, the TMI would place them at the same rating, but is this a good way to measure chance to die? If one tank would take 4 strikes of 25% maximum health in damage, it would be far more likely to survive than a tank avoiding 2 and taking 2 consecutive for 50% of maximum health, even though their 4-point average would be the same. Tanking isn’t this simple though, and by those patterns one could assume that the second one would sometimes get a third attack for 50% damage, which would increase its TMI.

    What I’m looking for is something like “outside attention needed to survive the next attack” – but it’s still just an idea I’m developing. A tank should naturally want to do everything in their power to survive for one more strike against them. There are two ways to approach the goal of surviving the next attack. Either you could try to reduce the damage that previous attacks have made against you to make the next attack unable to kill you, or you can try to make the next attack not able to kill you. The first is more of an approach to reduce total damage taken, while the latter is more for reducing spikes. Depending on the encounter your greatest risk of death may be either healers running out of mana (last strike kill), heavy soaking specials kill (took too much damage prior to top off) or short-sequence trickling death (too much damage from attacks while healers are distracted or unable to help). Ideally, the TMI should be a theory for everything, a tool which you could use to see how well different classes, playstyles or gearing make you perform in each particular area. For example, would it be worth it to survive 10% harder spikes for 30% more damage taken overall? Depends, and it would be a choice you could make… But it is something I definitely think should be included in a unified theory of tanking.

    It is already doable to create a tool that evaluate the total damage you would take in a certain fight, the only problem would come from how to get the tanking mechanics used smartly. Surviving bursts and trickles are similar in their approaches, the only difference is the former have to deal with one (or more) significantly harder hits and often come at least somewhat spread. Ideally, we could simulate healing, but it would be a lot of effort and too much analysis of different fights and class setups to be worthwhile even trying. Since we can’t simulate healing, we should ask ourselves, do we need to? The goal as a tank is not to optimize healing taken, but to survive. By taking this standpoint, we can assume that we at any given point in the simulation should be alive; if we are not, what’s the use? But how would we continue from this?

    We have to assume that what kills us is not getting healed, because we do need healers. There is no way we can affect when we will get healed, and there is usually a good reason when we aren’t. This give us the goal to try to survive as long as possible with no outside help. From the simulation we should be able to get a distribution of how frequent and short the time it would take the boss to kill us would be. It would probably to simply include all events which could kill us, even if we get overlap.

    To make a quick example, if we have a string of attacks dealing 20-30-40-40-40% damage, the events 2-4 would be one danger, 3-5 another.

    Tossing this out here now so that more can see it. Need to think further on this.

    • Theck says:

      “But it is something I definitely think should be included in a unified theory of tanking.”

      Hold on just a second there. Let’s be clear that this is NOT a “unified theory of tanking.” Nor was it ever meant to be.

      It is simply an attempt to quantify damage smoothing effects, much the same way that we quantify damage taken with DTPS or TDR. It is but one piece of the jigsaw puzzle that is tanking.

      A unified theory of tanking would require a variety of inputs: DTPS, TMI, DPS, and possibly even HPS. And as you suggest, you could then apply weight factors to all four of those, which would answer questions like the ones you pose: “Is it worth taking 30% more damage to survive 10% larger spikes?”

      We’re a long way from that yet, though I can see something like that happening eventually. Once you can generate all four of those metrics (and scale factors for each of them) in a single run, it would be possible to provide user-definable “metric weights” to achieve exactly what you’re suggesting.

      I don’t think we’ll ever see a true “universal theory of tanking” simply because I don’t think those weights are constant. A 10-man tank will have a very different relative value for DPS and TMI than a 25-man heroic tank will.

      Regarding healers: that’s one of the advantages of moving to SimC. In theory, we could run a simulation with each different type of healer to see how the damage patterns differ and which stats are strong in each situation. Right now, I think only the priest module is supported, but most of the nuts and bolts are there for the other classes already.

      • Qaajn says:

        I know it isn’t close to there yet. But it would be nice if we had, and I hope that you will work towards it!

        The problem I see with healers is that it’s never one healer healing only one tank. While it shouldn’t be extremely hard to model a healer healing a tank, it is everything around it that makes it a complete mess. Just healing a tank doesn’t usually drain your mana too badly, nor does your focus drop much. Both of those come from mechanics in fights that require you to heal the raid or avoid things. While all this -could- be modeled, you would essentially need to make simulations for each boss with each combination of healers for each evaluation you try to do. And this is just to see how tanks should best survive… You would also need the healers and the raid playing properly to get accurate estimates of the mana and time usage needed to keep them alive. Essentially, you would need to fully simulate all boss fights.

        But you don’t need to know when a heal will be incoming to evaluate your survival. What you need to know is which playstyle/gear allows you to take as little, and as smooth, damage as possible, preferably while maximising DPS. If you choose the best option and still die, it is either a gear, skill or encounter handling problem. All of those are player issues and not a fault with the model itself.

        This is why I think that you should evaluate (or just plot) death times against frequency instead of damage for set sequence lengths. You would still use the original model to arrive at what series of events would kill us from full health, but instead of seeing how much and often they would hurt us, you would see how quick and often they could kill us. If for example it turns out that when the RNG Goddess makes us unlucky, we could on a certain boss die within the timeframe of 2 attacks, then that would certainly be something to consider gearing against, even if it’s rare. On the other hand, if there are a few times where we could be unlucky and die in 10 seconds if left alone, but the majority is even higher, then you could probably look to switch survivability to damage for example. I believe the plots would appear very similar to your histograms, only instead of % of maximum health, you would have time on the axis.

      • Bram says:

        It is not a “unified theory of tanking” yet, but you know that it is the holy grail of every science to put together a “unified theory of life” that would provide an answer to every question. Just admit it: we might land there in the very distant future, if you continue at this speed. :)

        • Kal says:

          I think a unified theory of tanking would be the metaphorical forest, while these metrics are not even the trees. They’re the leaves.

    • Thels says:

      If one tank consistently gets 25%-25%-25%-25% and another tank consistently gets 50%-0%-50%-0%, then they are pretty much even for survivability. However, it is highly unlikely that the 2nd tank will always get them aligned nicely like that. There will be strings of 0%-0%-0%-0% and strings of 50%-50%-50%-50%, and with the added weights, that will give the second tank a much higher TMI than the first tank, so this is already accounted for.
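The point about misaligned strings can be made concrete with a small enumeration. This is only a sketch under an assumption that is not in the thread: every swing against the spiky tank is independently either avoided (0%) or lands for 50%, with equal probability, so all 16 four-swing patterns are equally likely. The function name is made up for illustration.

```python
from collections import Counter
from itertools import product

def window_sum_distribution(hit_size, swings=4):
    """Count how many hit/miss patterns of one window produce each total."""
    totals = Counter()
    for pattern in product((0, hit_size), repeat=swings):
        totals[sum(pattern)] += 1
    return dict(sorted(totals.items()))

# Spiky tank: each swing is either avoided (0) or lands for 50% of health.
# Assumption: hit and miss are equally likely and independent.
spiky = window_sum_distribution(50)

# Steady tank: every swing lands for 25%, so every 4-swing window totals 100%.
steady = {4 * 25: 1}

print("spiky :", spiky)
print("steady:", steady)
```

Under that assumption, 5 of the 16 patterns for the spiky tank total more than 100% of health, while the steady tank never exceeds 100%. Any weighting that favors large spikes will therefore score the spiky tank worse.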

      As for which strings of attacks exceed 100% of your health and which strings of attacks don’t, that really doesn’t hold that much value. First off, there’s all kinds of random raid healing going on all over the place. But more importantly, the boss damage is chosen arbitrarily! If we really look at which strings of attacks kill us, then slightly upping or lowering the boss damage would have a big effect on the weight of our stats.

      • Qaajn says:

        Yes, TMI does a good job of accounting for damage reduction and bad luck, so that’s a good thing. My main concern is that the index, while ordered correctly, is arbitrary. Avoidance is (probably) not 10 times more likely to kill you than mastery. My wish is that a model could be developed to analyze the fight data to see when you risk dying the most, and how to minimize it.

        Of course, but sometimes there isn’t much raid healing coming your way, and even if there is, you still want to minimize the impact of deadly events so that less healing is required on you to prevent death. Because of this, it doesn’t really matter so much how much healing you would get in a certain situation; we aren’t trying to figure out if you actually survive or not. All we try to do is give ourselves the best chances to survive as long as possible, to give the healers time to get us back up. For this reason I think a TTD (Time To Die) distribution would be a good tool for gear choices. This would, as you point out, vary for each boss. While we can make general assumptions for “this is better than that” we can’t say how much better it is… Unless we simulate the damage for each boss. Which I think we should, at least for those where tank death is the greatest risk, and see how we can affect those.

        • Qaajn says:

          Having an index to measure your survivability is good, but it would be even better if you could understand just what it meant in reality. It would be very clear to everyone if one tank could die in 3 seconds on a certain boss, while another can’t die quicker than 5 seconds. It is easy to relate to.

          • Jackinthegreen says:

            Your numbers are a bit off since Blizz specifically designs bosses such that they don’t 2-shot decently-geared tanks. 3-shot, maybe, but they have to be relatively high powered hits or the tank doesn’t have the gear assumed for a base level to run the boss. I remember the Wrath days when they really could 2-shot a good tank (and in fact were tuned specifically that way) because they had to do it that way to keep healers engaged.

            As far as unifying things go, perhaps you’re underestimating how monumental a task it is to get that going on one tanking spec, let alone each of the five in the game, and possibly a sixth if or when they add another class. Trying to push that kind of thing right now is like pushing the construction of Mount Rushmore to get done quicker. There are simply certain parts of the construction that can’t be done quicker, and it can’t be done all at once either.

            Yes, we’ll eventually get to that point because it’s an obvious final goal, but at least let the man get some rudimentary stuff together first!

          • Qaajn says:

            @Jackinthegreen – Of course they were probably exaggerated for most bosses and situations. But it did make the point very clear though, didn’t it?

            Especially when faced with monumental tasks, it’s important to make sure you approach them in the right way, or you may end up realizing much later that you need to redo most of your work. I’m not saying that the way I propose to measure tanking survivability would be better, but it is a second way to view it, and to me it would also make it easier to understand how much better one tank is than another.

          • Thels says:

            TMI in its current form would be a uniform result. Value X will be lower than value Y, so you’re less likely to die when wearing the set that gives you value X and handling your rotation properly.

            Adding TTD would make the results very situational, and people might plug in some details, discover that option A is the best, while actually, thanks to healer mechanics and what not, option B would be a better option.

            Note that since smoothing is based around our health, this already includes TTD to a limited degree. Strings of attacks that exceed 100% of our health count stronger than strings of attacks that do not. Strings of attacks that far exceed 100% of our health count stronger than strings of attacks that barely exceed 100% of our health, and are probably survivable on mere AoE healing.

        • Theck says:

          I don’t think you’d see a particularly large difference on a TTD distribution. One of the examples you keep emphasizing is a comparison between 25-25-25-25 and 50-0-50-0. While it’s a fair point that the first is smoother than the latter, and thus probably safer, you’re also not comparing apples to apples. The boss that can do 50-0-50-0 could also do 50-50-50-50, which would show up much higher in the distribution and be weighted more heavily.

          Again, once specials with cooldowns are involved, you may be motivated to shorten the relevant time window for a TMI calculation, partly based on a “Time to Die” type estimate.

          But TTD is not a particularly new metric. In fact, it’s just a simple EH metric. The reason it has seen very little use since Wrath is that it’s become less relevant over the years.

          You rarely die because you went from 100% to 0% while receiving zero healing anymore. In almost every situation you’ll have HoTs, incidental smart healing (Atonement), and low-throughput heals (Holy Light) landing on you during that descent from 100% to 0%. What kills you is a sudden increase in instantaneous DTPS (i.e. a spike) without an appropriate elevation of incoming HPS from the healers.

          At that point, TTD isn’t very useful. It’s a rough estimate of spike survivability, of course, but knowing that you can live 3 seconds without heals doesn’t mean much if incidental healing extends that to 6 or 8 seconds. Especially since a different situation (player, class, damage profile, etc.) might have a TTD of 4 seconds that only gets increased to 5 seconds by the same sort of incidental healing.

          In short, I just don’t have the same faith that you do in TTD as a “better” smoothness metric. EH and TTD have their relevance, but they are not measures of damage intake patterns or smoothness. The point of these articles wasn’t to develop a grand unified theory of tanking; it was to fill a void in theorycrafting, namely the lack of a good metric that *does* measure smoothness.

          I could imagine a different normalization scheme for TMI that frames the result in terms of seconds (or more likely milliseconds). But that would be just as arbitrary as what we’re doing, and I’m not sure it actually conveys any extra information. In fact, I’d be worried more of the opposite – that it would falsely suggest information that the metric doesn’t contain.

          • Qaajn says:

            I agree very much that -just- TTD is not at all useful. But that’s not what I’ve been suggesting either. Perhaps the idea behind the different approach was lost somewhere, so I’ll quote two sections from my first two replies in part 3:

            “…get a distribution of how frequent and short the time it would take the boss to kill us would be.”

            “…I think that you should evaluate (or just plot) death times against frequency …”

            What I tried to suggest was in essence a TTD distribution. It would likely look similar to your original histograms, though it probably wouldn’t be a normal distribution. I tried to find the name of the shape I have in mind and found one that looks like what I would guess, but the source is not in English. It would likely have an (almost) non-existent “front” tail, it should start from non-zero, and it would have a longer back tail… I found a decent picture of it at least, look at the solid fat black line:

            You could naturally do some sort of manipulation from there to get a single number to determine smoothness, just like you did to get TMI. Personally I’m a graphical person though, and would get much out of seeing the shape of that function and trying to tweak it to start at higher times and have a broader peak.

          • Thels says:

            But that would have no meaning whatsoever…

            You seem to imply that a series of attacks that hits you for 102% of your health is a LOT more dangerous than a series of attacks that hits you for 98% of your health. String A of attacks “kills me”, so it should be flagged, while string B of attacks doesn’t “kill me”, so it doesn’t have to be flagged.

            But it’s not. Sure, it’s slightly more dangerous, but only slightly. For one, there’s a constant amount of raid healing going on, so that string of attacks that does 102% of your health won’t kill you even if your healers aren’t paying attention.

            But more importantly, the boss damage output is chosen arbitrarily. In reality, not all bosses hit for the same amount every single swing. Therefore, the 100% breakpoint that you seem to insist on emphasizing is really no more important than the 90% breakpoint, or the 110% breakpoint…

            TMI does give strings of heavy damage more weight than strings of low damage. It however does it gradually, and doesn’t make a sudden huge leap at 100%, since there is no reason to make that sudden huge leap at 100%.

          • Qaajn says:

            It’s a good thing that you are critical, and constructive feedback is always welcome. I’ve actually pondered how that could be included, but haven’t come up with a good way (yet)…

            But your post gave me an idea how to do it, though I need to bring out my calculator and think on it, to see if it works.

            Granted, this makes the TTD arbitrary; you won’t get any “I could live -this- time” – though you could naturally remove the weighting if you want those numbers.
            Time and damage+“overkill” weighting against frequency:

            Basic framework
            1) Assume no outside help
            2) Make a string of events from full health that continue until the event that would kill you
            2.1) Calculate time normalizer from the string
            2.2) Combine time with normalizer for a “time value” on your survivability
            3) Repeat 2) until you have started with all events
            4) Plot all time values and evaluate from shape or further analysis

            Calculating time normalizer
            1) Normalize each event as % of maximum health
            2) Assume that the first event of the string does damage from full health
            3) Normalize each event as (% damage) x (% remaining health after event)
            4) Add all events in the string together

            Calculating “time value”
            1) Multiply time normalizer with the duration of the string

            25-25-26-26 (dead) – 3 time units have passed
            .25(1-.25) + .25(1-.5) + .26(1-.76) + .26(1-1.02) = .3697
            .3697 x 3 = 1.1091
            Compare to your non-death example:
            24-24-25-25 (not dead) – 3 time units
            .24(1-.24) + .24(1-.48) + .25(1-.73) + .25(1-.98) = .3797
            .3797 x 3 = 1.1391

            One is better than the other, but the difference isn’t huge. I did find a flaw though, namely damage dealt that doesn’t affect the total time of the string (like weak dots ticking – checked with one 4% in the middle of the string). It does weaken the time value, but the effect is minor (they jump to 1.1019 and 1.1355 respectively). Also note that it doesn’t matter in what order you place the values. I’m not sure if that’s a good thing or bad…
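The order-independence noted above can be checked algebraically. The symbols here are introduced only for illustration: a_i is each event as a fraction of maximum health, S_i is the running total after event i, T is the total over the whole string, and N is the time normalizer from the steps above.

```latex
\begin{aligned}
N &= \sum_{i=1}^{n} a_i \,(1 - S_i), \qquad S_i = \sum_{j \le i} a_j, \qquad T = \sum_{i=1}^{n} a_i \\
\sum_{i=1}^{n} a_i S_i &= \sum_{i} a_i^2 + \sum_{j < i} a_i a_j = \frac{T^2 + \sum_{i} a_i^2}{2} \\
N &= T - \frac{T^2 + \sum_{i} a_i^2}{2}
\end{aligned}
```

Only T and the sum of squares appear in the last line, so any reordering of the same events gives the same normalizer. For 25-25-26-26, T = 1.02 and the sum of squares is 0.2602, so N = 1.02 - (1.0404 + 0.2602)/2 = 0.3697, matching the worked example.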

            There you have something… Need to think more though. I hope my ramblings give you some kind of useful input regardless ^^

          • Theck says:

            Interesting thought, but it’s not very consistent. In particular, it plays poorly with self-healing. Your algorithm in MATLAB is:
            a=[24 24 25 25];

            Here’s the results with a number of different strings:
            a=[24 24 25 25]; result=1.1391
            a=[25 25 26 26]; result=1.1091
            a=[10 20 30 38]; result=1.0728
            a=[10 20 30 42]; result=1.0248
            a=[0 0 0 98]; result=0.0588
            a=[0 0 0 102]; result=-0.0612
            a=[0 0 -50 148]; result=-2.1612
            a=[0 0 -50 152]; result=-2.3412
            a=[50 48 0 4]; result=0.7764
            a=[50 48 0 0]; result=0.7788
            a=[50 48 -50 50]; result=0.0288
            a=[50 48 -50 54]; result=-0.0336
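For readers without MATLAB, the results above can be reproduced with a short Python sketch of the steps Qaajn listed. This is a reconstruction from those steps and the worked examples, not the MATLAB code itself. The function name is invented here, and the duration is assumed to be one time unit per gap between events (3 units for a 4-event string), which is what the worked examples use.

```python
def time_value(events_pct):
    """'Time value' of one string of events, each given in % of max health.

    Negative entries are heals. Assumption: the duration is one time unit
    per gap between events, as in the worked examples.
    """
    damage_so_far = 0.0
    normalizer = 0.0
    for event in events_pct:
        fraction = event / 100.0
        damage_so_far += fraction
        # (% damage) x (% health remaining after the event)
        normalizer += fraction * (1.0 - damage_so_far)
    duration = len(events_pct) - 1
    return normalizer * duration

for string in ([24, 24, 25, 25], [25, 25, 26, 26], [0, 0, 0, 98],
               [50, 48, 0, 0], [50, 48, -50, 50]):
    print(string, round(time_value(string), 4))
```

The heal in [50 48 -50 50] contributes a large negative term because the negative "damage" is multiplied by the health remaining, which is the source of the instability discussed next.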

            The metric doesn’t depend on the order, but it is very sensitive to the overkill value. I might agree that the [0 0 -50 150] sequences are the worst here, but the last four highlight a fairly significant instability in your metric. Taking 50 damage and healing it back up shouldn’t be all that different from taking zero. Slight changes wouldn’t be a problem, but this suggests that [50 48 -50 50] is worse than [0 0 0 98] and much worse than [50 48 0 0]. And I’d disagree with that assertion.

            I also don’t necessarily agree that [25 25 25 25] is that much less dangerous than [50 0 50 0]. Certainly the latter is spikier, but the danger to a tank is about the same as long as those zeros are fixed (i.e., you cannot have [0 50 50 0] or [50 50 50 0]).

            TMI handles this gracefully with the moving average. It doesn’t distinguish between [25 25 25 25] and [50 0 50 0], which seems to be your main complaint. My stance is that this is by design, because they aren’t that different.

            Your earlier argument was that TMI doesn’t take into account damage patterns within the window, and that we should distinguish between e.g. [0 50 50 0] and [50 0 50 0]. Again, I would disagree, because this is already taken into account in TMI.

            If the boss is truly limited to [50 0 50 0], such that successive boss attacks cannot happen on adjacent swings, then it isn’t that much more dangerous than [25 25 25 25], and the metric reflects that.

            If the boss *can* perform [0 50 50 0], then it can ALSO perform [50 50 50 0] or [50 50 50 50], both of which greatly inflate the TMI value to reflect the increased danger. You can’t simply cherry-pick the [0 50 50 0] scenario and use that to determine the metric is unhelpful, because that’s not considering all of the data.

          • Qaajn says:

            I agree that it’s not very consistent, sadly. Found out when I started playing around with the numbers, but I was halfway through the post so I figured it might still give some food for thought. TMI is growing on me, but I would somehow want to get time in there some way…

          • Thels says:

            If you think the 6 second window is too long, you could always try measuring TMI-4 for yourself, in various gearsets, and play with that.

          • Qaajn says:

            I do have MATLAB, so I could give it a try, though I play a DK and if I understood correctly they are not (yet?) supported. While I could make up random strings of numbers myself, that would hardly be useful.

            I’ll stick to logical thinking for now. When I thought more on TMI during my posts I must admit that it works better than I first thought – however, as there is not any significance in the order of attacks, it works well primarily for fairly similar spacing and base damage of attacks. That is what I have so far been (unsuccessfully) trying to figure out: a way in which the order and “density” of attacks make a difference. As you have pointed out, a string of [0 50 50 0] is more dangerous than [50 0 0 50], which I think TMI currently makes no distinction between (ignoring that potential strikes outside of the current average may be affected). Similarly, it doesn’t matter whether all the strikes are focused in the last second of the average or evenly spaced out.

            That said, TMI works well for ordinary melee attacks, although I wish the index itself were easy to translate into actual ability to survive a given situation. I would love it if we could figure out a metric that is general, works well for any given situation, and makes it easy to understand just what the outcome signifies. But I realise that may only be a dream. Will keep thinking about it though.

          • Thels says:

            A string of [0 50 50 0] and a string of [50 0 0 50] are probably just as dangerous, because they don’t happen in a void. After that 50 0 0 50, you’re likely to get another 50.

            Of course, if you constantly get 50 0 0 50 0 0 50 0 0, then the latter is less dangerous, but that will become apparent, because for every [50 0 0 50] you’ll have [0 50 0 0] and [0 0 50 0]. It would also be odd for attacks to be spaced out like that.

            I’d be pretty interested in seeing the amazing stuff that Theck does for paladins done by someone for DKs too, especially to see the effect that Haste has on Blood.

          • Qaajn says:

            I should probably have compared [0 50 50 0] with [0 50 0 50] instead, but as pointed out, for this particular sequence, the former is likely more dangerous.
            But if we compare those strings:
            [0 50 50 0 0 50 50 0 0 50 50 0]
            [0 50 0 50 0 50 0 50 0 50 0 50]
            By my understanding, TMI would give them the same score. Of course strings consistently looking like this would be odd, but if they did, I think that the first should get a worse score, as the damage is more condensed in smaller time frames. How to implement that is another question…
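A quick way to check this claim is to compute the raw moving sums for both strings. The sketch below only covers the moving-sum step, not the weighting or normalization that turns it into TMI, and it counts the window in attacks rather than seconds; the helper name is made up for illustration.

```python
def moving_sums(events, window):
    """Sum of every run of `window` consecutive events."""
    return [sum(events[i:i + window]) for i in range(len(events) - window + 1)]

clustered = [0, 50, 50, 0, 0, 50, 50, 0, 0, 50, 50, 0]
alternating = [0, 50, 0, 50, 0, 50, 0, 50, 0, 50, 0, 50]

for window in (4, 2):
    print("window", window,
          "clustered max:", max(moving_sums(clustered, window)),
          "alternating max:", max(moving_sums(alternating, window)))
```

With a 4-attack window every sum is 100 for both strings, so they are indistinguishable. With a 2-attack window the clustered string reaches 100 while the alternating one never exceeds 50, so a shorter window does separate them.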

          • Qaajn says:

            As a note on what effect haste has on DKs: by some rough estimations done on EJ, it seems to place itself either slightly above or slightly below dodge/parry. That would mean it may be anywhere from the last stat (if it is worse and you prefer the Hit/expertise softcap) to the third stat (after stamina/mastery, if you don’t prefer the Hit/expertise softcap).

          • Thels says:

            The [50 50 0 0 50 50 0 0] would be the odd ball out, and indeed show up just as dangerous as the [50 0 50 0 50 0 50 0], though that would be very niche. Perhaps a little less so with Blood DKs, if they can use their Blood Shield to cover every 2 out of 4 attacks, but it still feels very arbitrary.

            Note that changing to TMI-4 or some other option would certainly be to the disadvantage to [50 50 0 0 50 50 0 0].

          • Qaajn says:

            Indeed, that example is pretty arbitrary, but it does show the weakness in TMI, namely that the order of attacks in the string doesn’t matter. You can choose another TMI to adjust for it, but how do you know that it matters? As a DK on dangerous content I typically wait for 2-3 attacks to connect in a row before I use Death Strike. This would give a shield for roughly 60-90% of a strike (with only 150% mastery) even with only a single Death Strike, so our damage patterns would naturally look more arbitrary. Assuming for simplicity a 100% shield instead, it’s easy to get patterns like [X X X 0 0 0] where one of the zeros would be an absorbed attack instead, effectively “smoothing out” spikes in avoidance patterns… That particular string would be 4/6 strikes connecting, or 33% avoidance. Not very far off our actual numbers. I have 33% dodge+parry before diminishing returns with only 513 item level for example…

          • Theck says:

            I’m not sure why you keep painting that as a weakness of TMI. There’s a very easy answer to “but how do you know that matters?” If you’re simming your character, presumably you have some idea how hard the boss is hitting you per swing. If every swing is over 50% of your health, then of course you would care about a shorter time window. If every swing is only connecting for 30% (which is more reasonable, even for heroic bosses), then the default window is about right.

            I mean, you keep taking this niche case of a weirdly-inconsistent boss special and trying to generalize it to “TMI is a weak metric.” It’s a bit disingenuous to specifically craft edge cases while simultaneously ignoring the versatility of the metric to handle those cases.

            It’s like saying “DPS is a weak metric” because you create a situation where you only care about damage done in a 25-second window (say, Heroic Spine of Deathwing). Sure, it’s less relevant there, so you have to adjust your metric slightly to use it properly in that situation. But nobody abandoned DPS as a metric because of that one encounter, because it works very well in the vast majority of fights.

          • Qaajn says:

            “I’m not sure why you keep painting that as a weakness of TMI.”

            Don’t you think it’s a weakness that it doesn’t matter in what order the strikes hit and miss? It seems pretty clear to me that it matters much while tanking. Perhaps it’s too hard to get a good metric for it, but you should always be aware of the limitations of your estimations.

            “…trying to generalize it to “TMI is a weak metric.” It’s a bit disingenuous to specifically craft edge cases while simultaneously ignoring the versatility of the metric to handle those cases.”

            I’m not saying that it is a weak metric. In fact in my recent posts I have said that it’s growing on me. It does, however, have a weakness, one I would hope could be addressed to make it -more- versatile. That is not to say that the metric is weak for what it does, which it is not. But it could be much more.

            “It’s like saying “DPS is a weak metric””

            DPS as a metric has a weakness in that it doesn’t account for whether you do damage in a useful way at the proper time. Examples could be your Spine case for the value of burst, or multidotting on Megaera, which pads the meters but where the resulting procs make it a single-target gain. There is a reason that tanks don’t measure ability in damage taken or healers in HPS, because we do not work the same way as DPS does. Less damage taken/more healing done doesn’t always make us beat an encounter more easily.

            “If every swing is over 50% of your health, then of course you would care about a shorter time window. If every swing is only connecting for 30% (which is more reasonable, even for heroic bosses), then the default window is about right.”
            “…this niche case of a weirdly-inconsistent boss special…”

            If you want TMI to become a metric used by tanks other than paladins, then it needs to be able to handle so-called “niche” cases. As I tried to explain, patterns such as [30 30 30 0] are very common for Death Knights, because you often wait for 2 or 3 attacks to connect in a row -before- you even Death Strike once. Even if you didn’t get any outside heals, after the first 3 strikes and a single Death Strike you would be at 28% health with a Blood Shield that would very unlikely be less than 27%, often a decent chunk more (depending on your mastery). If you Death Strike twice, then you would land at 46% health and more than 54% shield. If strikes then continued to hit you, you would need a minimum of 7 connecting strikes in total before you could die. Should we then always use TMI-10.5 for 30% strikes, to see how many of the bad streaks we can’t deal with properly?
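The arithmetic behind those numbers can be sketched as follows. The constants are assumptions inferred from this thread rather than authoritative game mechanics: Death Strike is taken to heal 20% of recent damage (a 30% heal per 150% damage taken), Blood Shield to absorb 150% of that heal, and both Death Strikes to see the same 90% of recent damage. All names are hypothetical.

```python
HIT = 30           # each connecting strike, in % of max health
DS_HEAL_PCT = 20   # assumed: Death Strike heals 20% of recent damage
MASTERY_PCT = 150  # assumed: Blood Shield absorbs 150% of the heal

def after_death_strikes(strikes_taken, death_strikes):
    """Return (health, shield) in % after some strikes, then some Death Strikes."""
    recent_damage = strikes_taken * HIT
    health = 100 - recent_damage
    shield = 0
    for _ in range(death_strikes):
        heal = recent_damage * DS_HEAL_PCT // 100
        health += heal
        shield += heal * MASTERY_PCT // 100
    return health, shield

def extra_strikes_to_die(health, shield):
    """Count further 30% strikes until health plus shield is used up."""
    pool, strikes = health + shield, 0
    while pool > 0:
        pool -= HIT
        strikes += 1
    return strikes

print(after_death_strikes(3, 1))            # one Death Strike after three hits
health, shield = after_death_strikes(3, 2)  # two Death Strikes after three hits
print(health, shield, 3 + extra_strikes_to_die(health, shield))
```

Under these assumptions, three hits and one Death Strike leave 28% health behind a 27% shield, two Death Strikes leave 46% health behind a 54% shield, and it takes four more connecting strikes (seven in total) to die.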

            This example, with 2 death strikes ready, would have looked something like this:
            [30 30 30 0 0 30 30]
            If we instead, through bad play or need, had just put DS on cooldown at the start, then it would become:
            [30 30 30 30 30 0 30]
            This is naturally worse, but pretend we get lucky and parry the last strike, as we didn’t have our double DS here. In TMI-10.5 we then get 5×30% strikes again for the same index, but the situation is likely much more deadly than the first.

            This is the weakness I am trying to point out: TMI would give the same index to different situations that arguably are not equally lethal. Perhaps this is not something that can be dealt with in a good manner, but the concern should not be dismissed before it has been evaluated.

          • Theck says:

            No, as I’ve said before, it isn’t a weakness because for most cases the order of the strikes *within the appropriate window* does not, in fact, matter. If you are concerned with [50 50 0 0] vs. [50 0 50 0] then a 4-attack window is not the appropriate size in the first place, because you’re already in serious danger of dying in 2 or 3 attacks. So a TMI-3 or TMI-5 would be a more appropriate use of the metric, and in both of those the [50 50 0 0] will show up as much more dangerous than the [50 0 50 0].

            The metric is already very versatile as long as you’re not constraining your view to the default TMI-6. Just as DPS is a very versatile metric as long as you don’t limit yourself to only ever considering time-averaged damage over an entire boss encounter. DPS in a 30-second burst window is a perfectly appropriate use of the DPS metric for cases where you care about damage done in a 30-second window, just as TMI-4 or TMI-5 would be appropriate for cases where you’re *actually in danger of dying in 4 or 5 seconds instead of 6*.

            I’m giving you a ruler marked with millimeters, centimeters, and meters. You’re insisting that the metric has a weakness because it gives you funny results when you try to measure the size of a penny with the meter graduations. I would humbly submit that the tool I’ve handed you works just fine; you’re just using that tool incorrectly.

            Also note that the metric does in fact include healing. So a death strike that heals you for 30% of your health would show up as a -30 somewhere, i.e. [50 50 -30 0] or [50 20 0 0] depending on which time bin it lands in. So with proper reactive healing, that [0 50 50 0] might become a [0 50 20 0] while the [50 0 50 0] becomes a [50 0 20 0]. At that point, both look fairly similar in terms of danger, and the metric’s window-order-invariance seems very appropriate.

          • Qaajn says:

            Unless the metric gives consistent indices across different time windows for the same damage patterns, you can’t compare them. The same simulation isn’t any more dangerous whether you use TMI-3 or TMI-9 to evaluate it. Because of this, you can’t know what time window is appropriate unless you look at the raw data, and so can’t decide which window you should use.

            A ruler can measure size, but it’s not even a good way to measure small objects. You’ve given me a ruler, while what I need is a Vernier caliper.

            I think it would be inappropriate to have healing show up as a reduction in a previous hit. Healing is by its nature reactive, and should not be applied before it happens. If we include all healing, then [50 50 -30] would have killed you, because the healing would happen after death, while modeling it as [50 20 0] keeps you alive. Also note that a 30% DS heal requires 150% damage to be taken in the last 5 seconds. I recall you having modeled absorbs as healing for now. That may be valid for small absorption effects, which effectively just work as damage reduction, but big absorbs would be better modeled as avoidance, as they result in complete strike removals. In a sense, they make you more “spiky” as they lead to longer periods of 0 damage taken.

            If you find my concerns irrelevant I’ll stop posting though.

          • Theck says:

            Certainly you can’t compare TMI-3 to TMI-6, but that’s not the point. I don’t see that as a reasonable objection. Nor is the fact that you need to know the raw data a valid objection, in my mind, because we generally have the raw data. We already know from in-game data that a boss hits us for X% of our life in a melee swing and how frequently that happens. That lets us make an informed decision about what time window is appropriate. This is no different than any other modeling of survivability.

            To use your own analogy: how do you know whether you need a ruler or a caliper until you’re given some information about what you’re measuring? If you don’t know whether you’re measuring an elephant or an ant, you can’t make an informed decision about what tool to use.

            You have it backwards, by the way – at first I was modeling healing as absorb bubbles, and later with a more sophisticated method that removes recent damage. This is more or less what we do in SimC, as it correctly models absorption and healing.

            And yes, while [50 50 -30] would have killed you and [50 20 0] would not, remember that we’re calculating this in the absence of other healers. The presumption is that there’s passive healing and absorption going on, which is why we don’t bother with tank health.

            That said, you’re again looking at specific situations. If the strings are [0 50 50 -30] and [0 50 20 0], then a three-attack average will clearly give a worse TMI result for the first than the second. And it’s just as likely that we have a [0 50 50 0] sequence, which will score badly as well. Once you simulate a long enough period of combat (or enough iterations in SimC), you will cover all of those possible situations and get a more accurate, aggregate score.
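Both sides of this exchange can be checked on the two strings just mentioned. The sketch below is illustrative only: it compares the worst three-attack window (the quantity the moving sum sees) with the lowest health reached if the events are applied in order with no other healing (the quantity Qaajn is concerned about). The helper names are invented, and health is not capped at 100%.

```python
def worst_window(events, window):
    """Largest sum over any run of `window` consecutive events."""
    return max(sum(events[i:i + window])
               for i in range(len(events) - window + 1))

def lowest_health(events, start=100):
    """Lowest health reached when events are applied in order, no other healing."""
    health, lowest = start, start
    for event in events:
        health -= event  # negative events are heals
        lowest = min(lowest, health)
    return lowest

late_heal = [0, 50, 50, -30]  # heal recorded as its own event
netted = [0, 50, 20, 0]       # same heal removed from the preceding hit

print("worst 3-attack window:", worst_window(late_heal, 3), worst_window(netted, 3))
print("lowest health reached:", lowest_health(late_heal), lowest_health(netted))
```

The worst three-attack window is 100 for the late heal and 70 for the netted version, so the windowed sum does score the first as worse. The health trace shows the same thing from the other side: the late-heal string touches 0% before the heal lands, while the netted string bottoms out at 30%.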

            And again, you’re still cherry-picking unrealistic situations. If your string is [50 50 -30], TMI-5 is the wrong measurement tool to be using in the first place, because TMI-3 is more appropriate. In other words, you’re still trying to measure the elephant with a caliper and complaining that the tool is broken.

          • Qaajn says:

            When attack size and frequency aren’t consistent through an encounter, then you can’t just look at two attacks and decide what window to use. It may also be largely irrelevant to use the same one for the entire fight.

            “To use your own analogy: how do you know whether you need a ruler or a caliper until you’re given some information about what you’re measuring? If you don’t know whether you’re measuring an elephant or an ant, you can’t make an informed decision about what tool to use.”

            I wouldn’t, that’s the point. Your tool needs to be able to handle all situations it could encounter as well, or it will be limited to a certain set of situations.

            “…more sophisticated method that removes recent damage. This is more or less what we do in SimC, as it correctly models absorption and healing.”

            If you could explain how it is more sophisticated and correct I would appreciate that. I think some implementation of a health pool would be beneficial, and I have mentioned some ways it could be done in previous posts (have it reset after a certain number of attacks, for example).

            “And again, you’re still cherry-picking unrealistic situations. If your string is [50 50 -30], TMI-5 is the wrong measurement tool to be using in the first place, because TMI-3 is more appropriate. In other words, you’re still trying to measure the elephant with a caliper and complaining that the tool is broken.”

            I’m not sure where you got TMI-5 from. The only time I even used the number “5” alone was as a reference to how Death Strike needs 150% damage taken in the last 5 seconds for 30% healing. Never did I mention that TMI-5 should be used because of this or some other situation. I only used the 50% strikes in the first place as that was your example recently.

          • Theck says:

            Generally, when attack size and frequency aren’t consistent you only worry about modeling the most dangerous period. Again, it comes down to making informed decisions about the correct tool to use. Just like one wouldn’t choose class composition based on 10-minute time-averaged DPS on Spine of Deathwing, one wouldn’t use TMI-6 to evaluate smoothness for a boss that hits you for >50% of your health each swing.

            The tool *can* handle all situations when applied properly, just as DPS can handle pretty much any situation when applied properly. What you seem to want is a tool that takes all thought out of the process on the part of the user. And that simply isn’t going to happen, because we’re talking about a system far too complex to pull that off, even for a DPS metric.

            I have already explained why I don’t include player health in several blog posts. I don’t plan on reiterating all of those points in a comment, but in short: as soon as you assume health, your results are narrowed in scope by your choice of healer modeling. For example, the metric you proposed earlier assumes no healing at all and measures time until death based on health. But as a result, it ends up losing much of its validity for situations where you do receive healing in some form.

            Hence, rather than trying to model healers, since we can’t trust that our results using e.g. a Disc Priest healer are valid for a Holy Paladin healer, we abstract that. We simply try and look at how often our damage intake spikes up to a dangerous level, because those are the times when a healer needs to react. Our metric is focused around trying to quantify the frequency and magnitude of those events so that we can minimize them, and TMI does that very, very well.

            That’s also why we model healing as offsetting damage within a small time window (say 1s). Certainly in the case you’re cherry-picking ( [50 50 -30] ), you would have died before the heal went off without external intervention. But the point is that *we don’t care* that you would have died, because in a real scenario you probably would have received a HoT tick or some other incidental healing between those attacks. We are not trying to model your death, we are trying to model how many times you take a big damage spike so we can see how effectively different stats and effects reduce those events.

            I probably should have said TMI-4, not TMI-5, but in any event it was inferred from your string. [50 50 -30] is a 3-attack string, which takes >3 seconds of time. Thus, any time window >3 seconds will include at least 3 attacks in a row. The point is that when the boss can hit you for half of your health or more in one hit, you want to be using a TMI-# that limits you to two attacks, not three.

            But that’s really what it all boils down to – using the tool properly. If the boss can kill you within 2 seconds of time, then measuring smoothness over 6 seconds is somewhat silly. If the boss takes a minimum of 4 seconds to kill you, there’s no reason you would care about a 2-second moving average. If you were using any other tool to analyze your survivability, you would be adapting your method to the situation appropriately. I’m not sure why you’re complaining that TMI requires the same attention to detail.

            In short, you’re still complaining that the caliper isn’t equally good for measuring an ant, a squirrel, and an elephant. But I haven’t given you just a caliper. I designed the metric to be adjustable so it could be applied to a variety of situations. TMI-2 is your caliper, TMI-4 is your ruler, TMI-6 is your tape measure. It’s still up to you to choose the appropriate one to use.

            At best, I can imagine a composite TMI calculation that mixes TMI-2, TMI-4, and TMI-6 (or any other TMI-X combinations) according to boss hit size relative to health. The weight function would be sort of arbitrary though, and I’m not convinced it gives you any more information than simply choosing the most appropriate time window and using that.

          • Qaajn says:

            The most dangerous period doesn’t have to come with the fastest or hardest attacks, however. In my experience, it often comes when healers are unable to heal or are distracted by other healing requirements, which may or may not come from failure by the raid to properly react to something.

            I’ve read a few of those blogs, and I agree that it is harder to model health if you don’t model healing, and I touched on why that’s not a viable way to handle it myself. The metric I proposed earlier was very weak, because it didn’t give any consistent results. However, a new one came to mind after my last reply as I was riding my bike.

            I called it floating TMI… I’m not sure if that even makes sense, but it could be considered healing required each second.

            Start each string with full health
            Continue with events until “death”
            Calculate overkill/second

            I’ll give a numerical example too. Imagine this string of attacks (and self-heals/absorbs):
            [0 30 0 30 30 30 -18 0 30 60 30 -24 15]
            For simplicity, imagine 2 seconds between each event. The overkill/second for each starting point would be
            [2 2.5 0.17 0.2 3.2 0.25 0.25 3.33 5 n/a n/a n/a n/a]
            The first number would be from [0 30 0 30 30 30] which is 120% damage over 10 seconds, while the highest would be from [30 60 30] also being 120% but over 4 seconds.
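The numbers in this example can be reproduced with a minimal sketch of the three steps listed above. Several details are assumptions made to match the posted values and are not stated in the comment: healing is not capped at full health (the running total may go negative), "death" means the running total reaches 100%, elapsed time is counted from the first event of each string, and a death on the very first event is reported as undefined. The function name `overkill_per_second` is hypothetical.

```python
def overkill_per_second(events, start, spacing=2.0, health=100):
    """Overkill per second for the string that begins at index `start`.

    Assumptions (not from the original comment): heals are not clamped at
    full health, death occurs when cumulative damage reaches `health`, and
    events are `spacing` seconds apart. Returns None when the string never
    kills the tank (the "n/a" entries) or when death occurs on the first event.
    """
    taken = 0
    for offset, change in enumerate(events[start:]):
        taken += change
        if taken >= health:
            elapsed = offset * spacing
            if elapsed == 0:
                return None  # assumption: a one-shot has no defined rate
            return (taken - health) / elapsed
    return None


events = [0, 30, 0, 30, 30, 30, -18, 0, 30, 60, 30, -24, 15]
rates = [overkill_per_second(events, s) for s in range(len(events))]
rounded = [None if r is None else round(r, 2) for r in rates]
print(rounded)
```

Under these assumptions the result matches the list in the comment. The seventh entry (0.25) is only reproduced if the leading -18 is allowed to push the tank above full health, which is one place a capped health pool would behave differently.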

  4. Zaeron says:

    Hey Theck – Just a quick question.

    I’m nearing completion on my epic cloak quest, and I had been planning on taking the crit/haste/mastery cloak. In the light of the new info available about the tank proc (which seems hilariously OP) on the dodge/parry/mastery cloak, do you think I should suck it up and take the avoidance itemized cloak?

    • Jackinthegreen says:

      Keep in mind Blizz has hinted at letting players get more than one cloak since not allowing it would “penalize hybrids.” So you might be able to get both of them and then test which one feels better.

      As you said though, the proc is quite good assuming it just absorbs a big hit instead of a small DoT tick or something. Having an automatic “ohsnap” button might be worth giving up the 2.16% haste.

      It does have me curious whether the DPS cloak’s proc would proc SoI or perhaps even Battle Healer glyph.

    • Theck says:

      I don’t really know what to tell you. I think the proc is hilariously OP, so I will probably be switching to the tank cloak regardless of stats. I won’t be happy about it, mind you, but that proc outweighs the haste in my mind.

      Many players seem to disagree with me about that. They’re wrong, of course, but they’re welcome to their opinion. The reason I know they’re wrong is that Meloree and I agree that the proc is massively OP. Since Meloree and I rarely agree on anything, that means we must be right. Air-tight logic!

      I suppose my advice is “wait until the very last second to make a choice in the hopes that we get more information.” If it looks like we’re tied to the tanking one and the proc isn’t getting nerfed, take the tank one and be disappointed in your state-mandated double-avoidance itemization. Just like the rest of us will be after we sink 7k (or whatever it costs) to switch cloaks.

      If you’re not keen on the proc, or it looks like it’s getting nerfed (or that we can have multiple legendary cloaks?), then take the DPS one and enjoy the thrill while it lasts. Just keep that 7k in reserve for 5.4, I guess.

      • Meloree says:

        We agree on almost everything.. after we’ve argued for a while and you come around to my way of thinking.

      • Zaeron says:

        Thanks. I kinda figured that proc would outweigh any amount of stats, but I figured I should ask since I’ve been wrong about stuff before.

        I appreciate the advice!

        • Thels says:

          Has it been determined that the enchant is based on the cloak you chose, and not a separate option?

          If I can get the tank proc on the Haste cloak, I’ll be totally over that. :)

          • Jackinthegreen says:

            It’d be a bit bonkers if the proc wasn’t tied to the cloak, specifically because of the situation you’ve described of getting the tank proc on a DPS cloak.

          • Thels says:

            Yeah, you’re right about that. It’s too bad the agi tank one has crit, not haste, else the agi cloak might’ve been worth considering. :P

          • Jackinthegreen says:

            I think they probably did that intentionally so paladins wouldn’t pick it up. Though it would be a waste of the agility anyway since that stat no longer does anything meaningful for plate-wearers.

          • queldan says:

            Implying that Str does much beyond padding DPS meters :-p

          • Jackinthegreen says:

            Strength does have uses, though for survivability it’s not really that big. It boosts parry (cue the soundtrack where everyone gives a sarcastic “yay!”), and it also boosts SoI and WoG healing and SS shielding. Haste would have probably been better though, yes.

          • Thels says:

            Yeah, so if the question was Strength + Parry or Agility + Haste, I wouldn’t be so sure… But since it’s Crit on the Agi tank cloak, it’s a nonissue.

  5. Geodew says:

    I think you have a minor error. The equation for MA_i = blah sums from 1 to T*dt. I think it should be T/dt. If you have bin widths (aka time between samples or whatever) of 0.5 seconds and a 6 second window, you should be summing 12 samples, not 3. Right?
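The arithmetic behind this correction, using only the numbers in the comment (a window of T = 6 seconds and a bin width of dt = 0.5 seconds), works out as follows. The symbol N for the number of samples in the window is introduced here for illustration.

```latex
N = \frac{T}{dt} = \frac{6\ \text{s}}{0.5\ \text{s}} = 12
\qquad\text{whereas}\qquad
T \cdot dt = 6 \times 0.5 = 3
```

So an upper limit of T/dt covers all 12 samples in the window, while T*dt would cover only 3.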

  6. Pingback: (Re)-Building A Better Metric – Part II | Sacred Duty
