(Re)-Building A Better Metric – Part I

A few weeks ago, I posted a request for data to test out a new implementation of TMI. This follow-up post took longer than expected, for a number of reasons. A busy semester, wedding planning, and the Diablo 3 expansion were all contributing factors.

However, the most important factor is that the testing uncovered a few weaknesses that I felt were significant enough to warrant fixing. So I went back to the math and worked on revising it, in the hopes of hitting on something better. And I’m happy to say that I think I’ve succeeded, to the point that I feel TMI 2.0 will become an incredibly useful tool for tanks to evaluate their performance.

But before I get to the new (and likely final) implementation, I think it’s worth talking about the data. After all, many of you were generous enough to take the time to run simulations for me and submit the results, so I think I owe you a better explanation of what that data accomplished than “Theck changed stuff.”

To do that with sufficient rigor, though, I need to start from the beginning. If you recall, about nine months ago I laid out a series of posts entitled “The Making of a Metric,” which explained the thought process involved in designing TMI. Without re-hashing all of those posts: we were trying to quantify the qualitative analysis of damage histograms that we had been doing in table form. Most of the analysis and discussion in those posts centered around the numerical aspects of the metric. To give a few examples, we discussed:

  • How we thought that a spike that’s 10% larger should be worth about three times as much in the metric (the “cost function,” for those familiar with control theory or similar fields)
  • The problem of edge effects that were caused by attempting to apply a finite cut-off or minimum spike size
  • What normalization conditions should be applied to keep the metric stable across a variety of situations

and so on.

However, none of that discussion addressed what would eventually be a crucial (and in the case of our beta test results, deciding) factor: what makes a good metric? I was intently focused on the mathematics of the problem at the time, and more or less assumed that if the math worked well then the metric would be a good one.

Suffice it to say, this assumption was pretty wrong.

What Does Make a Good Metric?

So when I sat down late last year to start thinking about how I would revise the metric, I approached it from a very different direction. I made a list of constraints that I felt a good metric would satisfy, which I could then apply to anything I came up with to see if it “passed.” This is that list:

  1. First and foremost, the metric should accurately represent the threat of damage spikes. That actually encompasses several mini-constraints, most of which are numerical ones.
    • For example, it should take into account spike magnitude and spike frequency, because it’s more dangerous to take three or four spikes of size X than it is to take one spike of size X.
    • It should filter the data somehow, such that the biggest spikes are worth considerably more than smaller ones are.
    • However, it also can’t filter so strongly that it ignores ten spikes that were 120% of your health just because you took one spike of 121%.
    • The combination of those three points means that it has to filter continuously (i.e. smoothly), so we can’t use max() or min() functions.

    In short, these are basically the numerical constraints that I applied to build the original version of TMI. Ideally, I would like the new metric to continue generating the same quality of results, but to tweak the numbers to change the presentation.

  2. It should work seamlessly in programs like Simulationcraft and on sites like World of Logs, Warcraft Logs, and AMR’s new combat log analysis tool. Working in Simcraft is obvious. That was one major reason I joined the SimC dev team. But wanting it to be useful on logging sites is a broader constraint – it means that it needs to work in a very wide range of situations, including every boss fight that Blizzard throws at us. If it’s only useful on Patchwerk under simulated conditions, it’s probably not general enough to mean anything.

    This also means that it should work with SimC’s default settings. I want to have to do as little messing around with SimC’s internals as possible.  This will come up again, so I want to mention it explicitly here.

  3. It should generate useful stat weights when used in Simulationcraft. One of the primary goals of the original metric was to be able to quantify how useful different stats were. If the metric produces garbage stat weights, it’s a garbage metric.

  4. Similarly, it should produce useful statistics. Another major drawback of the old version was that the TMI distributions were highly skewed thanks to the exponential nature of the metric. That meant that the distribution in no way represented a normal distribution, which made certain statistical measures mostly useless. A new version should (hopefully) fix that.

  5. It should be easily interpreted. Ideally, someone should be able to look at the number it produces and immediately be able to infer a meaning. Good, bad, or otherwise, you shouldn’t need to go to a blog post to look up what it means to have a TMI of 50k.

    I was never very happy with this part of the original metric. The meaning wasn’t entirely clear, because it was an arbitrary number. You’d have to read (and remember) the blog post to know that a factor of 3 in TMI corresponded to taking spikes that were 10% of your health larger (i.e. going from spikes of 80% of your health to spikes of 90% should triple your TMI).

  6. Ideally, the numbers should be reasonable. This was arguably the biggest failing of the original version of TMI, and something that Wrathblood and I have argued about a lot. While it’s nice mathematically that a bigger spike creates an exponentially worse value, the majority of players do not think in orders of magnitude.

    I may have no problem understanding a TMI going up from 50 thousand to over 1 million as a moderate change, because I’ve been trained to work with quantities that vary like that in a scientific context. But the average user hasn’t been trained that way, and thus saw that as an enormous difference – much larger than going from 2.5k to 50k, even though both represent the same change in spike size.

    The size of the change was part of the original goal, of course – to emphasize the fact that it was significantly worse to take a larger spike. But that’s not how the average user interpreted it. Instead, their initial reaction was to assume that the metric was broken, because surely they hadn’t suddenly gotten 20 times worse just by bumping the boss up from normal to heroic. Right? Well, that’s exactly what the metric was saying, and should have been saying: a factor of 20 in the old metric corresponds to a spike-size increase of $10\log_3 20 \approx 27\%$ of their health. But the message wasn’t getting across.

    In retrospect, I think I know why, and it was tied to item #5: the meaning of the metric wasn’t entirely clear, at least to someone who hadn’t gotten down and dirty with the math behind it. So instead, they assumed the metric was in error, or faulty, or something else.

Those were the six major constraints I set out to abide by in my revisions. Pretty much anything else I could come up with was covered by one or more of those, either explicitly or implicitly.

Now, with this rubric, we can take a look at the results of the beta test and see how the original revision of the metric performed. But first, I want to talk briefly about the formula I chose to use for those that are interested. Fair warning, the next section is fairly mathy – if you don’t care about those details, you may want to skip to the “Beta Test Results” section.

Beta Test Formula

Let’s first assume we have an array $D$ containing the damage taken in each time bin of width $dt$. I’m going to leave $dt$ general, but if it helps you visualize it just pretend that $dt=1$, so this array is just the damage you take in every one-second period of an encounter. We construct a $T$-second moving average array of that data just as we did in the original definition of the metric:

$$\large MA_i = \frac{T_0}{T}\sum_{j=1}^{T / dt} D_{i+j-1} / H$$

The new array $MA$ created by that definition is essentially just the moving average of the total damage you take in each $T$-second period, normalized to your current health $H$; the factor $T_0/T$ rescales windows of other lengths to the standard $T_0$-second window. By default I’ve been using $T = T_0 = 6$, so nothing about this part changed – it’s still the same array of damage taken in each six-second period for the entire encounter.
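For those who find code easier to parse than sums, here’s a minimal Python sketch of that construction (the function name and NumPy usage are mine, for illustration – this is not SimC’s actual implementation):

```python
import numpy as np

def moving_average_array(D, H, dt=1.0, T=6.0, T0=6.0):
    """Build the normalized moving-average array MA from an array D of
    damage taken in each bin of width dt, given current health H.
    The factor T0/T rescales a T-second window to the standard
    T0-second window; with the defaults T = T0 = 6 it is just 1."""
    D = np.asarray(D, dtype=float)
    w = int(round(T / dt))              # bins per window
    N = len(D) - w + 1                  # length of the MA array
    return np.array([(T0 / T) * D[i:i + w].sum() / H for i in range(N)])
```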

If you recall, the old formula took this array and performed the following operation:

$$\large {\rm Old\_TMI} = \frac{C}{N} \sum_{i=1}^N e^{10\ln{3} ( MA_i - 1 ) } = \frac{C}{N}\sum_{i=1}^N 3^{10(MA_i-1)}$$

Where $C$ was some mess of normalization and scaling constants, and $N$ was the length of the $MA$ array.

This formed the basis of the metric – the bigger the spike was, the larger $MA$ would be, and the larger $3^{10(MA_i-1)}$ would be. Due to the exponential nature of this function, large spikes would be worth a lot more than small ones, and one really large spike would be worth considerably more than lots of very little ones.

The formula that I programmed into Simulationcraft for the beta test was this:

$$ \large {\rm Beta\_TMI} = c_1 \ln \left [ 1 + \frac{c_2}{N} \sum_{i=1}^N e^{F(MA_i-1)} \right ] $$

where the constants ended up being $F=10$, $c_1=500$ and $c_2=e^{10}$. Let’s discuss exactly how this differs from ${\rm Old\_TMI}$.
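Here’s how that might look in code, continuing the illustrative Python sketch from above (again, a mock-up rather than the actual SimC source):

```python
import numpy as np

def beta_tmi(MA, F=10.0, c1=500.0, c2=np.exp(10.0)):
    """Beta_TMI = c1 * ln(1 + (c2 / N) * sum_i exp(F * (MA_i - 1)))."""
    MA = np.asarray(MA, dtype=float)
    N = len(MA)
    total = np.exp(F * (MA - 1.0)).sum()
    return c1 * np.log(1.0 + (c2 / N) * total)

# Sanity check: a healing-dominated encounter (every MA element strongly
# negative) lands on the zero-bound discussed later in the post:
print(beta_tmi(np.full(450, -5.0)))     # ~0
```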

It should be clear that what we have is roughly

$$ \large {\rm Beta\_TMI} \approx c_1 \ln \left [ 1 + \chi {\rm Old\_TMI} \right ]$$

where $\chi$ is some scaling constant. That statement is only approximate, however, because ${\rm Old\_TMI}$ used a slightly different exponential factor in the sum. In the old version, we summed a bunch of terms that looked like this:

$$\large e^{10\ln 3 (MA_i-1)} = 3^{10(MA_i-1)},$$

while in the new one we’re raising $e$ to the $F(MA_i - 1)$ power:

$$\large e^{F(MA_i-1)}.$$

In other words, the constant $F$ is our “filtering power,” just as $10\ln 3$ was our filtering power in ${\rm Old\_TMI}$. The filtering power is a little bit arbitrary, and after playing with the numbers I felt that there wasn’t enough of a difference to warrant complicating the formula. By choosing $F=10$, a change of 0.1 (10% of your health) in $MA_i$ increases the value of the exponential by a factor of $e\approx 2.718$. For comparison, in ${\rm Old\_TMI}$ increasing the spike size by 10% increased the value of the exponential by a factor of 3. So we’re not filtering out weaker attacks quite as strongly as before, but again, the difference isn’t that significant. The main advantage of doing this is that it simplifies the formula – that’s about it.

So with that caveat, what we’re doing with the new formula is taking a natural logarithm of something close to ${\rm Old\_TMI}$. For those that aren’t aware, a logarithm is an operation that extracts the exponent from a number. Taking the logarithm “base $b$” of the number $b^a$ gives you $a$, or

$$\large \log_b \left ( b^a \right ) = a$$
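For example, $\log_{10}(1000) = 3$, because $10^3 = 1000$.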

There are a few logarithms that show up frequently in math. For example, when working in powers of ten, you might use the logarithm “base-10,” or $\log_{10}$, also known as the “common logarithm.” If what you’re doing uses powers of $e$, then the “natural logarithm” or “base-$e$” log ($\log_{e}$) might be more appropriate. Binary logarithms (“base-2” or $\log_2$) are also common, showing up in many areas of computer science and numerical analysis.

In this case, we’re using the natural logarithm $\log_e$, which can be written $\log$ or $\ln$ depending on which textbook or website you’re reading. I’m using $\ln$ because it’s unambiguous; some books will use $\log$ to represent the common log and others will use it to represent the natural log, but nobody uses $\ln$ to represent anything but the natural log.

To figure out how this new formula behaves, let’s consider a few special cases. First, let’s consider the limit where the sum in the equation comes out to be zero, or at least very small compared to $N/c_2$. This might happen if you were generating so much healing that your maximum spike never got close to threatening your life. In other words, if your ${\rm Old\_TMI}$ was really really small. In that situation, the second term is essentially zero, and we have

$$\large {\rm Beta\_TMI} \approx c_1 \ln \left [ 1 + 0 \right ] = 0,$$

because $\ln 1 = 0$. In other words, adding one to the result of the sum before taking the log zero-bounds the metric, so that we’ll never get a negative value. This was a feature of the old formula just due to its definition, and something I sort of liked, so I wanted to keep it. It has a side effect of introducing a “knee” in the formula, the meaning of which will be clearer in a few minutes when we look at a graph.

But before we do so, I want to consider two other cases. First, let’s assume we have an encounter where we take only a single huge spike, and no damage the rest of the time. We’ll approximate this by saying that all but one element of the $MA$ array is a large negative number (indicating a large excess of healing), and that there’s one big positive element representing our huge spike. In that case, we can approximate our sum of exponentials as follows:

$$\large \sum_{i=1}^N e^{F(MA_i-1)} \approx e^{F(MA_{\rm max}-1)}.$$

Let’s also make one more assumption, which is that this spike is large enough that $c_2 e^{F(MA_{\rm max}-1)}/N \gg 1$, so that we can neglect the first term in the argument of the logarithm. If we use these assumptions in the equation for ${\rm Beta\_TMI}$ and call this the “Single-Spike” scenario, we have the following result:

$$\large {\rm Beta\_TMI_{SS}} \approx c_1\ln\left [ \frac{c_2}{N} e^{F(MA_{\rm max}-1)} \right ] = c_1\left ( \ln c_2 - \ln N \right ) + c_1 F \left ( MA_{\rm max} - 1 \right ), $$

where I’ve made use of two properties of logarithms, namely that $\log(ab)=\log(a)+\log(b)$ and that $\log(a/c) = \log(a)-\log(c)$. We can put this in a slightly more convenient form by grouping terms:

$$\large {\rm Beta\_TMI_{SS}} \approx c_1 F MA_{\rm max} + c_1 \left ( \ln c_2 - \ln N - F \right ) $$

This form vaguely resembles $y=mx+b,$ a formula you may be familiar with. And putting it in that form makes the effects of the constants $c_1$ and $c_2$ a little clearer.

We’re generally interested in how the metric scales with $MA_{\rm max}$, which is a direct measurement of maximum spike size. It’s clear from this form that ${\rm Beta\_TMI_{SS}}$ is linear in $MA_{\rm max}$, with a slope equal to $c_1 F$. So for a given filtering strength $F$, the constant $c_1$ determines how many “points” of ${\rm Beta\_TMI}$ you gain by taking a larger spike. Since $F=10$, $c_1$ is the number of points that corresponds to a spike that’s 10% of your health larger.

So if your biggest spike goes up from 130% of your health to 140% of your health, your ${\rm Beta\_TMI}$ goes up by $c_1$. Note that this isn’t a factor of $c_1$, it’s an additive amount. If you go from 130% to 150%, you’d go up by $2c_1$ rather than $c_1^2$.

This was the point of taking the logarithm of the old version of TMI. It takes a metric that scales exponentially and turns it into one that’s linear in the variable of interest, $MA_{\rm max}$. If done right, this should keep the numbers “reasonable,” insofar as you shouldn’t get a TMI that suddenly jumps by 2 or 3 orders of magnitude by tweaking one thing. The downside is that it masks the actual danger – your score doesn’t go up by a factor of X to indicate that something is X times as dangerous.

Once you have $F$ and $c_1$, the remaining constant $c_2$ controls your y-intercept, and is essentially a way to add a constant amount to the entire curve. It doesn’t affect the slope of the result, it just raises or lowers all TMI values by $\approx c_1 \ln c_2$.
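For example (my arithmetic, using the constants above): doubling $c_2$ would shift every TMI value up by $c_1 \ln 2 \approx 500 \times 0.693 \approx 347$ points, without changing how the metric responds to spike size.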

The other case I want to consider before going forward is one in which you’re taking uniform damage. In other words, every element of $MA$ is the same, and equal to $MA_{\rm max}$. In that case, the sum becomes

$$\large \sum_{i=1}^N e^{F(MA_i-1)} = \sum_{i=1}^N e^{F(MA_{\rm max}-1)} = Ne^{F(MA_{\rm max}-1)}.$$

In this case, the $N$’s cancel and we have

$$\large {\rm Beta\_TMI_{UF}} = c_1 \ln \left [ 1 + c_2 e^{F(MA_{\rm max}-1)} \right ]$$

If we make the same assumption that the second term in brackets is much larger than one, this is approximately

$$\large {\rm Beta\_TMI_{UF}}\approx c_1\ln c_2 + c_1 \left [ F (MA_{\rm max}-1)\right ],$$

or in $y=mx+b$ form:

$$\large {\rm Beta\_TMI_{UF}} \approx c_1 F MA_{\rm max} + c_1 (\ln c_2 – F ).$$

The difference between the uniform case and the single-spike case is just a constant offset of $c_1 \ln N$. So we get all the same behavior as the single-spike case, just with a slightly higher number. The uniform and single-spike cases are the extremes, so we expect real combat data to fall somewhere in-between them.
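If you want to convince yourself that the exact formula really does interpolate between these two linear approximations, here’s a quick numerical check (using the same illustrative constants as before, and an assumed encounter length of 450 one-second bins):

```python
import numpy as np

F, c1, c2 = 10.0, 500.0, np.exp(10.0)
N, MA_max = 450, 1.3          # assumed: 450 one-second bins, 130%-of-health max spike

# Exact single-spike case: one large element, the rest dominated by healing
MA = np.full(N, -10.0)
MA[0] = MA_max
exact_ss = c1 * np.log(1 + (c2 / N) * np.exp(F * (MA - 1)).sum())

# The two linear approximations derived above
approx_ss = c1 * F * MA_max + c1 * (np.log(c2) - np.log(N) - F)   # single-spike line
approx_uf = c1 * F * MA_max + c1 * (np.log(c2) - F)               # uniform line

print(exact_ss, approx_ss, approx_uf)
# ~3446, ~3445, 6500: the exact value sits on the single-spike line,
# a constant c1*ln(N) ~ 3055 points below the uniform line.
```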

On a graph, this would look something like the following:

Simulated TMI data using the Beta_TMI formula. Red is the uniform damage case, blue is the single-spike case, and green is pseudo-random combat data.

This is a plot of ${\rm Beta\_TMI}$ against $MA_{\rm max}$ for some simulated data that shows how the new metric behaves as you crank up the maximum spike the player takes. The red curve is what we get in the uniform case, where every element of $MA$ is identical. The blue curve is the single-spike case, where we only have one large element in $MA$. The green dots are fake combat data, in which each attack can be randomly avoided or blocked to introduce variance.

The first thing to note is that when $MA_{\rm max}$ is very large, the blue and red curves are both linear, as advertised. Likewise, the green dots always fall between those two curves, though they tend to cluster near the single-spike line. In real combat, you’re going to avoid or block a fair number of attacks, and the randomness of those processes eliminates the majority of cases where you take four full hits in a 6-second window.

You can also see the “knee” in the graph I was talking about earlier. At an $MA_{\rm max}$ of around 0.6, the blue curve starts, well, curving. It’s no longer linear, because we’ve transitioned into a regime where the “1+” is no longer negligible, and we can’t ignore it. The red curve has a similar knee, but it occurs closer to zero (as intended, based on the choice of $c_2$). As you get closer to the knee, the metric shallows out, meaning that changes in spike size have less of an effect on the result. This makes some intuitive sense, in that it’s not as useful to reduce spikes that are already below the danger threshold.
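We can even estimate where those knees should be (a quick back-of-the-envelope calculation, not something from the beta test itself). The knee occurs roughly where the two terms inside the logarithm become comparable – in the single-spike case, where $c_2 e^{F(MA_{\rm max}-1)}/N = 1$. Solving for $MA_{\rm max}$ gives

$$\large MA_{\rm knee} \approx 1 + \frac{\ln N - \ln c_2}{F} = 1 + \frac{\ln N - 10}{10},$$

which for an encounter a few hundred bins long ($N=450$, $\ln N \approx 6.1$) lands at $MA_{\rm knee} \approx 0.6$, right where the blue curve starts to bend. In the uniform case the $N$ cancels, leaving $MA_{\rm knee} \approx 1 - (\ln c_2)/F = 0$, which is why the red curve’s knee sits near zero.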

The constants $c_1$ and $c_2$ were chosen mostly by tweaking this graph. I wanted the values to be “reasonable,” so I was aiming for values between around 1000 and 10000. The basic idea was that if you were taking 100% of your health in damage, your TMI value would fall between about 2000 and 2500, and then scale up (or down) from there in increments of 500 for every 10% of health increase in maximum spike size.
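As a sanity check on those targets, we can plug the chosen constants into the single-spike form from earlier (again assuming $N = 450$, so $\ln N \approx 6.1$):

$$\large {\rm Beta\_TMI_{SS}} \approx c_1 F MA_{\rm max} + c_1 \left ( \ln c_2 - \ln N - F \right ) \approx 5000\, MA_{\rm max} - 3050.$$

A maximum spike of 100% of your health ($MA_{\rm max}=1$) then gives roughly 1950 on the single-spike line – real combat data sits a bit above that line, in the targeted 2000–2500 range – and every additional 10% of health in spike size adds $c_1 = 500$ points.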

So that’s the beta version of the metric. Now let’s look at the results of the beta test, and see why I decided to go back to the drawing board instead of rubber-stamping ${\rm Beta\_TMI}$.

Beta Test Results

The spreadsheet containing the data is now public, and you can access it at this link.

The data we have doesn’t include the moving average arrays used to generate the TMI value, so we can’t make a plot like the one I have above. We can generate a lot of other graphs, though, and trust me, I did. I plotted more or less everything that I thought could give me relevant information about its performance. I could show you histograms and scatter plots that break down the submissions by average ilvl, TMI, boss, class, and stat weight. But while I had to sift through all of those graphs, I’m not sure it’s a productive use of time to dissect each of them here.

Instead, let’s look at a few of the more significant ones. First, let’s look at Beta_TMI vs ilvl for all classes against the T16N10 boss:

Beta_TMI vs. ilvl for all classes, T16N10 boss.

The T16N10 boss had the highest response rate from all classes. The general trend here is obvious – as ilvl goes up, Beta_TMI goes down, indicating that you’re more survivable against this boss. Working as intended. The range of values isn’t all that surprising, given that not all of the Simcraft modules are as well-refined as the paladin and warrior ones. But at least on this plot the metric appears to be working fine.

If we want to see how a single class looks against different bosses, we can. For example, for warriors:

Beta_TMI vs. ilvl for warriors, all bosses.

Again, the trends are pretty clear. Improving gear reduces TMI, as it should. Some of these data points come from players that tested against several different bosses in the same gear set, and those also give the expected result – crank up the boss, and the TMI goes up.

Another neat advantage is that the statistics finally work well. In other words, if you ran a simulation for 25k iterations before, you’d get an ${\rm Old\_TMI}$ distribution plot that looked like this:

TMI distribution using the old definition of TMI.

And this was an ideal case. It was far more common to have a really huge maximum spike, such that the entire distribution was basically one bin at the extreme end of the plot. It also meant that the metrics Simulationcraft reported (like “TMI Error”) were basically meaningless. However, with the Beta_TMI definition, that same plot looks like this:

TMI distribution generated using the Beta_TMI definition.

This looks a whole lot more like a normal distribution, and as a result works much more seamlessly with the standard error metrics we’re used to using and reporting.
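As a toy illustration of why the logarithm fixes the statistics (illustrative numbers, not SimC output): exponentiating a normally-distributed quantity produces a heavily right-skewed, log-normal distribution, and taking the log undoes that.

```python
import numpy as np

rng = np.random.default_rng(0)
spikes = rng.normal(1.0, 0.05, 25_000)   # per-iteration max spike, ~N(100% of health, 5%)

old_like = np.exp(10 * (spikes - 1))             # exponential metric: long right tail
new_like = 500 * np.log(np.exp(10) * old_like)   # log-transformed: linear in `spikes`

def skewness(x):
    """Sample skewness: ~0 for a symmetric (e.g. normal) distribution."""
    return np.mean((x - x.mean()) ** 3) / x.std() ** 3

print(skewness(old_like))   # strongly positive (~1.7)
print(skewness(new_like))   # ~0 – a normal-looking distribution again
```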

So on the surface, this all appears to be working well. Unfortunately, when we look at stat weights, we run into some trouble. Because a lot of them looked something like this:

Example stat weights generated using Beta_TMI.

The problem here should be obvious, in that this doesn’t tell us a whole lot about how these stats are performing. Rounding to two decimal places means we lose a lot of granularity.

Now, to be fair, they aren’t all this bad. This tended to happen more frequently with players that were overgeared for the boss they were simming. In other words, on players that were nearing the “knee” in the graph. But enough of the stat weights turned out like this for me to consider it a legitimate problem.

Note that Simcraft only rounds the stat weights for the plot and tables. Internally, it keeps much higher precision. As a result, the normalized stat weights looked fairly good. But by default, it plots the un-normalized ones.
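To make the rounding problem concrete, here’s a trivial example with made-up weights (not real SimC output):

```python
raw = {"stamina": 0.013, "mastery": 0.021, "haste": 0.008}

# What the default report effectively shows – two decimal places:
rounded = {k: round(v, 2) for k, v in raw.items()}
# {'stamina': 0.01, 'mastery': 0.02, 'haste': 0.01}  <- granularity lost

# Normalizing to the largest weight before rounding recovers the differences:
top = max(raw.values())
normalized = {k: round(v / top, 2) for k, v in raw.items()}
# {'stamina': 0.62, 'mastery': 1.0, 'haste': 0.38}
```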

I could fix this by forcing SimC to plot normalized stat weights if it’s scaling over TMI, but this comes into conflict with goal #2. Ideally, I’d like it to work well with the defaults, so that I don’t have to add little bits of code all over SimC just to get it to work at all.

And more to the point, this is a more pervasive issue. If health is really doubling in Warlords, and the healing model really is changing, we may start caring about smaller spikes than before. It isn’t good for the metric to be muting the stat weights in those regions.

In fact, now seems like as good a time as any to go through our checklist and grade this version of the metric. So let’s do that.

  1. Accurately representing danger: Pass. At least insofar as more dangerous spikes give a higher TMI, it’s doing its job. We could debate whether the linear scaling is truly representative (and Meloree and I have), but the fact of the matter is that we tried that with version 1, and it led to confusion rather than clarity. So linear it is.

  2. Work seamlessly: Eh…. There’s nothing in SimC that prevents it from working, and it’s vastly improved in this category compared to the first version of the metric because the default statistical analysis tools work on it. But the stat weights really need to be fixed one way or another, which either means tweaking SimC to treat my metric as a special snowflake, or changing the metric. Not super happy about that, so it’s on the border between passing and failing. If I were assigning letter grades, it would be a C-. The original metric would flat-out fail this category.

  3. Generate useful stat weights: Eh…. Again, it’s generating numeric stat weights that work, but only after you normalize them. I’m not sure if the fault really lies in this category, but at the same time if the metric generated larger numbers to begin with, we wouldn’t have this problem.

  4. Useful statistics: Pass. This is one category where the new version is universally better.

  5. Easily interpreted: Fail. If someone looks at a TMI score of 4500, can they quickly figure out that it means they’re taking spikes that are around 140% to 150% of their health? Not unless they go back and look up the blog post, or have memorized that 100% of health is around 2000 TMI and each 10% of health is about 500 TMI.

    In fact, I’d go as far as to say that this is very little improvement over the original in terms of ease of understanding. The linearity is nice, and the numbers are “reasonable,” but the link between the value and the meaning is still pretty arbitrary and vague.

  6. Numbers should be reasonable: Pass. At the very least, taking the logarithm makes the numbers easier to follow.

All in all, that scorecard isn’t very inspiring. This may be an improvement over the original in several areas, but it’s still not what I’d call a great metric. Generating nice stat weights is important, and it’s not doing a great job of that, but that could be fixed with a few workarounds. But failing at #5 is the real kicker. We rationalized that away in version 1 by treating this like a FICO score, an arbitrary number that reflects your survivability. But the more time I spent trying to refine the metric, the more certain I became that this was a fairly significant flaw.

To make TMI more useful, it needs to be more understandable. Period. And it was only after a discussion with a friend about the stat weight problem that the solution to the “understandability” problem became clear.

In Part II, I’ll discuss that solution and lay out the formal definition of the new metric, as well as some analysis of how it works and why.


7 Responses to (Re)-Building A Better Metric – Part I

  1. Dalmasca says:

    I have really enjoyed this series of posts, Theck! I’m looking forward to seeing how you tackled the stat weight and “understandability” problems!

    Thanks for all the hard work!

  2. Bindura says:

    Awesome, I’ve been looking forward to this since your crowdsourcing post. I’m also interested in your approach to turning TMI from somewhat of an abstract number into something more “real”.

    An idea I came up with as I read the closing paragraph (bearing in mind I have no idea of the limitations of coding in SimC) would be to have a “control” profile of an appropriately geared toon and compare TMI back to it. A value of 2 would mean you are twice as susceptible to spikes as the control, or something like that.

    This would ground the metric in something tangible, but has some pretty obvious drawbacks. The first being the creation of control profiles for each boss for each tank. That’s 20 different profiles to maintain. The second is defining appropriately geared. Treckie/Sco could make a case for the gear they are wearing when they first pull a boss as they expect bosses to be tuned to be killable (just) with the gear they have when they first reach it.

    Umm, yeah, I’ll stop there. That’s how I would have gone about making TMI understandable – looking forward to seeing how you did it.


  3. Jackinthegreen says:

    And to throw another wrench into the works of simulating stats, the current multi-strike passive for prot pallies is based on incoming heals: http://wod.wowhead.com/spell=159374

    Then again, the other MS passives are based on auto attacks: http://wod.wowhead.com/spell=159232 for example. It’ll certainly be interesting to see the interaction between haste and MS on that front.

  4. emruseliavery says:

    It seems I did indeed mess up with the tags. I cannot find an option to delete my original reply – is that possible for you? Well, here goes the second attempt. I do hope the latex tags aren’t author-only, otherwise this will truly look spammy.

    It seems to me that the constant C is designed only to adjust Old_TMI into a preferred range. But now that we use the logarithm to produce the Beta_TMI, the C constant is no longer necessary in its current form. However, I find it interesting that you define $c_2$ as $e^F$, as this essentially simplifies the equation and the results by getting rid of the -1 term in the moving average.

    Thus, using $c_2 = e^F$:

    $$\large {\rm Beta\_TMI} = c_1 \ln \left [ 1 + \frac{1}{N} \sum_{i=1}^N e^{FMA_i} \right ]$$

    Secondly, your single-spike and uniform TMIs both become very neat. Respectively:

    $$\large {\rm Beta\_TMI_{SS}}\approx c_1 F MA_{\rm max} – c_1 \ln N$$

    $$\large {\rm Beta\_TMI_{UF}}\approx c_1 F MA_{\rm max}$$

    The final part I wanted to discuss is how to give the TMI meaning by giving the constants meaning.

    I believe the still somewhat arbitrary feel of the Beta_TMI comes from the arbitrary nature by which $c_1$ was selected. Assuming $e^{F\,MA}/N \gg 1$, then $c_1 F$ is the increment in TMI from taking 100% more damage. I would suggest correlating $c_1 F$ with that 100% by defining $c_1 F = 100$. In this way, improving (lowering) one’s suggested TMI by 1 would reduce the spikiness by the equivalent of taking 1% less damage. (Actually 0.99%, since it’s 1-1/1.01.)

    This would naturally only make the unnormalized scale factors reported by Simcraft even smaller. However, that’s fine in my opinion: the scale factors report how much 1 more stat point would give, and in terms of survivability prior to the stat squish, the benefit of a single stat point really is that small.

  5. Çapncrunch says:

    I have a feeling the “solution” to understandability is going to involve converting the TMI value to be representative of the average (probably not the best word) spike size taken. I.e., instead of taking spikes equal to ~100% of your health resulting in a TMI of 2000, it would give a TMI of 100, and instead of a 10% increase in spike damage increasing TMI by 500, it would increase it by 10. That is the only way I can imagine making the TMI results anything other than abstract (I’m pretty sure this is what emruseliavery is also saying above).

    However, I personally think that the “understandability” condition you set for the metric is perhaps a bit too harsh. All metrics require some sort of foreknowledge to understand their meaning. You always need to know what the metric is measuring, and one or more points of reference, otherwise the numbers you get will always be indecipherable. No Exceptions. If you see someone saying that they have a rope that’s 5 meters long, the only reason you have any idea at all what that means is because you already know what “length” is, and you’re already familiar with “meters” as a unit of measuring it. But if you didn’t know these things then that statement would make about as much sense to you as saying that the rope is 83 biblets.

    Even if you look at DPS: if you are someone completely new, who has never seen or heard anything about dps in WoW, and are told that you are doing 50k dps, that information will mean absolutely nothing to you. Even if you surmise (or perhaps know from another game where the term is used) that dps stands for damage per second, you’d still have no idea whether or not 50k was good or bad, or by how much. In fact, you right now have no idea whether 50k is good or bad, because this hypothetical could be about someone who is still leveling, and without *that* knowledge the number is useless even to those who are familiar with the metric of dps.

    I’m not saying that one metric can’t be easier to understand than another or that there’s no value in making it more understandable. I’m just saying that no matter what, nobody is going to be able to just sim a tank and automatically know what the TMI number means without already knowing “something” about TMI in the first place.
