# Theck-Meloree Index Standard Reference Document

Current version: 2.0, updated 4/13/2014

I. Background

The Theck-Meloree Index is a tanking metric that attempts to quantify the “smoothness” of a tank’s damage intake based on spike magnitude and frequency.  Its fundamental assumption is that damage spikes become exponentially more dangerous (and thus more important) as they grow in size.

The first version of the metric, developed in a series of blog posts starting here, was more abstract and similar in concept to a FICO score.  As of version 2.0, it has been revamped to be more tangible and easier to understand. Now, your TMI (in thousands) is roughly equivalent to the size of your largest damage spikes (in percent of health), such that a TMI of 88k means you were taking spikes up to about 88% of your health in size.

TMI incorporates both spike magnitude and frequency. As a result, taking the same spike more often will cause TMI to go up slightly. A TMI of 88k could be a single spike of about 88% of your health or several spikes of about 80% of your health. The key is that it gives you a rough estimate of the large spikes you should be expecting to have to deal with, and does so in a numerically-robust way that produces good stat weight values.

This document establishes several standards regarding the Theck-Meloree Index, including the method of calculation, boss configurations, and ability usage.  Note that the index is sufficiently versatile to handle a great variety of situations outside of what is recommended in this reference.  However, having clearly-defined standards facilitates comparisons between different gear sets, action priority queues, talents, glyphs, and even different tanking classes.

Users that are more interested in what TMI means than they are in the method of calculation are encouraged to read this blog post, which goes into more depth on that subject.

II. Method of Calculation

Assume we are provided with a timeline $D$ representing tank health changes.  This timeline may in the general case include damage and healing events – see section VII for details.  This timeline is a simple array with time bins of width $\Delta t$, such that all events occurring between time $t$ and $t+\Delta t$ are merged into one bin.  Thus, for a string of attacks doing 30, 20, 5, 10, and 50 damage occurring at 1.5-second intervals starting at $t=0.00$, and choosing time bins of $\Delta t = 1$ second, $D$ will have the form

 30 20 0 5 10 0 50
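The binning step can be sketched in a few lines of Python. This is only an illustration: the function name `bin_events` and the `(time, amount)` event format are assumptions made for this example, not part of the standard or of Simulationcraft.

```python
import math

def bin_events(events, dt=1.0):
    """Merge (time, amount) events into bins of width dt seconds.

    An event at time t lands in bin floor(t / dt), so bin i covers
    the half-open interval [i*dt, (i+1)*dt).
    """
    if not events:
        return []
    n_bins = math.floor(max(t for t, _ in events) / dt) + 1
    timeline = [0.0] * n_bins
    for t, amount in events:
        timeline[math.floor(t / dt)] += amount
    return timeline

# The five attacks from the text, 1.5 seconds apart starting at t = 0.
attacks = [(0.0, 30), (1.5, 20), (3.0, 5), (4.5, 10), (6.0, 50)]
D = bin_events(attacks, dt=1.0)
print(D)  # [30.0, 20.0, 0.0, 5.0, 10.0, 0.0, 50.0]
```

The attacks at 1.5 and 4.5 seconds fall into bins 1 and 4, which is why bins 2 and 5 stay empty.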

From $D$, the Theck-Meloree Index is calculated as follows:

1) Normalize $D$ by the player’s maximum health $H$.  If the player’s health is constant, this simply means that we divide every element of $D$ by a constant $H$ equal to their maximum health.  If the player benefits from temporary health-gain effects such as Last Stand, then $H$ will be an array containing their maximum health at each time corresponding to the same element in $D$.  Thus, the normalized damage timeline $\overline{D}$ contains elements:

$$\large \overline{D}_i = D_i / H_i$$

2) Calculate the $T$-second moving average array of damage taken for the entire simulation length $L$ (also expressed in seconds).  This step can then be formally expressed for the $i^{th}$ element of the resulting moving average array as:

$$\large {\rm MA}_i = \sum_{j=1}^{T / \Delta t} \overline{D}_{i+j-1}$$

which produces an array ${\rm MA}$ of length $N=(L - T)/\Delta t$.  It is also acceptable to use an apodized moving average (i.e. by zero-padding the damage array) that produces an array of length $N=L/\Delta t$.

3)  TMI is calculated from the moving average array as follows:

$$\Large {\rm TMI} = 10^4 \ln \left [ \frac{N_0}{N} \sum_{i=1}^N e^{10\times {\rm MA}_i} \right ]$$

where $N_0=450/\Delta t$ is the “default” array size corresponding to a fight length of 450 seconds.

The resulting value ${\rm TMI}$ is the Theck-Meloree Index.
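The three steps can be put together in a short Python sketch. It follows the formulas exactly as written above; the function name `tmi`, its argument names, and the `zero_pad` switch are assumptions made for this example and do not mirror the Simulationcraft implementation. The sketch keeps every full window of the timeline, which may differ by one element from the array length quoted in step 2.

```python
import math

def tmi(D, H, T=6.0, dt=1.0, zero_pad=False):
    """Theck-Meloree Index of a binned health-change timeline.

    D: damage per bin; H: max health (one number, or one value per bin).
    """
    if isinstance(H, (int, float)):
        H = [H] * len(D)
    # Step 1: normalize each bin by the maximum health at that time.
    d = [x / h for x, h in zip(D, H)]
    w = round(T / dt)                  # number of bins in one window
    if zero_pad:
        d = d + [0.0] * (w - 1)        # apodized variant: N = L / dt
    if len(d) < w:
        raise ValueError("timeline is shorter than one window")
    # Step 2: T-second moving window sum (the MA array of the text).
    ma = [sum(d[i:i + w]) for i in range(len(d) - w + 1)]
    # Step 3: exponentially weighted sum, scaled to a 450-second fight.
    n = len(ma)
    n0 = 450.0 / dt
    return 1e4 * math.log(n0 / n * sum(math.exp(10.0 * m) for m in ma))

# Hypothetical usage: the seven-bin example timeline and a tank with 100 health.
print(tmi([30, 20, 0, 5, 10, 0, 50], H=100, T=6, dt=1))
```

Passing a list for `H` handles temporary health changes such as Last Stand, since each bin is divided by the health value at the same index.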

III. Continuous Limit

Since we will generally be computing this index on a computer, it will be uncommon to work in the continuous limit.  Nonetheless, for completeness we include the definition here in case it becomes relevant.

For continuous functions $D(t)$ and $H(t)$ defined over the region $[0, L]$ representing damage taken by the tank and player health at each instant $t$, respectively, the Theck-Meloree Index is calculated as

$${\rm TMI }= 10^4 \ln \left [ \frac{L_0}{L} \int_{0}^{L} e^{10\times {\rm MA}(t)} dt \right ]$$

with the continuous $T$-second moving average damage function

$${\rm MA}(t) = \int_{0}^{L} {\rm rect}(t-t',T) \frac{D(t')}{H(t')} dt'.$$

Here ${\rm rect}(t,T)$ is the usual rectangular function of width $T$, such that

$${\rm rect}(t,T) = \begin{cases} 1 & \text{if } 0 \leq t < T \\ 0 & \text{otherwise} \end{cases}$$

IV. Nomenclature

The acronym “TMI” will generally be used instead of the longer “Theck-Meloree Index” for brevity.

The standard allows for a user-defined window length $T$ used to calculate the moving average function. Unless otherwise specified, the term TMI will properly refer to the metric as calculated using a 6-second moving average (i.e. $T = 6$).

When referring to the metric as calculated using a different $T$, it should be appropriately noted by appending the suffix “-T”.  Thus, TMI-9 would clearly indicate a TMI calculated using a 9-second moving average, while TMI-3 would refer to one calculated using a 3-second moving average.  Consequently, TMI-6 is synonymous with TMI.
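As a small illustration of the naming rule, a reporting tool could label its output like this. The helper `tmi_label` is a hypothetical name invented for this sketch.

```python
def tmi_label(T=6):
    """Name of a TMI computed with a T-second window."""
    # The default 6-second window carries no suffix.
    return "TMI" if T == 6 else f"TMI-{T:g}"

for window in (3, 6, 9):
    print(window, tmi_label(window))
```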

V. Standard Bosses

While many factors affect TMI values, none affects the result as sensitively as the boss with which the metric is calculated.  As such, we provide standardized bosses for each level of content to facilitate fair comparisons between different configurations.

The standard boss definition consists of two actions:

1. A melee attack with a swing timer of 1.5 seconds
2. An instant damage-over-time spell that inflicts periodic magic damage every 2 seconds for 30 seconds.

In Simulationcraft, these two actions can be produced with the following lines of code with X and Y substituted appropriately:

enemy=TMI_Standard_Boss
level=93
role=tank
position=front
actions.precombat=snapshot_stats
actions=auto_attack,damage=X,attack_speed=1.5
actions+=/spell_dot,damage=Y,tick_time=2,num_ticks=15,aoe_tanks=1,if=!ticking
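If you generate profiles with a script, the substitution of X and Y can be automated. The sketch below only does string formatting on the lines given above; the function name `tmi_boss_config` is an assumption made for this example, and the values in the usage line are the T15LFR entries from the table in this section.

```python
def tmi_boss_config(x, y, level=93):
    """Return the standard-boss profile with X and Y filled in."""
    return "\n".join([
        "enemy=TMI_Standard_Boss",
        f"level={level}",
        "role=tank",
        "position=front",
        "actions.precombat=snapshot_stats",
        f"actions=auto_attack,damage={x},attack_speed=1.5",
        f"actions+=/spell_dot,damage={y},tick_time=2,num_ticks=15,aoe_tanks=1,if=!ticking",
    ])

# T15LFR: X = 550000, Y = 27500.
print(tmi_boss_config(550000, 27500))
```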

The table below provides the standardized values of X and Y for a boss approximating each content level:

| Content Level | Raw auto-attack damage (X) | Raw DoT tick damage (Y) |
|---|---|---|
| T15LFR | 550000 | 27500 |
| T15N | 750000 | 37500 |
| T15H | 900000 | 45000 |
| T16N10 | 1250000 | 62500 |
| T16N25 | 1550000 | 77500 |
| T16H10 | 1900000 | 95000 |
| T16H25 | 2300000 | 115000 |
| T17Q | 3000000 | 150000 |

For simplicity, any of these bosses can be invoked in Simcraft with the option tmi_boss. The syntax to invoke the T15H boss would be:

tmi_boss=T15H

Note also that when using the GUI (i.e. Simulationcraft.exe), these bosses can be selected from a drop-down box on the Options tab.

The T15H standard produces full-sized melee hits of approximately 340k damage on a tank after specialization and armor mitigation are accounted for.  This is roughly equal to the melee damage of heroic Lei Shen on the 25-man difficulty setting.  As new standard bosses are added to the table, they will generally continue to approximate the melee damage of the hardest-hitting boss in the tier while including ~7.5% magic damage through the DoT effect.

VI. Player composition

The goal of TMI is to calculate the threat of spike damage to the player.  Historically, this has been assessed by examining the player’s vulnerability to melee attacks in the absence of cooldown assistance.  The rationale behind this is that the player is very unlikely to be at serious risk of death while proactively using cooldowns such as Guardian of Ancient Kings or Divine Protection.

However, there are several cooldowns that could be considered rotational rather than reactive. Holy Avenger is one such cooldown, as it is frequently used on cooldown for a damage boost rather than saved for a particularly dangerous segment of a fight.  In addition, it competes with two other talents that are rotational in nature.

To further complicate matters, it would be very difficult to remove or account for such effects when calculating TMI for a combat log from an actual encounter.

As such, the TMI standard imposes no specific constraints on ability usage or player configuration. In the unlikely event that a particular ability appears to be excessively problematic for some reason, the standard will be updated appropriately to prohibit that ability.

Note, however, that as a player attempting to simulate your character in Simulationcraft, it is often to your advantage not to include non-rotational damage reduction cooldowns (e.g. Shield Wall, Guardian of Ancient Kings) in your simulations. TMI weights the highest spikes far more heavily than everything else, and those spikes are unlikely to occur while you have a 50% damage reduction cooldown running. So while running cooldowns will reduce your average TMI, doing so won’t give you as accurate a picture of how vulnerable your character is during periods not covered by cooldowns. Since most players are interested in optimizing around the dangerous periods rather than the guaranteed-safe periods, we recommend not using any cooldowns while simming that aren’t part of your standard rotation.

VII. Group composition and healing

TMI is very robust, and can be calculated for a player acting alone or for a player accompanied by any number of other players, including healers.  The TMI result, which roughly approximates maximum spike damage taken in any $T$-second period, is a useful and intuitive measurement in any of those situations.

However, most tanks are more interested in what they can personally do to improve their survivability. Simulations that include healers tend to mask the effects of changes made by the tank (for example, while calculating stat weights) since newly-created vulnerabilities can be partially mitigated by excess healer throughput. While this may also provide useful/interesting information, it ultimately makes the results more complicated to interpret correctly.

As such, we define the standard process of calculating TMI to include only effects directly caused by the tank and the boss.

In other words, while one or more healers may be present for the simulation, any healing or absorption effects caused by those healers must be ignored for the calculation of TMI.  The easiest way to do this in Simcraft is to sim the tank alone against the boss (the simulation does not end when the tank drops below zero health; it simply lets their health continue to go negative). For cases where a healer is present in the simulation, enabling the player option tmi_self_only=1 will tell the simulation to ignore external healing in the calculation of TMI and treat any self-overhealing done by the tank as effective healing.

If TMI is being calculated from a combat log, this is more difficult to accomplish. Furthermore, there may be circumstances in which it is interesting to assess a tank’s TMI while including all sources of healing. In such cases, the metric should be reported as “ETMI” or “Effective TMI” to indicate that the calculation is including external sources of healing.
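For a combat log, the difference between the two metrics comes down to which events are allowed into the timeline. The Python sketch below shows one possible way to do that selection. Everything in it is an assumption made for illustration: the `Event` record, its field names, the sign convention (damage positive, healing negative), and the simplification that damage amounts are recorded before absorbs are applied. Heal amounts are taken in full, so self-overhealing counts as effective healing.

```python
from dataclasses import dataclass

@dataclass
class Event:
    time: float    # seconds since the pull
    amount: float  # raw amount, always positive
    kind: str      # "damage", "heal", or "absorb"
    source: str    # who caused the event

def health_changes(events, tank, self_only=True):
    """Signed (time, amount) pairs: damage positive, healing negative.

    self_only=True keeps boss damage plus the tank's own healing and
    absorbs (standard TMI); self_only=False keeps every source (ETMI).
    """
    out = []
    for e in events:
        if e.kind == "damage":
            out.append((e.time, e.amount))
        elif not self_only or e.source == tank:
            out.append((e.time, -e.amount))
    return out

log = [
    Event(0.0, 300_000, "damage", "boss"),
    Event(0.5, 80_000, "heal", "tank"),
    Event(0.8, 120_000, "heal", "healer"),
]
print(health_changes(log, "tank"))                   # input for standard TMI
print(health_changes(log, "tank", self_only=False))  # input for ETMI
```

The resulting pairs can then be binned into the timeline $D$ described in section II.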

While effects such as Heroism/Bloodlust will certainly be present in real encounters, these are generally times of increased healer throughput and active mitigation uptime, both of which reduce the danger of those periods.  Since the goal of TMI is to assess the danger of damage spikes during dangerous periods, we strongly advise the user to perform TMI simulations without the benefit of Heroism, Bloodlust, or any other temporary buff effect that the tank cannot provide for himself or herself.  This includes external mitigation effects such as Hand of Sacrifice and Barkskin.

However, since these sorts of effects would be impossible to ignore (or in some cases, detect) in a combat log, they are not strictly excluded from the standard TMI specification. As one would expect, they are all assumed to be included in an ETMI calculation.

Passive raid buffs, such as Arcane Intellect or Power Word: Fortitude, are likely to be active during the entire encounter, and are thus allowed in the standard TMI calculation (and in fact expected, as Simulationcraft turns these buffs on by default).

VIII. Changelog

4/13/2014: version 2.0

• Instituted new formula for TMI
• Updated TMI standard boss list
• Heavily revised specifications to make combat log calculations viable

8/13/2013: version 1.2

• Health normalization now performed using instantaneous health and applied immediately to the damage timeline (i.e. before the moving average) to better account for effects like Last Stand and similar trinket effects.

8/3/2013: version 1.1

• Removed the restriction on damage reduction cooldowns.  Everything is fair game now.

### 42 Responses to Theck-Meloree Index Standard Reference Document

1. Geodew says:

“As such, the TMI standard allows the use of abilities which have at maximum a two-minute cooldown when used rotationally. This means that the ability may not be restricted by a health conditional (such as “if=health.percent<30”).”

I don't think this plays very nicely with other tank specs (Expel Harm, Death Strike, Rune Tap, Frenzied Regen, etc). It also won't play nicely with checking the usefulness of WoG, since it would be idiotic to cast WoG at full hp. How are you planning on doing the Shift SotR rotation with this limitation? Last boss hit? Is it reasonable to expect all tanks to viably represent their rotations doing that?

• Theck says:

I plan on implementing SH* with a check on last boss hit or damage done in the last 1.5-3 seconds, yeah. One of the problems with using percent health conditionals is that in many cases you need a healer in the sim, which complicates matters.

It will be fairly easy to put an option in to ignore external heals for the TMI calculation, but ignoring external absorbs is much trickier. And without a healer, your health plummets into negatives, so any “if=health.percent

• Geodew says:

Ah, it makes sense to want TMI to be independent of healers, I see what you’re doing now. And there’s no way to model current health percentage well without knowing what kind of heals you might be getting and how distracted your healers are etc etc.

Well it seems that we need to explicitly define what a “cooldown” is, then. Is AMS a cooldown, with a cd of only 45s? Vampiric Blood, with only 1m? Elusive Brew, which is the whole reason Monk tanks like crit, and doesn’t really have a “cooldown” but acquires charges over time from autoattack crits?

Also, are you saying then that a “recent damage taken” conditional is permissible, but not “health percent?” I think that makes sense. The thing is that if tanks have more cooldowns, we need to account for that. And a tank with spikier damage intake probably would need to blow a cooldown more often, which might even out with having more cds, unless you were using them at the wrong time. Thus, I’m just not sure it’s good to use cooldowns on-cooldown (or even on-cd while no other cd buff is up) since it adds another unrealistic element to the sim. You wouldn’t hit Ardent Defender after avoiding three attacks in a row just because it came off cooldown, so why sim it that way when we have a decent alternative? Not sure I get your logic.

• Geodew says:

Woops, the end sounded arrogant. Was supposed to be ‘confused and interested’ for the record. 😛

• Theck says:

Once it’s implemented, yes, I think a “recent damage” conditional is acceptable. You’d need something like that for Death Strike to be effective, and it seems like a reasonable way to approximate longer-cooldown usage.

AMS, VB, and EB are all things that would be fair game – in fact, I’d expect them to be necessary for a DK to keep up based on their play style (lots of smaller cooldowns, but spikier baseline damage patterns). I’m mostly trying to curtail the long-cooldown, high-mitigation spells like GAnK, Shield Wall, IBF, LoH, etc.

The concept was that you’re not going to die when you have one of those cooldowns up anyway, so you wouldn’t want those situations to impact your gearing. You’d want to gear for the worst-case scenario, which is when you don’t have anything up.

Of course, that runs into a problem with Holy Avenger vs. Divine Purpose, for example. You’re basically unkillable during HA, but if we disallow it then a tank specced DP would sim out much better than a tank specced HA. I’m not sure what the solution to that is, apart from disallowing cooldowns above a certain time threshold (or perhaps only allowing cooldowns that come from talents?).

I’m not wedded to the idea of requiring usage on cooldown, mind you. A “damage taken in last 5 seconds” type of conditional would probably be fine, as would mutually exclusive cooldown conditionals (i.e. only use Divine Purpose if Holy Avenger and SotR aren’t active). I’m just trying to minimize the impact of the really strong cooldowns (like HA) that we do have to include.

• Geodew says:

Hmm. Well, there may be a problem with that. Monk tanks don’t have a -50% damage taken cooldown. Fortifying Brew is -20% damage taken, +20% max hp, +20% stagger amount, *glyphed* -25% damage taken, +10% max hp, +20% stagger amount, *both* for 20 secs. Dampen Harm reduces the damage taken from the next 3 sources of damage exceeding 20% of your max HP by -50%, and falls off after 45 seconds if unused. Diffuse Magic reduces magic damage taken by 90% for 6 seconds. Then they have Elusive Brew, which is almost rotational.

Complicating matters further, Dampen Harm and Diffuse Magic are talents in the same tier as a passive that heals them when they use Fort Brew, Purifying Brew, or Elusive Brew (max once per 18 sec).

I don’t know Guardian Druid cooldowns, really. Do they have a shield-wall-like cooldown? Consider this as well.

It may be better to include all cooldowns as a result, if the goal is to have this applicable to any tank. After all, the amount of preparation a tank could do for taking damage later while having the cooldown running will vary from tank-to-tank, and that may affect gearing choices slightly. For example, (going totally just off intuition here) a Prot Paladin can pool HoPo for the end of GoAK. Both Haste and Mastery geared paladins have plenty of time to do so. However, a Haste Prot Paladin would cap HoPo sooner, and 5 HoPo is not as valuable of a state to be in as it would be for a Mastery Prot Paladin. In a sense, your haste isn’t helping as much during and just after a GoAK. Thus, GoAK is a slightly stronger cooldown for Mastery (again, just using intuition). Other tanks have similar class mechanics. DKs can pool runes, Monks can pool chi and let Chi Wave/Expel Harm cool down, Druids can pool rage, etc.

It would also be interesting and useful to see if Fortifying Brew or Dampen Harm are as good or better than a shield wall for taking melee damage spikes.

Perhaps we could sim both without and with major cooldowns?

• Geodew says:

Something to consider doing may be ignoring attack strings where a major cooldown was running for purposes of TMI calculation, but still simming that part of the fight. I don’t think this fixes everything, but it is a thing you could do.

I guess part of the reason I suggest including all cooldowns is because anything else feels a little arbitrary, and ignores the cooldown package a tank comes with. For example, Glyph of Icebound Fortitude should affect TMI, but it wouldn’t if we’re ignoring 50% cds. At the least, it would be useful to compare a DK with and without it.

• Theck says:

I don’t see much point in allowing the cooldown but ignoring attacks during that cooldown. All that does is reduce the amount of data you’re collecting.

I guess it depends on what we’re trying to accomplish. If we just want a relative “which tank takes smoother damage” comparison, you’re probably right that we should allow all cooldowns.

If you want stat weights, then you probably don’t want to use powerful cooldowns in your APL. Because you’re not in danger of dying to melee attacks while GAnK is up, so you don’t really care what your stat weights are while it’s up. You care about which stat makes you most survivable when you’re vulnerable, which is when you don’t have cooldowns active.

The other issue is that the time spent with big cooldowns up tends to reduce your accuracy. The big spikes that cause the majority of your TMI rating are stochastic. But you won’t get those spikes while GAnK is up. So if you have GAnK up for 24 seconds of a 450-second fight, you’ve effectively reduced your integration time by 24 seconds.

• Geodew says:

I’m not sure I understood your last paragraph, unless it was reiterating the first. (And isn’t it not quite stochastic, since the largest damage you can take for a TMI-6 string is 4 completely unmitigated hits? By which I mean mitigated only by Armor.) In any case, I don’t think it would simply reduce sample size. The point of not including the strings during a cooldown but still using them would be to see how your stat weights affect mitigating damage *around* your cooldown usage, like the haste vs. mastery prot paladin ending GoAK with 5 HoPo. In some sense, mastery thus increases the value of GoAK, and I’m not sure that should be TOTALLY ignored, though certainly the stat weights without cooldowns running sounds more valuable. There are probably other tank cooldowns which might change stat weights.

Also, if that weren’t enough of an argument, I was thinking of a potential situation like this:

Hypothetical Gear strategy 1: Lots of small-ish spikes
Hypothetical Gear strategy 2: A few very big spikes, but damage intake normally very low (just pretend this can happen for a moment)

Gear Strategy 1 may end up with lower TMI if Strat 2’s spikes are very high. However, if most of those large spikes can be covered by a cooldown like GoAK since they’re few and far between, that may result in Strat 2’s TMI being lower post-cooldown, since we’re not looking at attack strings that occur while a cooldown is running.

Aside: If you do end up doing this, judging where Holy Avenger “ends” sounds a bit tricky, since according to my understanding, you build up a large chunk of SotR uptime during the cooldown’s duration. Something to consider.

————-

I agree with everything else you said, though. I think a rule of thumb may be difficult to construct. What about the two-number metric I proposed? One calculation with and one without major cooldowns?

The “with” would allow us to compare talents like Holy Avenger vs. Divine Purpose and compare e.g. BiS Blood vs. BiS Guardian. This would be to give an idea of how smooth your damage actually is once you’re using your full arsenal.

The “without” would permit the arbitrary definition of “major cooldowns” to be determined on a per-spec basis, and stat weights would be constructed from this number. Whether we use the “ignore time cooldowns are up” or “don’t use cooldowns at all” method, they should be mostly accurate. We could even ignore strong talents like Holy Avenger for this part without worrying about giving the impression that Divine Purpose is better just because HA is never used with this model, but is included in the “with” calculations.

The obvious drawback is that this requires double simulation time since this uses two different priority lists, unless you use the “ignore time cooldowns are up” method, in which case you could sim once and then do the math twice. A related drawback is that this requires potentially two action priority lists, or at the least a flag for action lines that says “ignore this line for without-cooldown-TMI calculation.”

• Theck says:

There are certainly small edge effects (like ending GAnK with 5 HP), but they’re exactly that: small. In practice, major cooldowns just eliminate that duration from your useful data.

To put it another way: because TMI is exponential, it is heavily dominated by the top 5-10 events in an iteration. So what you’ll find is that if an event is large but rare, it will only show up in some iterations and not others. That’s why the TMI breakdown plot tends to have a fairly large range. There are a few iterations that permit really big spikes, and those significantly bring up the average TMI.

Now consider what happens if you cover 20% of your iteration with major cooldowns. During that 20%, those spikes essentially can’t happen, because they’re blunted enough that they don’t generate a large TMI value. Thus, if you have a few rare (but possible) events, there’s a greater likelihood that they’re invisible to TMI, which means fewer iterations that have very large spikes.

This is, essentially, a loss of integration time if you’re concerned with getting a reliable value. If you take multiple sets of data with 100 iterations each, you’ll see a larger variance in the reported TMI if you allow cooldowns than if you don’t.

As far as having two standards, I don’t know that there’s a lot of benefit to be had. Remember that I’m not specifying the “only true way” to calculate TMI, just offering a standard so that it’s easier to compare values. It’s fine to tailor your action priority list to generate a TMI value for a specific situation.

E.g. I’m defining inches, but measuring in feet or yards still gives an accurate length measure. It’s just trickier to compare to measurements made in a different unit.

I suspect the easiest standard will be to just allow everything, so that players can try to push the TMI as low as possible. But I’ll probably have to write a blog post to point out why this isn’t the ideal case for generating stat weights, and demonstrate how to do that appropriately.

• Geodew says:

Ah, I thought you *were* trying to specify an explicit “true way,” so that a Blood DK’s TMI of 2500 means he could definitively say he takes slightly smoother damage than a Guardian Druid’s TMI of 2700. Otherwise, you could only say their damage smoothness is “kind of similar.” If you’re okay with the specs’ TMI being slightly different, that’s fine; it’s a very useful tool regardless, but having a tool which can be used as a more reliable factor for Blizz to do their tank balancing, besides just DTPS and player feedback, would be amazing for the game’s health, I think! Not to mention really awesome. (No, I don’t work for Blizz. 😛 I kind of heavily suspect they read your blog, though. Haha.) That’s just why I was pushing for a more explicit game-wide universal standard.

Having said that, you’re probably right that “allow all cooldowns” is not useful for stat weight generation. I see what you’re getting at about those rarest events. If it occurs 10 times in a 10k minute simulation without cooldowns, then it can occur *when you have no cooldowns left* by happenstance, like the “Ha” gear set missing 8 HoPo generators in a row and then taking a string of 4 unmitigated attacks, not any covered by SotR or even dodged, parried, or blocked. Thus, gearing should definitely take this into account. Especially if we’re popping cooldowns in anticipation of a large spike we’re afraid will faceplant us in the action list, you lose a lot of your spike data and would have to sim for a MUCH longer time to see that spike happen when you don’t have any cds left, not just 20% longer, but maybe 100% or 200% longer, and I could be low-balling it.

That also means that simming with cooldowns might require a longer simulation time to generate a sufficiently statistically precise TMI value, and it may be too long to be practical. That’s what you’re getting at, I guess?

If you allow cds in the priority list, perhaps the simmer could run in “True TMI” or “TMI Stat Weight Generation” modes to avoid generating two numbers every time and keep simulation time down for Average SimC User? 😛

2. aggixx says:

In section 5 you’re missing a slash before the action “spell_dot” and a new line before “level=93”.

• Theck says:

Thanks, fixed.

3. Does the continuous TMI expression have the wrong factor, 1000 instead of the 10000 in the discrete form?

• Theck says:

Yup, typo. Fixing it now.

• Have you considered the TMI of a scenario in which a tank is not taking any damage?

In that case his TMI will not be zero, since the sum of exp() terms will never be zero unless you are healing at an infinite HPS. In terms of smoothness, a positive TMI may imply some “un-smoothness” to people. Is there any plan to bias it or limit it so that tanks with small or zero DTPS are better assessed?

• Theck says:

Not especially. A tank taking exactly 0 damage (and 0 self-healing) would have a TMI of $\frac{10000 \times 6^2}{3^{10}} = 6.1$. I could obviously normalize this to be 0 several different ways (subtract 6.1 from the result, for example, or round the result of $e^{-10 \log 3 }$ to 4 or fewer digits of precision), but I’m not sure it’s worth bothering for a case that’s essentially trivial. If you’re not taking damage, your TMI is irrelevant.

• Impressive.

I’ve posted a translation of this document to NGA, one of the most active WoW forums in China (http://bbs.ngacn.cc/read.php?tid=6422084). Since theorycrafting is not an attractive topic for most people, discussion on this thread is not hot. If any important point comes up in the thread, I’ll try to let you know.

• Lucas says:

I have just read through all of your postings regarding TMI. It is an interesting solution to a common question in WoW. To give you my background, I have a degree in physics with a PhD in statistics. You seem to be reinventing the wheel just a little bit here. You are doing a lot of work to measure the tail end of your probability density function. Fortunately this is a really common issue in applied statistics, so we have an elegant solution for it. A Gamma distribution’s scale and shape parameters can give you a perfectly elegant solution to the question. This is also a much more tested measure of “tail size”.

• Theck says:

Unfortunately, the Gamma Distribution doesn’t quite cut it for us. First, it relies on the assumption that the Gamma Distribution p.d.f. can accurately represent all possible TMI distributions, which is patently false. I can, with some work, design an experiment that generates a “double-humped” TMI distribution that has more than one local maximum. The Gamma Distribution would be a poor estimation to that TMI distribution, and thus give inaccurate results.

Second, it doesn’t include the proper exponential weighting function. We could do that after-the-fact, of course – i.e. generate our Gamma Function estimate to the real TMI p.d.f., and then perform the weighted average on that (much like the histogram version of the metric, if you read back through the three explanatory posts from July). But again, in that situation the Gamma Distribution is just an approximation to the true TMI p.d.f, so it’s not clear to me that we gain anything from adding that layer of complexity.

We’d inevitably have to determine the scale and shape parameters of the Gamma Distribution empirically based on the simulation results, which essentially means performing a full analysis of the “true” generated TMI sample distribution. At that point, it’s just faster and more accurate to use the generated sample distribution to do our calculations.

Those two concerns together raise a more problematic concern: Let’s say we use the Gamma distribution approximation by performing some analysis on a sample TMI distribution. If there is a stray event at very high TMI, which may be a rare (but obviously possible) event, the Gamma estimation/fitting process will inevitably suppress that data point. Then when the exponential weighting is performed, that data point won’t contribute much to the overall TMI result, which is bad – we *want* those sorts of events to dominate the result, because those outliers are very dangerous. So you could quite easily come up with a situation where any sort of Gamma distribution fit reduces the impact of the very events we care about most.

I did some reading about tail distribution analysis while coming up with this metric, and none of the existing methods I found really suited the application. Most methods were expressly trying to estimate the tail of a known distribution that was reasonably Gaussian in nature, and treated outliers like “errors.” We want to do the exact opposite, because those outliers are the dangerous spikes we’re trying to measure. Since my Ph.D. is in Physics, I asked a few statisticians I know (including my brother, who is an actuary), but none of them could think of a method that did exactly that.

I’m sure one probably exists somewhere, but it seemed like less work to simply develop the metric I actually wanted than to search in vain for a method that may or may not exist and may or may not exactly match the sort of analysis I’m trying to perform.

• Lucas says:

Fair enough. I didn’t do any real analysis with data. I just looked at your PDF and it looked like such a beautiful gamma function. High theta means high variance, poor “smoothness”, and more spikes. Low theta means nice constant smooth damage. But if you can generate other distributions of 6 sec damage intake, that would be an insurmountable issue. Regardless, your concept of looking at the 6 second health delta PDF is really spot on for what’s at the heart of tanking.

4. Lucas says:

I did some parametric modeling for some of your summary data available in your blog (the 0.1 buckets from 0.5 to 1.4). Table 2 from “Making of a Statistic:part 1.”
The variances for the sets were as follows…
Control Haste: 0.011134
+1000stam: 0.01008
-1000Exp: 0.012391
+1000Haste: 0.010449
+1000Dodge: 0.010904
I didn’t analyze the mastery or hit or parry sets.
Exp>>stam>haste>dodge
The parametric results are consistent with your non-parametric TMI. The goodness of fit tests are also good. What I really need is more raw data. Unfortunately I only have your summarized data and I am not familiar with how to generate my own 6 sec health delta distribution. Could you link raw data somehow? Or is there a way to create it from simc?

• Theck says:

You can generate that output in simc by using the tmi_output option:

# Dump TMI debug output to tmi_debug_file.csv
armory=us,illidan,john
tmi_output=tmi_debug_file.csv

http://code.google.com/p/simulationcraft/wiki/Characters#Optional

This will only generate one iteration worth of data though, so to get a lot of data you’ll either need to write a batch file to do this thousands of times to different CSV files, or run a single iteration that’s many, many minutes long (probably easiest, and closest to how I performed the MATLAB sims).

The output format should be pretty self-explanatory (it just steps through the calculation from column to column more or less, mostly for debugging purposes). If I recall correctly, the first two columns are healing intake (negative numbers) and damage intake, the third column is the sum of those two, and then the next few columns are moving averages (one is normalized to player health, the other isn’t) and finally the weighted value.

• Lucas says:

Thanks for your help. Unfortunately the debug files seem a little screwy. Average damage and healing magnitude per record is 10E7. And the normalized health deltas average ~6%. I may have to dust off my Matlab skills from my physics days.

But from your papers (which are well written and easy to follow) I can see that the data actually do follow an exponential family pdf (looks like normal is the best fit). In section 3 you state that your PDFs at lower N don’t follow a normal pdf. However that is simply due to sample size (and overheals inflating the zero at low N). The phenomenon you see with the 4/N having lower TMI at large N is because large values of N have lower variance (the bigger the time frame, the less spiky the damage). This is in fact near indisputable evidence that the distribution is Gaussian and we just have sample size issues at lower N that prevent us from seeing it visually. But I would bet that R would classify it as a pretty stable normal distribution.

Lastly just from your equations and methodology I can see that TMI is ultimately just a non-parametric measure that is exponentially proportional to variance.

I would lean towards using the variance of a best fit Gaussian distribution. And we could use some simple stats tricks to account for the inflated zero.

And thanks for being so responsive. Reviewing your paper has been fun. Usually my stats are related to clinical trials. And I feel that this is much more important. XD

• Theck says:

I disagree with you about the PDF at lower N – it’s not a sample size issue. Consider the limiting case of N=1 (or T=1.5) for a boss that only melees with a 1.5-second swing timer. In other words, you’re considering the distribution of single boss attacks. Ignoring boss damage variance (i.e. every hit is X, every block is 0.7*X, every SotR-mitigated attack is ~0.5*X, SotR+block is ~0.5*0.7*X, etc.), you’d get a discrete distribution with a delta function at each allowable damage value. Boss damage variance would essentially convolve the nonzero part of that distribution with a rect() function, so you’d get narrow rects centered on those discrete values (except for the avoidance peak at 0, which would still be a delta).

Adding absorbs, incidental healing, and other effects certainly blurs the distribution, but it does not in any way make it Gaussian. So it’s very clear that a 1-attack moving average doesn’t look remotely Gaussian, and would therefore be very poorly served by approximating it as such.
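As a rough illustration of that limiting case, the sketch below enumerates the outcomes of a single boss swing with no damage variance. The avoidance, block, and SotR probabilities are made-up placeholders, and the 0.7 and 0.5 multipliers are just the approximate values quoted above; damage is expressed in integer percent of an unmitigated hit X.

```python
from collections import defaultdict

def single_attack_pmf(p_avoid, p_block, p_sotr, hit=100):
    """PMF of one boss swing; damage is in percent of an unmitigated hit."""
    pmf = defaultdict(float)
    pmf[0] += p_avoid  # dodged/parried/missed attacks land in the peak at zero
    for blocked, pb in ((True, p_block), (False, 1 - p_block)):
        for sotr, ps in ((True, p_sotr), (False, 1 - p_sotr)):
            dmg = hit
            if blocked:
                dmg = dmg * 70 // 100  # block: roughly 0.7 * X
            if sotr:
                dmg = dmg * 50 // 100  # SotR: roughly 0.5 * X
            pmf[dmg] += (1 - p_avoid) * pb * ps
    return dict(pmf)

# Hypothetical probabilities, chosen only to make the example concrete.
pmf = single_attack_pmf(p_avoid=0.30, p_block=0.40, p_sotr=0.50)
for dmg in sorted(pmf):
    print(f"{dmg:>3}% of X: probability {pmf[dmg]:.2f}")
```

The result is a handful of isolated spikes (0, 0.35X, 0.5X, 0.7X, and X) rather than anything resembling a bell curve.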

It will become *more* Gaussian-like as the number of attacks we consider increases, but still not strictly Gaussian because of mechanics. Hit, expertise, and haste truncate the moving average distribution by suppressing large events, essentially giving you the product of a Gaussian and a Heaviside function. For evidence of that, look no further than the figures in these blog posts from the MATLAB era of sims:
http://www.sacredduty.net/2012/10/05/damage-smoothing-follow-up/
http://www.sacredduty.net/2012/12/18/damage-smoothing-for-paladins-round-three/
http://www.sacredduty.net/2012/10/02/damage-smoothing-expertise-mastery-and-haste/
(In fact, that’s why we prefer those stats so much – because they force the moving average distribution to deviate from a Gaussian by eliminating the long tail at high damage values)

That same truncation carries over into the weighted histograms. Again, consider the last 6-panel figure of part 3, which shows the weighted histogram for N=2 through N=7:
http://www.sacredduty.net/2013/07/03/the-making-of-a-metric-part-3/
http://www.sacredduty.net/wp-content/uploads/2013/06/htn_weighted_2to7.png

It’s pretty clear that the first two subplots will never be Gaussian no matter how many iterations you throw at them. Not only are we still seeing discrete peaks with only a little absorb/heal-induced jitter, but there’s a hard cap on allowable x-values. The boss literally cannot hit the player for more than x=1.85, or 185% of the player’s health, within two attacks. So again, we’ll have a Gaussian with a hard truncation at around 1.85, and in this case the cutoff sits at a significant enough point along the distribution that we can’t ignore it (it’s almost truncating at the upper half-maximum point).

The last few figures look better, but still distinctly non-Gaussian. The exponential weighting creates a second peak towards the upper end of the distribution, as evidenced by the yellow line in the N=4 plot and the final red/blue lines in the N=5 plot. Similar peaks exist in the N=6 and N=7 plots, though they’re considerably smaller. The underlying moving average distribution we’re using to generate those plots is almost Gaussian though, because the hard-truncation has been pushed farther up along the tail. The same cannot be said of the N=2 and N=3 moving average PDFs though; even with more samples, we’d see that sharp edge at the maximum possible attack size at a significant point along the distribution.

In short, TMI is essentially the sum of the exponentiation of a truncated Gaussian random variable. What I think you’re suggesting is that we can approximate it by using the exponentiation of a non-truncated GRV, and/or take that one step further and approximate a collection of TMI results as a Gamma distribution since we’re approximating it as a standard GRV-based experiment. But that only works in certain limits due to the truncation, and those differences become significant in extremes that aren’t hard to realize in practice.

I *am* interested in the basic idea of trying to fit a sample TMI distribution (i.e. as generated by SimC) to a Gamma distribution, because it might be interesting to see if the shape and scale parameters give us more information about survivability than just the simple average of that distribution we’re calculating now. But I’m not convinced that it will be more accurate, simply because it’s making too many assumptions about the underlying data.

Example: Two distributions of almost identical shape and scale parameters may differ by a single event in the extremities (i.e. a non-hit-capped set allows an extreme value where a hit-capped set does not). That’s a difference we would definitely want to preserve, because that extreme value is important – it’s why we hit cap in the first place.

• Lucas says:

And I was not suggesting the weighted pdf is normal. I agree that is def not the case. I was suggesting that the contribution from the high values and outliers into the variance makes the weighting function unnecessary. e.g. Exp set has >10% more variance than your control haste set (using the histogram u provided in your blog).

5. Lucas says:

I see what you are saying. I was looking at it through the eyes of Cramer’s Theorem, which would suggest that the smaller windows are in fact normal if the 7 hit PDF is normal. Other things may be going on that violate independence like absorbs. And the zeros may in fact make even the N=7 be better suited to gamma than normal.

As for the outliers, I don’t know how big of a problem they will cause since they will increase the variance. And vice versa, increased variance increases the chance of outliers.

Ultimately it comes down to me taking the time to do accurate goodness of fit tests on lots of data.

• Theck says:

I don’t think Cramer’s Theorem applies here, at least not in the way you meant. We know for a fact that X and Y independently are not normal random variables (e.g. the 1-attack histogram is very clearly discrete or near-discrete and non-Gaussian). You’re inferring that because X+Y looks like a normal random variable, X and Y are normal random variables. But X+Y isn’t normal, strictly speaking, so that assertion doesn’t hold up.

In fact, the reverse is true: Since X and Y are non-normal, Cramer’s Theorem *guarantees* that X+Y is not normal.
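For reference, the theorem and the contrapositive used here can be written out as follows. The notation is an illustrative choice: $X \perp Y$ means the two variables are independent, and $\mathcal{N}$ stands for the family of normal distributions.

$$X \perp Y \ \text{ and } \ X + Y \sim \mathcal{N} \;\Longrightarrow\; X \sim \mathcal{N} \ \text{ and } \ Y \sim \mathcal{N}$$

$$X \perp Y \ \text{ and } \ \left( X \not\sim \mathcal{N} \ \text{ or } \ Y \not\sim \mathcal{N} \right) \;\Longrightarrow\; X + Y \not\sim \mathcal{N}$$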

The Central Limit Theorem is more applicable here. X and Y are independent random variables with well-defined mean and variance, so the sum of many iterations should be approximately normal. That’s exactly what we see as we move to the 5-, 6-, and 7-attack moving averages. Though again, it’s only approximately normal thanks to the truncation effects.
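A quick way to see both effects is to convolve a single-attack distribution with itself. The sketch below is only an illustration under stated assumptions: the single-attack probabilities are invented, attacks are treated as independent, and it looks at the plain sum of n attacks rather than the full moving-average machinery. It tracks the excess kurtosis, which is 0 for a Gaussian and shrinks as 1/n for sums of independent, identically distributed variables, and the largest reachable total, which is the hard cap.

```python
from collections import defaultdict

# Hypothetical single-attack PMF: damage (percent of a full hit) -> probability.
ONE_ATTACK = {0: 0.30, 35: 0.14, 50: 0.21, 70: 0.14, 100: 0.21}

def convolve(p, q):
    """PMF of the sum of two independent discrete random variables."""
    out = defaultdict(float)
    for x, px in p.items():
        for y, qy in q.items():
            out[x + y] += px * qy
    return dict(out)

def sum_of_attacks(pmf, n):
    """PMF of the total damage from n independent attacks."""
    total = {0: 1.0}
    for _ in range(n):
        total = convolve(total, pmf)
    return total

def excess_kurtosis(pmf):
    """Fourth standardized central moment minus 3 (zero for a Gaussian)."""
    mean = sum(x * p for x, p in pmf.items())
    var = sum((x - mean) ** 2 * p for x, p in pmf.items())
    m4 = sum((x - mean) ** 4 * p for x, p in pmf.items())
    return m4 / var ** 2 - 3.0

for n in (1, 2, 4, 7):
    dist = sum_of_attacks(ONE_ATTACK, n)
    print(f"n={n}: excess kurtosis={excess_kurtosis(dist):+.3f}, hard cap={max(dist)}")
```

The kurtosis drifts toward the Gaussian value as n grows, but the support always ends abruptly at n full hits, which is the truncation a true Gaussian does not have.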

6. Lucas says:

Yes. I was eyeballing it. And if X+Y (N=7: unweighted) is not normal then N={1,2,…,n} are not normal.

However, after I read through your older blog posts and responses, I think that you agree with me but I am just not explaining myself correctly and I am missing part of what you are trying to get at with the TMI. I have been saying that the variance of the unweighted pdf might be a better measure of spikiness than the TMI when in fact that is not the case. This is where we are getting mixed up. In “Damage Smoothing: Expertise, Mastery, and Haste” you state:
“It’s interesting that set #5 (mastery) does a much better job than set #6 (haste). It gives us lower overall damage intake and roughly the same standard deviation, but significantly fewer of the “dangerous” spikes in the 80%+ and 90%+ ranges. It’s about twice as effective at getting rid of those dangerous events than the haste set.”

As a statistician, I see those two sets as equally spiky. E.g., to me, someone who takes 4 unmitigated hits every time is smooth as silk. To you, that is EXTREMELY spiky, which makes sense. TMI tries to quantify the risk of the rarer/big events. I am looking at the size of the tail and you are looking at the size and location of the tail. In which case Upper 95% confidence interval might also be a useful metric which takes into account both the mean and the variance [Mean + 1.96*Standard Deviation]. But that again assumes that there is a well defined mean and variance of your unweighted pdf.
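A small sketch of the difference between the two viewpoints, using invented numbers: one tank takes the same large hit in every window, the other takes smaller but varying damage. The function name `upper_bound` and the data are hypothetical. Note that 1.96 is the multiplier for the 97.5th percentile of a normal distribution, so the bound leans on the normality assumption discussed above.

```python
import statistics

def upper_bound(samples, z=1.96):
    """Mean + z * standard deviation of the damage windows."""
    return statistics.mean(samples) + z * statistics.pstdev(samples)

# Hypothetical damage per window, as a fraction of player health.
steady = [0.9] * 100        # identical large hits: zero variance
varied = [0.2, 0.4] * 50    # smaller hits that vary from window to window

for name, data in (("steady", steady), ("varied", varied)):
    print(name, "variance:", statistics.pvariance(data), "upper bound:", upper_bound(data))
```

Variance alone ranks the steady tank as smoother, while the mean-plus-deviation bound ranks that tank as the more dangerous case because it also accounts for where the distribution sits.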

7. Pingback: Crowdsourcing TMI | Sacred Duty

8. Anon says:

You’re missing a logarithm after taking $t$ to zero.

• Anon says:

Did you mean $\Delta t$?

• Anon says:

No, I meant $\Delta t$

• Theck says:

You’re right, there’s an $\ln$ missing in the continuous form.

9. Pingback: Tank Metrics | Ask Mr. Robot's Blog