Is Affliction too strong?
My answer: Maybe.
If you ask our Mage friends, or our Shadow Priest friends, the answer is a resounding “Yes!” Even from unbiased sources though, Affliction damage might be overall too high:
Pretty happy with PvE overall. Arcane will be fine without Scorch spam. Aff may be too high. UH, Sub, Ele, Arms may be low.
– @GhostCrawler on Twitter
(However, it might very well be that the running joke about Ghostcrawler playing a Mage is true.)
I want to note, though, that there’s more information to be gleaned from that tweet than is apparent at first glance. A closer look at which specs Ghostcrawler thinks have an issue reveals a lot about what Blizzard’s design philosophy might be.
How Can We Tell? Or Raidbots – Understanding Real World Results
Whenever discussions like these come up, there are usually 2 metrics brought under scrutiny: Simulation results from SimulationCraft, the latest of which looks like this:
and RaidBots’ DPSBot, which is generated by running common statistical measures over all the public parses submitted to World of Logs. The default metric used on RaidBots is ‘Spec Score’ which performs a little black magic to reduce the effects of so-called ‘gimmick’ fights in order to correctly assess real world strength:
Both of these measures have their strengths and weaknesses. A simulation is just a simulation; it doesn’t hold a candle to how each spec performs in the real world. DPSBot itself measures the real world, but we don’t know what goes into the tweaking for Spec Score. As we’ll see though, while we might be inclined to check out every individual fight on DPSBot and consider that gospel for spec strength, there are a myriad of other issues. Instead of walking through RaidBots numbers that anyone can look at just by going to the site, I’ll walk through some of the issues and caveats you should keep in mind while using RaidBots.
The Issues with RaidBots
While Spec Score might seem like a black box, Seriallos, the creator, actually has a breakdown of how it works here. Even so, there are obvious problems in using it to determine spec strength. The first such issue is that of role. On Will of the Emperor, melee have very little uptime on adds (other than Courages), so for them it is a single-target fight. Ranged, however, get the benefit of multi-dotting and AoE-ing Rages. On the other hand, melee get to do the Opportunistic Strike dance, which gives them a huge DPS boost. And beyond the melee/ranged disparity, the common tactics for most guilds in 25H are to have Mages use Ring of Frost to CC the Rages into groups, to have Hunters soak Titan Sparks, or to have Death Knights tank Strengths. Compared to a Rogue or an Elemental Shaman, these classes have completely different roles, yet all of their numbers are rolled into Spec Score.
The second issue is that of weighting. Here’s a quick example: Fire is Spec Score 100 on Amber-Shaper Un’sok, while Shadow is Spec Score 60. Even if Shadow is 5% better on 8 other fights, the specs will seem to be equal – but would we actually consider them equal in that case?
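The arithmetic behind that example can be sketched quickly. This assumes the per-fight Spec Scores are combined with a plain unweighted mean and that "5% better" means five Spec Score points; the baseline of 80 on the other eight fights is a made-up number, and none of this is confirmed to match how RaidBots actually aggregates.

```python
from statistics import mean

# Hypothetical Spec Scores across 9 fights (illustrative numbers only).
BASELINE = 80  # assumed Fire score on the 8 fights other than Un'sok

fire = [100] + [BASELINE] * 8        # Fire dominates Amber-Shaper Un'sok
shadow = [60] + [BASELINE + 5] * 8   # Shadow is 5 points better everywhere else

fire_avg = mean(fire)
shadow_avg = mean(shadow)

# Count the fights on which Shadow actually scores higher than Fire.
fights_won_by_shadow = sum(s > f for s, f in zip(shadow, fire))

print(f"Fire average:   {fire_avg:.2f}")
print(f"Shadow average: {shadow_avg:.2f}")
print(f"Fights where Shadow is ahead: {fights_won_by_shadow} of {len(fire)}")
```

Under those assumptions the two averages come out identical even though Shadow is ahead on eight of nine fights, which is exactly the kind of result a single aggregate number hides.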
On the side, we can also select ‘Overall DPS’, which seems to be an average over all the fights without any sort of weighting. With this metric, Affliction is only 10% ahead of Beast Mastery. While using this seems to be more within the realm of credibility, the fact is:
We should use neither.
The issues with Overall DPS are even more clear: because it does not normalize on a per fight basis, fights like Wind Lord Mel’jarak 25H:
completely skew the results.
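Here is a rough sketch of what per-fight normalization changes. All DPS figures are invented (in thousands), the third fight stands in for an AoE-heavy encounter with inflated numbers, and dividing by the fight's mean is just one possible way to normalize, not necessarily what RaidBots does.

```python
from statistics import mean

# Hypothetical DPS (thousands) for two specs over three fights.
# The third fight plays the role of a cleave fight where everyone's numbers balloon.
dps = {
    "spec_a": [100, 100, 400],
    "spec_b": [110, 110, 360],
}

def raw_average(spec, table):
    """Plain average of a spec's DPS, with no per-fight adjustment."""
    return mean(table[spec])

def normalized_average(spec, table):
    """Average of a spec's DPS after dividing each fight by that fight's mean."""
    ratios = []
    for i, value in enumerate(table[spec]):
        fight_mean = mean([values[i] for values in table.values()])
        ratios.append(value / fight_mean)
    return mean(ratios)

raw_a, raw_b = raw_average("spec_a", dps), raw_average("spec_b", dps)
norm_a, norm_b = normalized_average("spec_a", dps), normalized_average("spec_b", dps)

print(f"raw:        A={raw_a:.1f}  B={raw_b:.1f}")
print(f"normalized: A={norm_a:.3f}  B={norm_b:.3f}")
```

With these made-up numbers the raw average puts spec A ahead purely because of the one inflated fight, while the normalized average puts spec B ahead.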
Finally, because of an issue with World of Logs, Heroic parses for Terrace of the Endless Spring do not show up.
Fight-by-Fight – What do we have to look out for?
So if we don’t look at the fights in aggregate, the alternative is to look at each fight on its own and try to draw our conclusions from that. Maybe Affliction is at or near the top on every fight. In that case, we might argue that it is overpowered. Even when looking at each individual fight, though, we have to keep certain things in mind.
The first is that of sample size. This is an argument for comparing only the dominant spec for each fight. There has been a persistent complaint ever since Patch 5.1, when a change to Kil’jaeden’s Cunning made it so that it was no longer a DPS loss to move, but simply a survivability loss (from the snare). Affliction’s attackers would say ‘look, Demonology is supposed to be the movement spec, yet it still lags behind Affliction on movement fights like Vizier and Blade Lord!’
Here’s the factor that everyone else is ignoring:
There’s a player skill difference represented there. Almost all Warlocks are playing Affliction and, min-maxers that raiders tend to be, a disproportionate number of the most skilled raiders are playing Affliction as well. In addition, using DPSBot without looking at the sample size leads to ridiculous comparisons. When I look at the 100th ranked Demo lock, he is actually around the 55th percentile for that spec. When I look at the 100th ranked Affliction lock, he is in the 95th percentile.
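To make the rank-versus-percentile point concrete, here is a small sketch. The function name and the sample sizes (2000 Affliction parses, 222 Demonology parses) are hypothetical, chosen only because they reproduce the percentiles quoted above; the real parse counts are not given here.

```python
def percentile_of_rank(rank, total_parses):
    """Percent of parses that a given rank beats (rank 1 is the best parse)."""
    if not 1 <= rank <= total_parses:
        raise ValueError("rank must be between 1 and total_parses")
    return 100 * (total_parses - rank) / total_parses

# Hypothetical sample sizes, picked to match the percentiles in the text.
affliction_pct = percentile_of_rank(100, 2000)
demo_pct = percentile_of_rank(100, 222)

print(f"100th ranked Affliction parse: {affliction_pct:.0f}th percentile")
print(f"100th ranked Demonology parse: {demo_pct:.0f}th percentile")
```

The same rank of 100 lands in very different places depending on how many parses the spec has, so comparing "rank 100 versus rank 100" across specs is really comparing an elite player to an average one.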
Any comparison that has such a disparate sample size should be thrown out – you might look at the top ranked parse from each spec, but even then it’s impossible to say whether someone that’s the #1 Destruction parse would be the #1 Affliction parse or the #100 Affliction parse.
The second is that of role. The Will of the Emperor example covered a bit of it in detail, but there are a good number of fights where roles are simply so different that comparing Melee to Ranged or DKs to Rogues is almost meaningless. Because of a lack of in-fight utility, Warlocks in general have the least challenging responsibilities. Warlocks might be asked to CC using a spear on Wind Lord Mel’jarak, but not before a Mage is already on Polymorph duty. Likewise, Dark Bargain is an excellent soaking tool on Heroic Elegon, but it’s not as strong as a Mage’s Greater Invisibility. Our hard CCs are mostly Fear-based, so not ideal for keeping targets immobile. Because of this lack of active utility, we might expect Warlocks to outperform slightly in real world situations.
How do we account for this? There are fights where utility doesn’t factor in. Feng the Accursed is a fairly straightforward fight unless you’re called upon to soak Lightning Fists. On Garalon Heroic, pretty much every ranged DPS is needed for Pheromones. Both Vizier and Blade Lord are gimmick-less fights.
Mostly though, except for the most egregious cases, we don’t account for this. We can make the assumption that the best parsing members of each spec are those that were allowed to scumbag DPS their way to the top of the meters, no responsibility holding them back. It is something to keep in mind though, if someone ever says ‘look how bad Frost DKs are on Will of the Emperor Heroic!’
The third thing to look out for is the hybrid tax – the idea that hybrids sacrifice throughput for the ability to offer up utility in a variety of aspects. Officially, it may or may not be retired. In reality, it lives on – although it might be exacerbated by the fact that PvP balance for hybrids is in shambles at the moment.
It looks something like this:
At the bottom, you see all the caster/heal off specs (Balance, Elemental, Shadow) languishing. The worst ‘dominant’ pure spec is Assassination, at 127.8k, while Affliction and Arcane sit a cut above. There are a few ways to interpret this data:
- Arcane and Affliction are too strong! 14% stronger than Shadow and 19% stronger than Elemental!
- Shadow, Balance and Elemental are too weak. There’s no reason to bring these specs!
- Arcane is less than 8% stronger than any other dominant pure DPS spec – that’s pretty good balance!
I’ve read comments arguing both 1 and 2 on Warlock forums, often both at the same time. “Nerf Affliction, at 20% worse than Affliction, there’s no reason to bring Shadow!”
Is that so?
The fact of the matter is, there are plenty of reasons to bring a Shadow Priest despite the gap in throughput. Every class brings utility, but hybrid utility stacks (the more Hymns, the merrier) while pure utility does not (we don’t have 6 Healthstone charges). Many, many guilds this tier have found reason to sacrifice throughput for this utility. Perhaps the most prominent examples can be found in both our World First guilds: Paragon brought 2 Balance Druids and 1 Warlock, while Method brought 3 Balance Druids and 2 Warlocks to their Heroic Sha of Fear kills.
Is the current state of hybrid vs. pure DPS ideal? There are arguments to be made for both sides. But whether or not the intent behind the hybrid tax exists, the fact is that it appears to, and as such, you cannot compare hybrid heal/caster specs to pure DPS specs.
Finally, there are the fights that are classified as gimmicks. We can just run down the list here: Stone Guard is a gimmick fight, Gara’jal is a gimmick fight, as is Elegon. Will of the Emperor is a fight with widely disparate roles. Garalon is a fight with melee gimmicks, Wind Lord is a fight that puts up cartoonish numbers, Amber-Shaper is a gimmick fight. Protectors of the Endless has a lot of potential for padding. Sha of Fear is all over the place, depending on strat.
This isn’t to say that we should ignore all these fights. Affliction certainly is the best Elegon spec. It’s still important to apply critical thinking to see if these inflated gimmick numbers would translate to an actual problem: class stacking.
With all these issues in mind, anyone can look through and see how the evidence stacks up for or against Affliction being too strong. Keep in mind that being ‘first’ on meters doesn’t mean anything by itself. It’s being first by a lot or, as I mentioned before, being so far ahead that it leads to class stacking, that puts us in trouble.
So is there a class-stacking issue?
A little bit. The results from Feng are for a beginning level Heroic. The two most difficult fights that we have data from RaidBots for are Vizier 25H and Grand Empress 25H. Both these fights show a bit of spec stacking, where the top two specs get more representation than in other fights:
For the sake of balance, Affliction probably deserves a tiny nerf (beyond the 5.2 Glyph of Sacrifice nerf). But Kil’jaeden’s Cunning shouldn’t be where it comes from, because while I left it off my first chart, look at where Demonology is:
That’s it for this year! I saw the complaint that there’s not enough theorycrafting for Warlocks today, so going forward in the New Year I’m going to start posting on some really mathy topics in order to offer the community something different. Happy New Year, everyone.