Is Warhammer Balanced?

26Feb09

So this blog is rapidly becoming a public repository for thoughts both nerdy and statistical. Not entirely sure how I feel about that, but it would be a shame to break precedent. Today we’ll be talking about miniatures wargames. For those of you who have escaped this time-consuming, expensive, and somewhat oddball hobby, the gist is this: players, using small painted figures (essentially toy soldiers) representing factions either real or imagined, fight out battles using a set of rules. Think complicated chess and you’re on the right track. Or Google “Warhammer”.

The question today is: Is the miniatures wargame “Warhammer”, or its sci-fi sibling “Warhammer 40,000” balanced – meaning can a player using one faction reasonably expect to beat another player using a different faction because of their skill or luck, rather than one faction being inherently more powerful? Answer after the jump.

How to go about answering this? Statistics!

Methods: The data for this analysis were pulled from the publicly available results of a recent tournament, the ‘Throne of Skulls’, from http://warhammerworld.typepad.com/warhammer_world_news/events.html. Each game's tournament had three “heats”, which were pooled together to get both the largest sample size and largest number of players for each respective game. I then added a variable representing the order in which the supplement for each faction (an “Army book” or “Codex”) was released – this will come up later. Then, in SAS 9.2 and JMP 7, both by the SAS Institute, I ran a one-way ANOVA to examine the mean “Gaming Total”, or score for the tournament, across the various factions, with pairwise comparisons done post hoc using Tukey’s HSD.
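For readers who want to see what the one-way ANOVA is actually computing, here is a minimal sketch in plain Python. The faction names and scores are made up for illustration and are not the tournament data, and the real analysis was run in SAS and JMP rather than Python. The sketch only produces the F statistic; the Tukey HSD post-hoc step is not shown.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict of faction -> list of scores."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)
    k = len(groups)        # number of factions
    n = len(all_scores)    # number of players

    # Variation of the faction means around the overall mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())
    # Variation of individual players around their own faction's mean.
    ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical "Gaming Total" scores, NOT the Throne of Skulls results.
scores = {
    "Faction A": [100, 110, 120],
    "Faction B": [60, 70, 80],
    "Faction C": [85, 90, 95],
}
f_stat, df_b, df_w = one_way_anova_f(scores)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the faction averages differ by more than the player-to-player spread within factions would suggest. It does not say which factions differ, which is why a pairwise post-hoc test is needed afterwards.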

To answer a secondary question, whether newer army books are more powerful (“Codex Creep”), I performed a logistic regression to see if the odds of placing at or above the 95th percentile of players differed between players using newer and older army books. Competitors missing data on which faction they played were excluded from the analysis, as were two Space Marine players who had negative scores, presumably due to being appalling sportsmen.
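To make the logistic regression step concrete, here is a rough sketch in plain Python that fits a one-predictor model by Newton-Raphson and reports an odds ratio with a 95% Wald interval. The data are invented, the function name `fit_logistic` is my own, and this is not the SAS procedure used for the actual analysis. It treats release order as a single numeric predictor, which is an assumption about how the model was specified.

```python
import math

def fit_logistic(xs, ys, iterations=25):
    """Fit logit(p) = b0 + b1*x by Newton-Raphson; return (b0, b1, se_b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # information matrix entries
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + inverse(information) * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard error of b1 from the inverse information matrix, taken at the
    # last iterate before the final (by then negligible) step.
    se_b1 = math.sqrt(h00 / det)
    return b0, b1, se_b1

# Hypothetical data: release order 1 (oldest) to 4 (newest), 10 players each,
# with 1, 2, 3 and 5 of them finishing in the top bracket.
top_counts = {1: 1, 2: 2, 3: 3, 4: 5}
xs, ys = [], []
for order, n_top in top_counts.items():
    xs += [order] * 10
    ys += [1] * n_top + [0] * (10 - n_top)

b0, b1, se_b1 = fit_logistic(xs, ys)
odds_ratio = math.exp(b1)
ci_low = math.exp(b1 - 1.96 * se_b1)
ci_high = math.exp(b1 + 1.96 * se_b1)
print(f"OR per release step: {odds_ratio:.2f} (95% CI: {ci_low:.2f}, {ci_high:.2f})")
```

The odds ratio here is "per one step newer in release order", which is the same shape of estimate reported in the results.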

Results:

Fantasy:

Players of Warhammer Fantasy Battles had some clear favorite tournament armies, most notably Daemons of Chaos and Vampire Counts, followed up by Dark Elves and then a smattering of everything else (Figure 1). This likely reflects the player perceptions that these are particularly “powerful” armies.

Figure 1

So what then of the actual results? Daemons of Chaos did have the highest mean score (108.7 of a maximum of 180), while Beasts of Chaos had the lowest (60.7) – see Figure 2 for some nice boxplots of the performance of the armies.

Figure 2

Note the considerable variation in the performance of all the armies. In pairwise post-testing, the “high performing” armies of Vampire Counts, Daemons of Chaos and Dark Elves were only significantly better than a handful of armies, notably the Empire, High Elves, Dwarfs, Orcs & Goblins and Beasts of Chaos. Most of the rest are in the middle ground, and we cannot rule out the current results being entirely due to chance. When I saw this on the gaming monitor I had set up, I was pretty convinced to stop…

As for the results of the logistic regression: The odds of a player placing at or above the 95th percentile of players were 1.35 (95% CI: 1.09, 1.67) times those of a player using the next oldest army book. Since the interval excludes 1, this indicates an increasing likelihood of doing well using books that have been more recently released.
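Because the 1.35 figure is per release step, it compounds for books that are several releases apart. A small sketch of that arithmetic, assuming the model treats release order as a single linear term on the log-odds scale (the function name is mine, purely for illustration):

```python
def compounded_odds_ratio(per_step_or, steps):
    """Odds ratio between two books that are `steps` releases apart,
    assuming the log-odds change by the same amount at every step."""
    return per_step_or ** steps

# Point estimate and interval bounds reported for Fantasy.
per_step, low, high = 1.35, 1.09, 1.67

for steps in (1, 2, 5):
    est = compounded_odds_ratio(per_step, steps)
    lo_bound = compounded_odds_ratio(low, steps)
    hi_bound = compounded_odds_ratio(high, steps)
    print(f"{steps} release(s) apart: OR = {est:.2f} ({lo_bound:.2f}, {hi_bound:.2f})")
```

The interval widens quickly as the gap grows, so comparisons between the very oldest and very newest books are much less precise than the per-step estimate suggests.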

40K:

Warhammer 40,000 players had some clear favorites as well, with Chaos Space Marines, Space Marines and Eldar occupying the top three spots (Figure 3).

Figure 3

From the results of the ANOVA, Orks, the 4th most popular army, had the highest mean score (102.5, again out of a possible 180), while the Black Templars had the lowest (67.6). Figure 4 has the boxplots of the various army performances.

Figure 4

Observe there’s somewhat less variation than in the Fantasy data – the only statistically significant difference is between Orks and the Space Marines, a surprisingly popular yet low performing army (mean score = 76.0). The logistic regression likewise ran contrary to the Fantasy results. The odds of placing in the 95th percentile for a player are 1.07 (95% CI: 0.95, 1.21) times those of a player using the next oldest codex – the interval includes 1, so this is a relationship that, again, may very well be due to chance alone.
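One way to see the difference between the two games is to back out an approximate test statistic from each reported odds ratio and interval. This sketch assumes the intervals are ordinary Wald intervals (symmetric on the log scale), and the inputs are rounded to two decimals, so the results are only approximate. The function name is hypothetical.

```python
import math

def wald_z_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Approximate z statistic and two-sided p-value from an OR and its 95% CI."""
    beta = math.log(odds_ratio)
    # Half the CI width on the log scale equals z_crit standard errors.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p_value

z_fantasy, p_fantasy = wald_z_from_ci(1.35, 1.09, 1.67)
z_40k, p_40k = wald_z_from_ci(1.07, 0.95, 1.21)
print(f"Fantasy: z = {z_fantasy:.2f}, p = {p_fantasy:.3f}")
print(f"40K:     z = {z_40k:.2f}, p = {p_40k:.3f}")
```

The Fantasy estimate clears the usual 1.96 threshold and the 40K estimate does not, which matches reading the intervals directly: one excludes 1 and the other does not.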

Discussion:

So what’s this all mean? Well, first, a caveat. As with all studies, this one has limitations. Of most concern to me is the fact that in the Fantasy data, the most popular armies are also the best performing. It is possible that the best players gravitate towards “better” armies, and what we are actually seeing is better players, not better armies, placing higher. This might also explain why the Space Marines have a lower average score than the Blood Angels, Dark Angels and Space Wolves – all Space Marines derivatives with older, and arguably less mechanistically powerful, army books. In short, it is likely our study has some residual confounding, although I had hoped that by using active tournament players we could at least partially account for player skill. A quantitative measure of “skill” that is independent of tournament standings eludes me. As an aside, this problem is extremely common in Epidemiology, and is known as “confounding”. Hence the name of this blog.
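To illustrate how this kind of confounding can arise, here is a toy simulation in plain Python. Every number in it is invented: skill drives both army choice and score, while the army itself has no effect at all. It is a sketch of the mechanism, not a model of the tournament data.

```python
import random
from statistics import mean

def simulate(n_players=5000, seed=1):
    """Mean score by army when army choice depends on skill but army has no effect."""
    rng = random.Random(seed)
    scores = {"popular": [], "other": []}
    for _ in range(n_players):
        skill = rng.gauss(0, 1)
        # Assumption: above-average players mostly pick the "popular" army.
        p_popular = 0.8 if skill > 0 else 0.2
        army = "popular" if rng.random() < p_popular else "other"
        # Score depends on skill and luck only; the army is NOT in this formula.
        score = 90 + 15 * skill + rng.gauss(0, 10)
        scores[army].append(score)
    return {army: mean(vals) for army, vals in scores.items()}

means = simulate()
print(means)
```

The "popular" army ends up with a clearly higher average score even though it confers no advantage, which is exactly the pattern that would be mistaken for an overpowered army book.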

But, in short, the answer is no, the game is not balanced. Warhammer Fantasy especially suffers from several overperforming army lists, as well as statistically significant Codex creep. Warhammer 40,000 seems to suffer less from these issues, although the overwhelming popularity of the Space Marines (and their Chaos cohorts) among new players may be masking some effects. Nevertheless, for the moment, it appears that in Warhammer 40K tournaments, the winner may be comfortable in the conclusion that his victory is due to skill and the dice, rather than what book he bought.


18 Responses to “Is Warhammer Balanced?”

  1. 1 munch

    While what you say is true, in many countries, like Australia, we have a tier system that is designed to counteract this.

  2. This is an interesting analysis. Thanks!

  3. I would recommend modeling this using a per user skill factor and a per game advantage factor. With good analysis (not just a canned SAS routine), this will tell you a couple of things:

    a) is army choice hopelessly confounded with skill due to self-selection?

    b) can you draw strong conclusions about army strength?

    c) can you detect individual skill differences (I should hope so)?

    The method I would recommend would be something like response theory except that you would be modeling a probabilistic outcome based on difference in adjusted skill rather than outcome versus skill and difficulty.

    It should be relatively straightforward to build an MCMC simulation of the posterior distribution of the parameters and that would let you integrate over players to answer the questions posed above.

  4. 4 Epi_Junkie

    Ted-

    Not terribly sure the dig at “a canned SAS routine” is necessary, given the features of the program allow for extremely sophisticated analysis. Seriously, time to event studies with multiple events, time varying exposures and correction for measurement error are all “canned SAS routines”.

    What’s more important is that the analytic strategy used is appropriate for the data in question – notably, the whopping three columns of the tournament results that are useful. Yes, there are *much* more nuanced analysis strategies one could use, to measure all kinds of interesting things. The problem however is that the data is simply not available to do it.

    A simple logistic regression, especially one that acknowledges its shortcomings in terms of being unable to rule out residual confounding due to player skill, is considerably closer to a “true” picture of army balance than the previous “gold standard” of common-knowledge, especially since there is a great outpouring of angst about each and every army book released.

  5. 5 George Pratt

    Did you take into account the flexibility of the army lists that players can choose? Some armies are flexible enough to create lists that could potentially hold up to many different armies. At least on paper. That would be a huge factor in how closely related the different armies are to each other. Sometimes people make bad lists and suffer for it.

    • 6 Epi_Junkie

      An interesting question – the short answer is “No”, due to limitations in the data I was working with.

      The long answer is, in my definition of things “Is X army markedly better than Y army” encapsulates sub-questions like “How flexible is this army?” or “Is this particular army optimized against the *other* armies that appear at the tournament”. Those are potential causal explanations for why one army does better than another, along with “Maybe their list is just better”.

  6. 7 Reecius

    Well said! Great read and thanks for taking the time to write that up, very informative and confirms what most of us already know.

  7. 8 John

    It looks like you used overall scores?

    If you’re talking about (power) balance, perhaps only use the Battle Scores, ignoring the Sports, Paint, Comp, bonus scores?

  8. 9 dwarvin armor

    i believe that demons might be overpowered but i still can win against anyone with my dwarf army

  9. 10 William Matthews

    The trouble with Warhammer Demons and Vampires is the Unbreakable rule they enjoy across their whole army. The hope is that new combat rules in 8th edition will address this clear and painful advantage.

  10. wow remarkable work. this post was just a little above me, do you’ve got any sources for beginners who’re searching to far better realize this?

  11. Thank you very much for these efforts. I’d love to see this project kept up to date on a rolling basis!

  12. 13 Royos

    Ok Geeks, first of all, respect to the article and responses. An interesting topic. Deserves much discussion. I don’t have much of a brain, so don’t expect any analysis or conclusions, but I do have a ton of experience playing Warhammer (I started playing around 1996-7) so I can throw in a comment. My Warhammer buddies all agree that Chaos is the most powerful army, and has a fairly flexible army list to boot. But this is old news. I personally feel that Chaos armies have always been pretty powerful, even since the old days. My first big whoopn came from a Chaos army (I was using Skaven), and it’s a whoopn I’ll never forget. Since then I have managed to counteract the raw power of Chaos in many ways; my Lizardmen usually stand up well to the Chaotic threat. However, in capable hands, a Chaos army will make you work hard for a victory.

  13. I think warhammer is a balanced games. It personalized it’s players well and organized.

