Gender bias in the L3 advancement process

I would like to introduce a great analysis by Nicolette Apraez & Anniek van der Peijl about the impact of gender in the L3 advancement process. The analysis is based on global data and offers a new perspective on the advancement process.


We conducted a research project to investigate whether there is gender bias in the L3 advancement process. This project covered 3 main lines of investigation:

  • Data from panels conducted between 2012 and 2017
  • A questionnaire asking people about their perceptions regarding gender bias.
  • Qualitative data from comments and statements.

The results showed the following:

  • It looks like women are more eager to attain L3 and are encouraged more to pursue L3 than men are. This is also reflected in the greater representation of women among higher judge levels.
  • There are differences between the genders in the L3 qualities that candidates are rated deficient (minor or major) in.
  • Women on average feel that they are being treated and evaluated differently because of their gender, and consider the PEI and panel as less fair than their male counterparts.
  • Gender bias is a ‘hot topic’. Opinions on it appear to be strong and widely divergent.

Based on these results it appears that while people feel there is gender bias in the process such that women are at a disadvantage, women actually report receiving more encouragement, and are better represented among L3 judges than among L2 judges. In short, it appears that there is no bias at work that is actually keeping women from making L3. We do suggest making the L3 process anonymized where possible to reduce possible bias and including as much transparency as possible to reduce the perception of bias.



During a leadership meeting at a GP, the topic of potential gender bias in the L3 advancement process was raised (link to judge apps meeting report thread here). As a spin-off from that discussion, we set out to gather some data on the subject. We started a research project so we could say with more certainty whether that bias indeed exists and if so, what it looks like. This project covered 3 main lines of investigation:

  1. We analyzed the panel data available from the advancement committee. This can be considered as the most ‘objective’ data, since it reflects real panels taken by real people with real results.
  2. We ran a short questionnaire aimed at L2 and L3 judges, asking for their experiences with and views on the L3 advancement process and gender bias. This was intended to find out the subjective perception of the process.
  3. We collected statements from individual people who wished to share their views with us. These are responses collected in the ‘comments’ box of the questionnaire and in follow-up conversations from the questionnaire. The aim of this was to get an impression of the views and concerns that are held in the community, and to make these available as feedback on the L3 process and the program.

This report contains the results of our investigation, in the order listed above. Please note that if you would like more details on any of the results (like exact numbers or standard deviations), you will find them in the Appendix.

At the end, we provide an overall conclusion and suggestions to improve the L3 process.

*A note on the representation of people identifying as LGBTQ+: The questionnaire did allow people to indicate their identity in a number of categories other than ‘male’ and ‘female’, as well as a range of sexual orientations. However, the number of respondents in some of these categories was very low, making statistical analysis less feasible, and the number of categories alone would potentially make this report overwhelmingly complex to interpret. Therefore, in this particular report you will not see results of the questionnaire split by those identity categories. If anyone would like to begin research into whether our findings carry over to non-binary groups (or indeed to answer any other questions with our data), we’d be happy to share an anonymized version of the survey responses we collected.


Analysis of L3 panel data

We analyzed the panel data of 158 candidates that were in the advancement process between 2012 and 2017.


The percentage of female candidates in this group was 10.2%, which is higher than the representation of women at L2 (at 7%). This suggests that women are more likely to pursue L3.

This data is also consistent with the observation that female representation among judges goes up among the higher levels. Based on 2017 census data collected by RCs the percentage of female judges at each level is as follows:

            L1    L2    L3
% female    5%    7%    15%
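
The percentages quoted here follow directly from the counts listed in the Appendix. As a minimal sketch (the variable and function names are illustrative and not part of the original analysis; the counts are the candidate numbers and the TOTAL row of the 2017 census), the arithmetic can be reproduced like this:

```python
# Counts copied from the Appendix: panel data and the TOTAL row of the 2017 census.
candidates = {"female": 16, "male": 141}
census_totals = {
    "L1": {"women": 215, "total": 4542},
    "L2": {"women": 75, "total": 1123},
    "L3": {"women": 8, "total": 55},
}


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Share of female candidates among all candidates with a known gender.
candidate_share = percent(candidates["female"], sum(candidates.values()))

# Share of female judges at each level.
level_shares = {
    level: percent(counts["women"], counts["total"])
    for level, counts in census_totals.items()
}

print(f"female candidates: {candidate_share:.1f}%")
for level, share in level_shares.items():
    print(f"{level}: {share:.0f}% female")
```

Note that the L3 figure rests on only 55 judges in the regions that reported L3 numbers, so a single person shifts it by almost two percentage points.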

For 131 candidates’ panels (the remaining candidates were either in the pre-panel stage or were missing the panel member information), we had data from the ‘other side of the table’ too, allowing us to see the representation of women among panel members (lead or otherwise). Among the people administering the panels during that time, 9.5% of panel members were female. Because a panel is made up of multiple people, the percentage of panels that had at least one female member was higher, at 22.1%.
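
To see why the per-panel figure is higher than the per-member figure, here is a rough sketch. It assumes that panel members are drawn independently of each other and that every panel has the same size; neither assumption comes from the report, and the function name is hypothetical. It also assumes that the 359 panel members in the Appendix belong to the 131 known panels, which would put the average panel at about 2.7 members.

```python
def p_at_least_one_female(p_member, panel_size):
    """Chance that a panel of panel_size independent members has at least one woman."""
    return 1 - (1 - p_member) ** panel_size


# Counts from the Appendix.
p_member = 34 / 359   # share of panel members who were female
observed = 29 / 131   # share of panels with at least one female member

# Assumed panel sizes, for illustration only.
estimates = {size: p_at_least_one_female(p_member, size) for size in (2, 3, 4)}

for size, estimate in estimates.items():
    print(f"panel of {size}: {estimate:.1%} expected to include a woman")
print(f"observed: {observed:.1%}")
```

Under these simplifying assumptions the observed share falls between the estimates for panels of two and three members.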


Below is a graph indicating how male vs. female candidates’ qualities were rated by the panel. It shows which percentage of candidates received either a minor or major deficiency in each of the L3 Qualities, split by gender. Please note that due to changes in the list of Qualities during the period in which this data was collected there are some duplicate/similar categories.

From this graph, we notice that there are some qualities for which no female candidates have been rated deficient while some male candidates have.

This result is a bit difficult to interpret from a data perspective because the number of female candidates is low, so the result can easily be skewed. But also from a theoretical perspective this could mean multiple things: Are the gender differences there because the panel is biased, or are the differences real and potentially there because social conditioning has taught different skills to each gender?
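
One way to make the small-sample caveat concrete is to put an uncertainty range around a percentage. The sketch below uses the Wilson score interval, which is a standard technique we are bringing in for illustration and not something the original analysis used. The counts are derived from the Appendix percentages for Penalty and Policy Philosophy (46.15% of 13 women is 6 candidates; 27.35% of 117 men is 32 candidates), and the function name is hypothetical.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion successes / n."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half_width, centre + half_width


# Candidates rated deficient in Penalty and Policy Philosophy (derived counts).
female_low, female_high = wilson_interval(6, 13)
male_low, male_high = wilson_interval(32, 117)

print(f"female: {female_low:.0%} to {female_high:.0%}")
print(f"male:   {male_low:.0%} to {male_high:.0%}")
```

With only 13 female candidates the range is far wider than the range for the 117 male candidates, and the two ranges overlap, which is why the differences in the graph should be read with caution.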

From a perspective of traditional gender stereotyping, the results are conflicting: It may be considered a stereotype that men score higher on Leadership, Presence & Charisma (which is shown in these results in the shape of fewer deficiencies), but a similar stereotype that men would score better on Stress & Conflict Management is not shown by our data. Indeed, the largest difference between male and female candidates is in Penalty and Policy Philosophy, which is so M:TG-specific that it seems unlikely that there are societal gender stereotypes about that quality at all! However, this could be evidence of pushing and encouraging female candidates who have the leadership, logistics, etc. skills without diving deep into their understanding of things like investigations or penalty and policy philosophy, as we discuss in our questionnaire analysis.


Analysis of Questionnaire data

A total of 367 judges responded to our questionnaire, of which 330 identified as male and 31 as female.

Representation and ambition

We asked people where they were on the road to L3. There are some differences of note here. Of male respondents, around 32% were L2 without interest in L3, versus 16% of female respondents. The women in our sample are more likely than the men to be working on their checklist (26% vs 15%) or still going despite having failed somewhere along the way (10% vs 2%). This data suggests that women in the judge program are more ambitious than their male counterparts: fewer of them have no interest in L3, and more of them are actively working towards L3.
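
The shares quoted above can be recomputed from the respondent counts in the Appendix. This is a small sketch with illustrative names; the labels and counts are copied from the questionnaire table.

```python
# Respondent counts per stage, copied from the Appendix (questionnaire data).
stages = [
    "Level 2 – Without interest in Level 3",
    "Level 2 – With general interest in Level 3",
    "Level 2 – Working on L3 Checklist Items",
    "Level 2 – Currently in PEI or waiting to Panel",
    "Level 2 – Failed PEI, Test, or Panel, still interested in Level 3",
    "Level 2 – Failed PEI, Test, or Panel, no longer interested in Level 3",
    "Level 3 – Failed PEI, Test, or Panel but continued on to Level 3",
    "Level 3 – Passed through PEI, Test, and Panel on first try",
]
male_counts = [106, 117, 50, 5, 8, 3, 12, 29]
female_counts = [5, 11, 8, 0, 3, 0, 1, 3]


def shares(counts):
    """Convert raw counts into percentages of the group total."""
    total = sum(counts)
    return [100 * count / total for count in counts]


male_shares = shares(male_counts)
female_shares = shares(female_counts)

for stage, male, female in zip(stages, male_shares, female_shares):
    print(f"{stage}: male {male:.0f}%, female {female:.0f}%")
```

Keep in mind that each female respondent accounts for a little over three percentage points, so small changes in the sample would move these figures noticeably.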

Part of this phenomenon could be explained by peer encouragement. We found that women report higher levels of encouragement from other judges (both L2 and L3) to pursue L3. This may translate into higher percentages of female judges having interest in L3 advancement.

Perception of bias

As described in the introduction, we also wanted to get a feel for how the community perceives the L3 process in terms of gender bias. People’s perceptions shape their behavior, so while perceptions are subjective, they are still important.


General bias

First of all, the results of the questionnaire showed that women feel much more strongly than men that they are treated differently because of their gender (an average score of 3.45 for women versus 1.51 for men; see the Appendix).

Please note that being treated ‘differently’ could mean positive or negative or value-neutral differences. So in the next two questions we asked whether people thought their gender was helping or hindering them in their judge career.

We can see that men and women feel about the same about their gender impacting their odds of getting accepted to events. This fits with both groups’ answers being close to a score of 3, meaning ‘this doesn’t matter’. For L3 advancement, on the other hand, women feel they have a harder time because of their gender. This appears to be in conflict with the result (detailed above) that women report receiving more encouragement to pursue L3.

One possible explanation is that, since the question’s wording includes sexual orientation in addition to gender, sexual orientation may be a factor here. Unfortunately, only 171 participants (out of 367) ticked any of the sexual orientation categories. So if we wanted to split the answers to these questions 4 ways we would have to leave out a large number of participants and be left with one category representing as little as 3 participants. As a result, the question of sexual orientation being a factor here will remain unanswered with our current data set.

Bias in L3 advancement

Next we will look at the perception of bias in the concrete steps of the L3 advancement process, those steps being:

  • The checklist verification
  • The pre-event interview (PEI)
  • The panel

Because the PEI and the panel in particular are quite personal in nature, they are necessarily at least somewhat subjective. This does not have to mean that they are unfair or biased, so we asked questions about both the perception of fairness/bias and the perception of objectiveness/subjectiveness.

Within the L3 process, the checklist verification is rated most objective, followed by the PEI and the panel. This order is the same for both genders, with women rating each activity as more objective than men did. The result for fairness is slightly different: men rated each activity roughly the same in terms of fairness, but women rated the checklist as much more fair than the rest of the process. Also note that while women rated the PEI and panel as more objective than men did, they rated the fairness of both lower than men did.

So either women have a tendency not to equate objectivity with fairness as much as men do, or men are more optimistic that subjectivity does not compromise fairness. The latter would be consistent with the results that women perceive that there is more general gender bias.


Qualitative data

In addition to the quantitative data discussed above, we also had access to various sources of qualitative data, i.e. people’s stories and opinions. These include the comments people left on the questionnaire, things people told us as a follow-up to the questionnaire, and forum posts and statements about the topic. We will discuss commonly occurring themes and opinions here.

Most of the comments left on the surveys or made during follow-ups can be broken into a few different categories: Questionnaire Feedback, Positive Support, Unrelated to our Scope, or Contrary Beliefs.

Since comments like “Thanks for doing this” and “Add an option for X” don’t currently help us, we are going to focus on contrary beliefs, as well as a few points that, while unrelated to our current scope, are worth mentioning because they might also cause differences in the Advancement Process for individuals.


Unrelated to our Scope

We were tasked with determining whether or not we believed that there was a systemic issue with Gender Equality in the L3 Advancement Process. A large portion of suggestions/comments fell outside of our project’s scope. Nearly 16% of our participants who left comments told us that they believed that there were other issues besides gender that we should have been looking into. The comments suggested that Racism, Societal differences, Financial Disparities, and Sexual Orientation are more likely to influence a person’s Advancement Process than gender. While we would have loved to have delved deeper in our research to include information about these groups, we wanted to ensure that our original scope and purpose did not change.


Contrary Beliefs

Another 15% of participants who left comments either did not believe, or had not seen, that women are treated any differently, or believed that women actually have an easier time in the judge program. While some believed that the Program is essentially gender-blind (citing Italy as an example, due to Christiana’s past and present roles in the Program), others argued that women are given more blind chances, and felt that the program unfairly gives opportunities to women more often than to men. To those participants, we acknowledge that the support reported in our study is tilted towards women, but as we will go into below, gender bias is extremely difficult to measure objectively.



While we do feel we were able to get a better sense of the average female’s Level 3 process, much of that data came from our questionnaire. This means it could be subject to bias, as those individuals willing to participate are likely to be those men and women who are more invested in the Judge Program. In addition, the questionnaire data reflects people’s perceptions of bias, which may not fully match with actual bias. However, we do think that the perception of bias is important: What people believe to be true about the world around them affects their behavior, so if we want to understand behavior, looking at perceptions becomes inevitable.

Some questions cannot be answered with the data available. One is the pass/fail rate of panels, since data on past failed panels is not kept by the L3 advancement team. On top of that, the overall low number of female candidates means it’s difficult to identify trends in a statistically meaningful way. For example, comparing the gender of the candidate with the genders represented on the panel results in very small group sizes.


Conclusions and Suggestions

In summary, it appears that:

  • Women are more eager to attain L3 and are encouraged more to pursue L3 than men are. This is also reflected in the greater representation of women among higher judge levels.
  • There are differences between the genders in the L3 qualities that candidates are rated deficient (minor or major) in.
  • Women on average feel that they are being treated and evaluated differently because of their gender, and consider the PEI and panel as less fair than their male counterparts.
  • Gender bias is a ‘hot topic’. Opinions on it appear to be strong and widely divergent.

The most interesting and conflicting thing about our findings is that while women feel it’s harder for them to attain L3 than men and consider the PEI and panel to be less fair than men do, at the same time they also report more encouragement to pursue L3. This is also shown in the fact that there is more female representation at L3 than any other level. It seems reasonable to speculate, then, that while there is a fairly strong perception of negative bias towards women, the end result in practice does not support the notion that women are being held back. Of course some of this can be explained by ambition, which lies with the women themselves rather than the community or the advancement system, but some credit should also go to the encouragement given to them.

One matter of concern is that the PEI and panel are rated as quite subjective and halfway between fair and unfair by both men and women. While women appear to consider the process on the whole as more objective than men do, the reverse is true for the perception of fairness. The L3 process could be improved in two ways. The first is to make the process more transparent, for example by providing information about how the PEI and panel are ‘graded’. This would give candidates a better sense of how they are being evaluated, and more information to determine whether they find this process fair. The second improvement could be to make the process more gender-neutral by, for example, having as much of it as possible be done anonymously. This mostly applies to the PEI, since panels have to be administered in person. Similar to the way the GP HJ applications are handled, there could be a committee that handles the administrative side of things, knows the identity of the candidate, and provides anonymized input to the person handling the PEI.

However, likely the biggest disclaimer here should be obvious: The notion that there doesn’t seem to be a structural gender bias problem in the L3 process does not mean that it’s impossible for individuals to be treated with bias. These situations should still be taken seriously whenever they come up.


Appendix – Raw Data

Panel data:

Relative percentages
Minor or Major deficiencies           female (n=13)   male (n=117)
Rules & Policy Knowledge              0               7.692307692
Attitude & Maturity                   0               4.273504274
Judge Assessment                      0               12.82051282
Mentorship                            0               0.8547008547
Teamwork and Diplomacy                0               5.128205128
Communication                         0               0
Investigations                        30.76923077     28.20512821
Leadership, Presence & Charisma       30.76923077     20.51282051
Penalty and Policy Philosophy         46.15384615     27.35042735
Program Construction & Philosophy     23.07692308     20.51282051
Self-Evaluation                       0               9.401709402
Stress & Conflict Management          15.38461538     17.09401709
Teamwork, Diplomacy and Maturity      0               12.82051282
Development of Other Judges           7.692307692     16.23931624
Logistics and Tournament Operations   0               0


Number of candidates          %
female    16                  10.1910828
male      141
total     157

Number of panel members       %
female    34                  9.470752089
male      325
total     359

Panels containing at least 1 female member
total panels known                131
total panels containing female    29
percentage                        22.13740458


2017 Census:

Region            Total L1  Women L1  Total L2  Women L2  Total L3  Women L3  % female L1  % female L2  % female L3
Italy             208       5         51        3         8         1         2%           6%           13%
EU North          224       5         46        2         4         2         2%           4%           50%
Australia         160       4         46        7         3         0         3%           15%          0%
Japan             281       5         83        0         5         1         2%           0%           20%
BeNeLux           146       4         22        0         10        1         3%           0%           10%
SE Asia           151       2         27        3         5         0         1%           11%          0%
UK                378       20        68        8                             5%           12%
Iberia            183       11        54        5         8         1         6%           9%           13%
USA Great Lakes   295       22        50        4                             7%           8%
USA Mid-Atlantic  285       23        87        9                             8%           10%
USA North         153       10        50        2                             7%           4%
USA Northwest     331       13        79        5         12        2         4%           6%           17%
USA South         234       14        60        7                             6%           12%
USA Southwest     300       18        100       8                             6%           8%
USA Canada        307       17        89        3                             6%           3%
USA Central       233       12        53        2                             5%           4%
USA Southeast     245       12        58        2                             5%           3%
USA Northeast     428       18        100       5                             4%           5%
TOTAL             4542      215       1123      75        55        8         5%           7%           15%


Questionnaire data:

male   female   stage
330    31       totals
106    5        Level 2 – Without interest in Level 3
117    11       Level 2 – With general interest in Level 3
50     8        Level 2 – Working on L3 Checklist Items
5      0        Level 2 – Currently in PEI or waiting to Panel
8      3        Level 2 – Failed PEI, Test, or Panel, still interested in Level 3
3      0        Level 2 – Failed PEI, Test, or Panel, no longer interested in Level 3
12     1        Level 3 – Failed PEI, Test, or Panel but continued on to Level 3
29     3        Level 3 – Passed through PEI, Test, and Panel on first try


Average scores and standard deviations per statement:

I feel that I am treated differently based on my gender identity or sexual orientation.
  male:   average 1.50764526, std dev 0.9841237996
  female: average 3.451612903, std dev 1.362319338

My skills are evaluated differently based on my gender identity or sexual orientation.
  male:   average 1.535168196, std dev 1.041096587
  female: average 3.290322581, std dev 1.465003945

I receive comments or feedback related to my gender identity or sexual orientation. (For example: “You are quite [x] for a [guy/girl]”)
  male:   average 1.283536585, std dev 0.791064582
  female: average 2.35483871, std dev 1.427080635

I feel that L3 advancement is _____ for people of my gender identity and/or sexual orientation.
  male:   average 2.657407407, std dev 0.7567492442
  female: average 3.548387097, std dev 1.059519063

I feel that getting selected to judge events I applied for is _____ for people of my gender identity and/or sexual orientation.
  male:   average 2.836923077, std dev 0.7902900762
  female: average 2.709677419, std dev 0.8638498476


Average scores and standard deviations per statement:

I think the L3 checklist verification is Fair/Unfair
  male:   average 3.649122807, std dev 1.00873379
  female: average 4.285714286, std dev 1.112697281

I think the L3 checklist verification is Objective/Subjective
  male:   average 3.368421053, std dev 1.276596745
  female: average 4.428571429, std dev 0.5345224838

I think the Pre-Event Interview (PEI) is Fair/Unfair
  male:   average 3.596491228, std dev 0.9231090541
  female: average 3.142857143, std dev 1.345185418

I think the Pre-Event Interview (PEI) is Objective/Subjective
  male:   average 2.561403509, std dev 1.149806565
  female: average 3, std dev 1.414213562

I think the L3 Panel is Fair/Unfair
  male:   average 3.436363636, std dev 0.9576908255
  female: average 3, std dev 1.290994449

I think the L3 Panel is Objective/Subjective
  male:   average 2.363636364, std dev 1.192004791
  female: average 2.571428571, std dev 1.272418021


12 thoughts on “Gender bias in the L3 advancement process”

  1. Wow, congratulations on pulling off this work!
    I was wondering if you considered a possible correlation between the “qualities asymmetry” and the “fairness perception”. I mean, could it be that female candidates feel they are not examined enough on some qualities (the ones they are never found deficient in) or are scrutinized more on qualities some people could think of as gender-biased? I would hope the panel is never this biased, but could it be a factor worth adding to the ones you outlined above?
    An analysis of the variance in the samples would have been useful, since they are numerically quite different, but it would probably have added a lot more work… Anyway, I find the discovered perception of decreasing objectiveness/fairness to be a very meaningful and precious result of this analysis.

    1. Hey Donato, because the questionnaire was anonymous, we can’t match any opinions/perceptions from there with the L3 panel results from the advancement project (which includes the deficiencies).

  2. Solid analysis and reasonable interpretation.

    “As a result, the question of _______ will remain unanswered with our current data set.”

    This is great to see in particular, as it’s a difficult thing to accept/say as an analyst, but an important factor in grounding the results.

  3. Do you have information about the pass rates for different parts of the L3 process for men versus women?

    1. Unfortunately not, or we would definitely have discussed it here. The data we got from the advancement team was essentially their spreadsheet for keeping track of candidates. It has one line for each candidate with their current status, but no history, so we wouldn’t know about stuff like previous failed attempts.

  4. All this is basically meaningless because you only polled 2.2% of the total population of judges.

    1. We deliberately polled only L2+ judges because they are the “target audience” for L3 advancement (for L2s) and the people who have gone through the process (for L3s).
      I will agree that this is potentially a limitation, but to call the whole thing meaningless seems a bit extreme.

  5. Thanks for the great analysis guys! I really appreciate the charts and numbers to back up the explanations you gave, it made the article really interesting and easy to follow.

    I would like to touch on something I found a bit worrying though:

    “However, this could be evidence of pushing and encouraging female candidates who have the leadership, logistics, etc skills without diving deep into their understanding of things like investigations or penalty and policy philosophy, as we discuss in our questionnaire analysis.”

    Perhaps I’m reading this wrong, but this reads to me like female judges are encouraged (and accepted) to pursue L3 without having the same competencies as male judges. Aside from the discriminatory nature of such a claim, this worries me because it makes me (I’m an L2, not interested in L3 track, but who tries to look up to L3s for mentorship) question the core judging skills of female L3s.

    Could you clarify a bit more what you meant by this statement? It seems rather controversial as written, but it was probably not written in the context I am reading so I’d like more information on the context here.

    1. What was meant by it, in a nutshell, is that due to the ‘pushing women to go for L3’ (which we see in the statistics about receiving encouragement) there might be a tendency for people to see a female judge who is prominent at events due to great logistics/leadership skills, and think ‘hey, she should go for L3’ without considering the other qualities.
      As explained in the paragraph before that statement, this is pretty speculative. Also I wouldn’t worry about women not being held to the same standard: It is the panel’s role to scrutinize every quality, and the fact that this statistic is in the report at all is because the panels rated more women than men as deficient in this quality.

      So in short: We think that MAYBE women who are not that great at policy philosophy are being pushed by their peers to go for L3, and this is ‘caught’ by the panel.

  6. “The most interesting and conflicting thing about our findings is that while women feel it’s harder for them to attain L3 than men and consider the PEI and panel to be less fair than men do, at the same time they also report more encouragement to pursue L3.”

    This is not necessarily conflicting and I don’t think it leads to the reasonable speculation you made following it.

    It is very possible that women can receive more encouragement to pursue L3, but that it is ALSO more difficult for them. Encouragement doesn’t necessarily lead to a test or panel being easier.

    There are also so few women at L3 that I don’t think we can draw any real conclusions about the process being harder for them based only on % of L3 being higher than % of L2.

    What is more likely in a circumstance like this (similar to women breaking into other fields historically) is that the type of person who is willing to struggle with increased difficulty getting to L2 is also the type of person to shrug off and fight through increased difficulty to get to L3.

    There isn’t any correlation between more women completing L3 and the test’s actual difficulty. It’s very possible the women who have made L3 just worked harder.

    1. It’s definitely possible, it’s unfortunately something we can’t figure out for sure with the data we have.

    2. I had a similar reaction to this data. It could be that women who are judges have developed a peer group of judges around them that is supportive (and/or that peer group cultivated them into a judgeship in the first place), but that there is still a broader sense that they are treated differently within the judge/Magic community, including perhaps having some negative, even if relatively isolated, experiences in their evaluations (or otherwise).
