Changing performance rating scales to interrupt gender bias

Summary

Research has shown that gender bias affects employee performance ratings: on average, women are judged as less able and worthy than men even if they have identical qualifications, performance, and behaviours. Is it possible for workplaces to reduce inequality in evaluations? In this two-part study, the authors examined how the design of performance evaluations affects gender bias in ratings. Specifically, their study found that using a 6-point rating scale rather than a 10-point rating scale eliminated the ratings gap between men and women. This evidence suggests that the structure of performance evaluations can considerably affect how men and women are evaluated, and consequently, how they are rewarded.

Research

Employee performance ratings are subject to bias. Numerous studies have shown that managers hold women to higher standards than men, and that women are less likely than men to be seen as ‘geniuses’ or brilliant. This harms women’s career success and trajectories. However, little research has been done on how to reduce inequalities in evaluation. Therefore, the authors conducted two complementary studies to examine whether the structure of performance rating scales affects the manifestation of gender bias. They theorized that because there is a cultural association between brilliance and receiving a 10 out of 10, a 10-point scale disadvantages women.

The first study analyzed faculty teaching evaluation data from a professional school at a North American university. The school had been using a 10-point scale for evaluations, then switched to a 6-point scale (for reasons unrelated to this study). The data consisted of student ratings of the same instructors before and after the change. In total, this included 105,034 ratings of 369 instructors. Some areas of study were more male-dominated than others, so the researchers analyzed the data from male-dominated areas separately from those that were not male-dominated.

The second study used an online survey to show an identical lecture transcript to 400 students from across the United States. Each respondent was randomly assigned whether the instructor was a man or a woman. Each respondent was also randomly assigned a 10-point or 6-point scale with which to evaluate the instructor. They then had the opportunity to indicate the extent to which they viewed the instructor as brilliant, knowledgeable, nice, helpful, and hardworking.

Findings

Study 1: Shifting the rating scale from 10 to 6 points eliminated the gender gap in ratings in the male-dominated fields at the school observed in the study. With a 10-point scale, the most common score for men was 10 (31.4 percent received this score) and for women it was 8 (23.3 percent). However, with a 6-point scale, 6 was the most common score for both men and women (41.2 percent of men and 41.7 percent of women). The observed shift benefitted women because those who received an 8 or 9 with the 10-point scale tended to receive a 5 or a 6 on the 6-point scale. The effect remained even after controlling for factors such as course quality, tenure-track status, and the number of years since receiving a PhD.

Study 2: In this online survey, when using a 10-point scale the instructor received an average rating of 7.8 when perceived as a man, and 7.1 when perceived as a woman. However, with a 6-point scale the instructor received an average rating of 4.9 when perceived as a man, versus 4.8 when perceived as a woman. The results once again showed that the gender gap could decrease with the introduction of a 6-point scale.
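To make the Study 2 comparison concrete, the short Python sketch below recomputes the gender gap from the four averages reported above. Because the two scales have different lengths, it also expresses each gap as a share of the scale's range. This normalization is an illustration only, not the authors' method, and it assumes both scales start at 1 (the brief does not state the lowest score). The names `MEANS`, `gender_gap`, and `gap_as_share_of_range` are hypothetical.

```python
# Average ratings reported for Study 2, keyed by (scale points, perceived gender).
MEANS = {
    (10, "man"): 7.8,
    (10, "woman"): 7.1,
    (6, "man"): 4.9,
    (6, "woman"): 4.8,
}


def gender_gap(points: int) -> float:
    """Raw gap (man minus woman) in average rating on the given scale."""
    return round(MEANS[(points, "man")] - MEANS[(points, "woman")], 2)


def gap_as_share_of_range(points: int, lowest: int = 1) -> float:
    """Gap divided by the scale's range.

    Assumption: the scale runs from `lowest` to `points`; the brief
    does not say what the lowest possible score was.
    """
    return round(gender_gap(points) / (points - lowest), 3)


if __name__ == "__main__":
    for points in (10, 6):
        print(points, gender_gap(points), gap_as_share_of_range(points))
```

Under these assumptions the gap falls from 0.7 points on the 10-point scale to 0.1 points on the 6-point scale, or from roughly 8 percent to 2 percent of the scale's range, so the narrowing is not simply a by-product of the shorter scale.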

Interestingly, of respondents who gave the instructor a 10 out of 10, 65.7 percent strongly agreed that the instructor was brilliant. However, of those who gave a 6 out of 6, only 28.6 percent strongly agreed. This corresponds with the notion that a 6 out of 6 is not as strongly associated with brilliance as a 10 out of 10. Further, when respondents believed the instructor was a man, 15.5 percent strongly agreed he was brilliant. However, when respondents believed the instructor was a woman, only 9.5 percent strongly agreed she was brilliant. These results suggest that even though the 6-point scale reduced gender bias in the evaluations, respondents still considered the man instructor as more brilliant than the woman instructor.

Implications

Minor aspects of evaluations can have a major impact on careers – The study suggests that rating systems can drive workplace inequality. Since a 10 out of 10 is culturally associated with brilliance while a 6 out of 6 is not, changing a 10-point scale to a 6-point scale can benefit employees who, as a result of bias, may be deemed less worthy of a 10. In order to minimize inequality, workplaces should take into account that performance appraisals are connected to cultural beliefs and stereotypes, and should therefore test for the best evaluation methods.

Changing evaluation structures may interrupt gender bias, not eliminate it – Regardless of the rating scale, respondents in this study were still more likely to call the man instructor brilliant compared to the woman, even with an identical lecture. Rating scales are important, but what also matters is how leaders and decision-makers perceive these numbers, and whether they understand the effects of bias and stereotyping at work.

Title

Scaling Down Inequality: Rating Scales, Gender Bias, and the Architecture of Evaluation

Authors

Lauren A. Rivera and András Tilcsik

Institutions

Northwestern University and University of Toronto

Source

American Sociological Review

Published

2019

DOI

10.1177/0003122419833601

Link

https://journals.sagepub.com/doi/10.1177/0003122419833601

Research brief prepared by

Carmina Ravanera
