From “dependable” to “panicky”: How evaluations reinforce gender barriers in leadership

Highlights

In this study, women received just as much positive feedback as men in performance evaluations—but evaluators relied on only a handful of descriptors for them, while men’s praise spread across a much wider variety of words.
Women were tagged with significantly more negative traits than men, and over two-thirds of those negatives were feminine stereotypes (e.g., “panicky,” “temperamental,” and “passive”).
When women don’t match the “tough” and “decisive” leader image, evaluators default to gendered clichés to explain their shortcomings, quietly reinforcing barriers to women’s leadership.

In today’s organizations, performance evaluations are meant to spotlight talent and potential. Yet a 2019 study by researcher David G. Smith and colleagues reveals that the very words chosen to describe leaders can reinforce gender hierarchies rather than dismantle them.

Drawing on 4,344 anonymous leadership assessments of U.S. Naval Academy students, the researchers provided evaluators with a fixed menu of 44 positive and 45 negative traits—each pre‐classified as masculine, feminine, or neutral—to isolate how language choices vary by the target’s gender. By holding the pool of possible descriptors constant, the researchers were able to focus on evaluators’ implicit biases rather than differences in performance itself.

Skewed feedback: Narrow praise and gendered criticism

Despite receiving an equal total number of positive descriptors, women leaders were praised with a much narrower set of them. Evaluators repeatedly used only a handful of positive terms (such as “enthusiastic,” “compassionate,” and “organized”) when describing women, whereas men’s evaluations covered a richer variety of attributes (such as “analytical,” “competent,” and “practical”). In effect, although women demonstrated equivalent leadership qualities, the limited range of praise implied their successes only “counted” when described in those few, pre-approved ways.

More striking still, women attracted a greater share of negative descriptors than men did. Of the 14 negative leadership traits that showed a significant gender gap, women were tagged more often with 12 of them: selfish, opportunistic, vain, inept, frivolous, passive, scattered, gossip, excitable, panicky, temperamental, and indecisive. On the other hand, men were labeled more often on only two traits: arrogant and irresponsible.

It is especially telling that almost all of the traits applied more to women draw on classic feminine stereotypes—for example, emotional volatility (“panicky,” “temperamental”), passivity (“passive,” “indecisive”), and triviality (“frivolous,” “gossip”). This suggests that evaluators default to gendered tropes rather than objective performance standards when evaluating women leaders.

In short, women received a narrower range of praise and tougher, stereotype-based criticism. Evaluators didn’t hold them to the same neutral benchmarks applied to men; instead, they leaned on gendered clichés, using words like “excitable,” “emotional,” and “indecisive,” to explain any shortcomings. This double standard in language reinforces barriers that keep women from being seen as fully capable leaders.

When language reflects bias

These findings align with status characteristics theory, which can be thought of as a “fit test” between who we are and the role we occupy. We all carry a mental image of a leader: confident, decisive, maybe a bit tough. When someone doesn’t match that mould—for example, a woman stereotyped as warm or emotional—evaluators experience a mismatch and instinctively reach for familiar explanations. Instead of judging her by the same neutral standards they apply to men, they choose negative descriptors rooted in feminine stereotypes, using words like “panicky” and “temperamental.”

By both narrowing the range of positive praise for women and labeling them with these stereotype-driven critiques, evaluators, even if unwittingly, send the message that women don’t quite “fit” the leader role. This linguistic double bind erodes perceptions of women’s competence and agency.

Minimizing language bias in evaluations

Even well‐intentioned feedback systems can perpetuate bias through word choice. To rewrite this narrative, the researchers urge organizations to:

Anchor evaluations in objective metrics. Replace vague descriptors with behavior‐based criteria (e.g., “delivered 90% of team milestones” rather than “dependable”).
Audit and refine your lexicon. Regularly review evaluation forms to eliminate gendered descriptors and expand the pool of neutral performance terms.
Train evaluators on bias in language. Equip evaluators to spot gendered or stereotype-laden words, and instead use concrete, performance-based terms that accurately reflect a person’s leadership skills.
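
As one illustration of what a lexicon audit could look like in practice, the minimal sketch below counts stereotype-laden descriptors in free-text evaluations and tallies them per group. It is not the researchers’ method. The term list is only the twelve negative traits named in this brief, and the function names (flag_terms, audit), the group labels, and the sample sentences are all hypothetical.

```python
import re
from collections import Counter

# Illustrative lexicon only (an assumption): the twelve negative traits this
# brief lists as applied more often to women. A real audit would need a
# vetted, fuller list maintained by the organization.
FLAGGED_TERMS = {
    "selfish", "opportunistic", "vain", "inept", "frivolous", "passive",
    "scattered", "gossip", "excitable", "panicky", "temperamental",
    "indecisive",
}


def flag_terms(text, lexicon=FLAGGED_TERMS):
    """Count occurrences of each lexicon term in one evaluation's text."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w in lexicon)


def audit(evaluations, lexicon=FLAGGED_TERMS):
    """Tally flagged terms per group across (group, text) pairs."""
    totals = {}
    for group, text in evaluations:
        totals.setdefault(group, Counter()).update(flag_terms(text, lexicon))
    return totals


# Hypothetical sample data, invented for this example.
sample = [
    ("A", "Can be panicky under pressure and indecisive in meetings."),
    ("B", "Delivered 90% of team milestones; sometimes arrogant."),
]

for group, counts in audit(sample).items():
    print(group, dict(counts))
```

A tally like this can only point reviewers toward wording that deserves a second look; deciding whether a given descriptor is fair still takes human judgment and behavior-based evidence.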

______

Research brief prepared by:

Alice Choe

Title

The Power of Language: Gender, Status, and Agency in Performance Evaluations

Author

David G. Smith, Judith E. Rosenstein, Margaret C. Nikolov & Darby A. Chaney

Source

Sex Roles

Published

2019

Link

https://doi.org/10.1007/s11199-018-0923-7
