“Statistics are used much like a drunk uses a lamppost: for support, not illumination.”
-Vin Scully, American sports commentator
An HR department is often asked to submit data on various HR parameters, including gender-diversity measures such as the ratio of male to female employees, the number of women recruited against the number of women who applied, and the number of women in leadership positions. One thing to watch for when submitting such data is Simpson’s paradox.
When you show a relationship between two variables (e.g. gender and recruitment), you cannot always take that relationship at face value. One or more other variables may be working in the background. Such a variable is called a “lurking variable”: it is not included in the analysis, but it can substantially alter your interpretation of the data.
In probability and statistics, Simpson’s paradox is a phenomenon in which a trend that appears in several groups of data disappears or reverses when those groups are combined.
We will take a well-known example of Simpson’s paradox: the Berkeley gender bias case.
In 1973, the University of California, Berkeley was sued for bias against women who had applied for admission to its graduate schools. The admission figures showed that men who applied were more likely than women to be admitted, and the gap was large enough that it was unlikely to be due to chance.
But when the data were analysed department by department, no department appeared to be significantly biased against women. In fact, most departments had a small but statistically significant bias in favour of women.
This happened because the “lurking variable” was not considered when the discrimination suit was filed. So what was the “lurking variable” that caused Simpson’s paradox here?
The lurking variable was the pattern in which male and female candidates applied. Some departments, such as E and F, were more competitive (i.e. they admitted a smaller share of their applicants), and some, such as A and B, were less competitive (i.e. they admitted a larger share). A and B were favoured by male applicants, and E and F by female applicants. So in the aggregated data, it appeared that a male candidate was more likely to be admitted.
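To see the reversal in numbers, here is a minimal Python sketch. It assumes the admission counts for the six largest departments as they are commonly reproduced (for example in R's UCBAdmissions dataset), not figures taken from this article. The names DATA, aggregate_rate and per_department_rates are made up for illustration.

```python
# Admitted / applicant counts for the six largest departments, as commonly
# reproduced (e.g. R's UCBAdmissions dataset). These figures are an assumption
# of this sketch, not numbers quoted in the article.
# Layout: department -> {gender: (admitted, applicants)}
DATA = {
    "A": {"men": (512, 825), "women": (89, 108)},
    "B": {"men": (353, 560), "women": (17, 25)},
    "C": {"men": (120, 325), "women": (202, 593)},
    "D": {"men": (138, 417), "women": (131, 375)},
    "E": {"men": (53, 191), "women": (94, 393)},
    "F": {"men": (22, 373), "women": (24, 341)},
}


def aggregate_rate(data, gender):
    """Admission rate for one gender with all departments pooled together."""
    admitted = sum(groups[gender][0] for groups in data.values())
    applicants = sum(groups[gender][1] for groups in data.values())
    return admitted / applicants


def per_department_rates(data):
    """Admission rate for each gender within each department."""
    return {
        dept: {gender: adm / app for gender, (adm, app) in groups.items()}
        for dept, groups in data.items()
    }


if __name__ == "__main__":
    # Pooled view: the lurking variable (department) is ignored.
    for gender in ("men", "women"):
        print(f"Overall {gender}: {aggregate_rate(DATA, gender):.1%}")

    # Department view: the lurking variable is controlled for.
    for dept, rates in per_department_rates(DATA).items():
        higher = "women" if rates["women"] > rates["men"] else "men"
        print(f"Dept {dept}: men {rates['men']:.1%}, "
              f"women {rates['women']:.1%} -> higher for {higher}")
```

With these counts, the pooled figures show about 45% of men admitted against about 30% of women, yet four of the six departments admit women at a higher rate than men. The same two-step check (pooled rate, then rate within each group) can be applied to HR recruitment data by replacing department with any suspected lurking variable, such as job family or location.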