Gender Bias in a Data Driven World

by Ines Quandel

Have you ever wondered how Artificial Intelligence becomes, you know, intelligent? Artificial Intelligence, AI for short, is coded to ‘learn’, and it does so by accessing existing data. The learning can be either supervised or unsupervised by humans. Either way, the AI is programmed to detect patterns in the data it is given for training. The AI then uses algorithms (sets of programmed rules) to apply those learned patterns to perform a task.

There is an underlying problem with AI learning: biased data. We, as a society, have deeply ingrained biases, most of which are so much a part of our lives that we don’t recognize them as such. Gender bias is subtler and more pervasive than you might realize, and it can be hard to pinpoint.

One of the ways our gender bias shows up is that most data collected for all of us actually comes from men. Caroline Criado Perez writes about this lack of female-specific data in Invisible Women: Data Bias in a World Designed for Men. She also points out a lack of sex-disaggregated data, even in research where physical differences between men and women affect the decisions made based on that data. Examples include the design of tools, bulletproof vests, and car seats, all of which are based on the size, weight, shape and strength of the average male.

How will decisions made by AI based on gender-biased data affect women? An example given in Invisible Women is translation software that translates the female form “la doctora” with the pronoun “he” because, based on its inputs, doctors are most often men. Data from the internet is not the only problematic data. What if the AI used for reviewing resumes is trained on the same biased data as the translation AI? Would it automatically exclude all resumes with female names and pronouns? If it does, then thanks to this AI your hiring systematically discriminates against women.
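
To make the translation example concrete, here is a minimal sketch in Python of how a system that simply follows the most frequent pattern in its training data reproduces that data's skew. The training pairs, the 4-to-1 ratio, and the function names are all invented for illustration; real translation software is far more complex, but the underlying tendency is similar.

```python
from collections import Counter, defaultdict

# Hypothetical toy training data: (profession, pronoun) pairs as they might
# appear in text. The skew is made up purely for illustration.
training_pairs = [
    ("doctor", "he"), ("doctor", "he"), ("doctor", "he"), ("doctor", "he"),
    ("doctor", "she"),
    ("nurse", "she"), ("nurse", "she"), ("nurse", "she"),
    ("nurse", "he"),
]

def learn_pronouns(pairs):
    """Count how often each pronoun appears with each profession."""
    counts = defaultdict(Counter)
    for profession, pronoun in pairs:
        counts[profession][pronoun] += 1
    return counts

def pick_pronoun(counts, profession):
    """Return the pronoun seen most often with this profession."""
    return counts[profession].most_common(1)[0][0]

counts = learn_pronouns(training_pairs)
print(pick_pronoun(counts, "doctor"))  # he
print(pick_pronoun(counts, "nurse"))   # she
```

Nothing in this sketch is told to prefer men as doctors. It only repeats whatever is most common in the data it was given, which is exactly how a biased data set becomes a biased decision.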

These examples seem obvious. Now that you know what you’re looking for, you can probably find a few more. As mentioned, though, some of our biases are not as easy to spot, so what can we do to identify them? At this point you may be wondering how we know that data is biased.

As it happens, one AI programme has already identified some of the biases in our data. In the article Bias is Ingrained in Society – A Recent Artificial Intelligence Model (GPT-3) Proves this, author Paul Orkutny writes about the results of using AI to comb through internet data to identify a variety of biases. The AI searched the internet for the top 10 most common words used in combination with ‘men’ and with ‘women’. It turns out we describe men and women very differently. Orkutny reminds us that “Bias in artificial intelligence is not a new problem and yet it is, in my opinion, infrequently considered. AI presents itself as unbiased and we often use it blindly in an attempt to correct our biases. However, we need to consider our ingrained societal biases before we put blind faith in the computer model.”
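
The idea of looking for the words most often used alongside ‘men’ and ‘women’ can be sketched with simple counting. The Python below is only an illustration under stated assumptions: it is not the method used in the article (which relied on GPT-3), and the four-sentence corpus, the stopword list, and the function name top_cooccurring are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical mini-corpus standing in for text gathered from the internet.
corpus = [
    "The men were strong and decisive.",
    "The women were beautiful and kind.",
    "Strong men lead the team.",
    "Kind women helped the team.",
]

# Very common words we ignore so they do not crowd out the descriptive ones.
STOPWORDS = {"the", "and", "were", "a", "of"}

def top_cooccurring(sentences, target, n=10):
    """Return the n words that most often share a sentence with `target`."""
    counts = Counter()
    for sentence in sentences:
        words = re.findall(r"[a-z']+", sentence.lower())
        # Compare whole words, so 'women' is not mistaken for 'men'.
        if target in words:
            counts.update(
                w for w in words if w != target and w not in STOPWORDS
            )
    return counts.most_common(n)

print(top_cooccurring(corpus, "men"))
print(top_cooccurring(corpus, "women"))
```

Even in this tiny made-up corpus the two lists differ, with ‘strong’ leading for men and ‘kind’ for women. On real text, comparing the two lists side by side is one simple way to make a hidden pattern visible.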

That is a call to action for all of us to be aware of the biases that can enter the emerging world of artificial intelligence. Getting this right will make our future more inclusive and better for us all.