AI programs exhibit racial and gender biases, research reveals

Machine learning algorithms are picking up deeply ingrained race and gender prejudices concealed within the patterns of language use, scientists say

An artificial intelligence tool that has revolutionised the ability of computers to interpret everyday language has been shown to exhibit striking gender and racial biases.

The findings raise the spectre of existing social inequalities and prejudices being reinforced in new and unpredictable ways as an increasing number of decisions affecting our everyday lives are ceded to automatons.

In the past few years, the ability of programs such as Google Translate to interpret language has improved dramatically. These gains have been thanks to new machine learning techniques and the availability of vast amounts of online text data, on which the algorithms can be trained.

However, as machines are getting closer to acquiring human-like language abilities, they are also absorbing the deeply ingrained biases concealed within the patterns of language use, the latest research reveals.

Joanna Bryson, a computer scientist at the University of Bath and a co-author of the research, said: “A lot of people are saying this is showing that AI is prejudiced. No. This is showing we’re prejudiced and that AI is learning it.”

But Bryson warned that AI has the potential to reinforce existing biases because, unlike humans, algorithms may be unequipped to consciously counteract learned biases. “A danger would be if you had an AI system that didn’t have an explicit part that was driven by moral ideas, that would be bad,” she said.

The research, published in the journal Science, focuses on a machine learning tool known as “word embedding”, which is already transforming the way computers interpret speech and text. Some argue that the natural next step for the technology may involve machines developing human-like abilities such as common sense and logic.

“A major reason we chose to study word embeddings is that they have been spectacularly successful in the last few years in helping computers make sense of language,” said Arvind Narayanan, a computer scientist at Princeton University and the paper’s senior author.

The approach, which is already used in web search and machine translation, works by building up a mathematical representation of language, in which the meaning of a word is distilled into a series of numbers (known as a word vector) based on which other words most frequently appear alongside it. Perhaps surprisingly, this purely statistical approach appears to capture the rich cultural and social context of what a word means in a way that a dictionary definition would be incapable of.
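
To make that geometry concrete, here is a minimal sketch in Python using small invented vectors, not real embeddings (which are learned from billions of words and have hundreds of dimensions) and not the study’s own code, showing how the closeness between word vectors is typically measured with cosine similarity.

```python
# A minimal sketch with invented toy vectors, not the study's code:
# it shows how "closeness" between word vectors is typically measured
# with cosine similarity, the geometric idea behind word embeddings.
import numpy as np

# Hypothetical 4-dimensional vectors; real embeddings learn hundreds of
# dimensions from billions of words of online text.
vectors = {
    "cat": np.array([0.9, 0.8, 0.1, 0.0]),
    "dog": np.array([0.8, 0.9, 0.2, 0.1]),
    "car": np.array([0.1, 0.0, 0.9, 0.8]),
}

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: close to 1 means the words
    # tend to appear in similar contexts, close to 0 means they rarely do.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(vectors["cat"], vectors["dog"]))  # high: similar contexts
print(cosine_similarity(vectors["cat"], vectors["car"]))  # low: different contexts
```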

For instance, in the mathematical “language space”, words for flowers are clustered closer to words linked to pleasantness, while words for insects are closer to words linked to unpleasantness, reflecting common views on the relative merits of insects versus flowers.

The latest paper shows that some more troubling implicit biases seen in human psychology experiments are also readily acquired by algorithms. The words “female” and “woman” were more closely associated with arts and humanities occupations and with the home, while “male” and “man” were closer to maths and engineering professions.

And the AI system was more likely to associate European American names with pleasant words such as “gift” or “happy”, while African American names were more commonly associated with unpleasant words.

The findings suggest that algorithms have acquired the same biases that lead people (in the UK and US, at least) to match pleasant words and white faces in implicit association tests.
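
The paper quantifies such associations with a word-embedding analogue of the implicit association test. The sketch below, again using invented toy vectors rather than the study’s data or code, illustrates the basic idea: a target word is scored by its average similarity to a set of “pleasant” attribute words minus its average similarity to a set of “unpleasant” ones, and the scores are then compared across target groups, such as flower and insect words, or European American and African American names.

```python
# A rough sketch of the idea behind a word-embedding association test,
# using invented toy vectors rather than the study's data: a target word
# is scored by its mean similarity to one attribute set minus its mean
# similarity to another.
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def association_score(target, pleasant, unpleasant):
    # Positive score: the target sits closer to the "pleasant" words;
    # negative score: closer to the "unpleasant" words.
    return (np.mean([cosine_similarity(target, p) for p in pleasant]) -
            np.mean([cosine_similarity(target, u) for u in unpleasant]))

# Hypothetical 3-dimensional vectors purely for illustration.
pleasant   = [np.array([0.9, 0.1, 0.2]), np.array([0.8, 0.2, 0.1])]
unpleasant = [np.array([0.1, 0.9, 0.8]), np.array([0.2, 0.8, 0.9])]
flower     = np.array([0.85, 0.15, 0.1])
insect     = np.array([0.15, 0.85, 0.8])

print(association_score(flower, pleasant, unpleasant))  # positive: "pleasant" lean
print(association_score(insect, pleasant, unpleasant))  # negative: "unpleasant" lean
```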

These biases can have a profound impact on human behaviour. One previous study showed that an identical CV is 50% more likely to result in an interview invitation if the candidate’s name is European American than if it is African American. The latest results suggest that algorithms, unless explicitly programmed to address this, will be riddled with the same social prejudices.

“If you didn’t believe that there was racism associated with people’s names, this shows it’s there,” said Bryson.

The machine learning tool used in the study was trained on a dataset known as the “common crawl” corpus – a collection of 840bn words of text taken, as it appears, from material published online. Similar results were found when the same tools were trained on data from Google News.

Sandra Wachter, a researcher in data ethics and algorithms at the University of Oxford, said: “The world is biased, the historical data is biased, hence it is not surprising that we receive biased results.”

Rather than algorithms representing a threat, they could present an opportunity to address bias and counteract it where appropriate, she added.

“At least with algorithms, we can potentially know when the algorithm is biased,” she said. “Humans, for example, could lie about the reasons they did not hire someone. In contrast, we do not expect algorithms to lie or deceive us.”

However, Wachter said the question of how to eliminate inappropriate bias from algorithms designed to understand language, without stripping away their powers of interpretation, would be challenging.

“We can, in principle, build systems that detect biased decision-making, and then act on it,” said Wachter, who along with others has called for an AI watchdog to be established. “This is a very complicated task, but it is a responsibility that we as society should not shy away from.”

Contributor

Hannah Devlin Science correspondent

